Sometimes we need to generate a TFRecord file, because it takes less space and can be read faster than a folder of loose files. But how on earth can we make a TFRecord?
To make a TFRecord file, follow these instructions:
import linecache
import os

import tensorflow as tf  # TensorFlow 1.x API
from PIL import Image

def make_tfrecord(dest_path, image_folder, label_csv):
    # Sorted so the order is deterministic; row k + 1 of label_csv is assumed
    # to hold the label of the k-th file (row 1 is the header).
    image_names = sorted(os.listdir(image_folder))
    total = len(image_names)
    print("There are {} images in the folder".format(total))
    writer = tf.python_io.TFRecordWriter(dest_path)
    for count, name in enumerate(image_names, start=1):
        img = Image.open(os.path.join(image_folder, name))
        # convert image to bytes
        img_binary = img.tobytes()
        # linecache rows are 1-indexed, so the first label is on row 2
        label = float(linecache.getline(label_csv, count + 1).split(',')[1])
        data = tf.train.Example(features=tf.train.Features(feature={
            'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_binary])),
            'label': tf.train.Feature(float_list=tf.train.FloatList(value=[label])),
        }))
        writer.write(data.SerializeToString())
        if count % 10 == 0:
            print("{:.1f} percent finished ...".format(100.0 * count / total))
    writer.close()
    print("dataset generation finished !")
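The label lookup in the function above is easy to get wrong by one row, so here is a minimal sketch of just that step, using only the standard library. The file contents and the helper name `label_for_row` are made up for illustration; the sketch assumes a header row followed by one "name,label" row per image.

```python
import linecache
import os
import tempfile

def label_for_row(label_csv, row):
    """Return the label in column 1 of the given 1-indexed CSV row."""
    line = linecache.getline(label_csv, row)
    if not line:
        # linecache returns an empty string for a missing row
        raise IndexError("no row {} in {}".format(row, label_csv))
    return float(line.strip().split(',')[1])

# Hypothetical label file: a header row, then one "name,label" row per image.
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False) as f:
    f.write("filename,label\nimg_0.png,0.5\nimg_1.png,3\n")
    csv_path = f.name

# Row 1 is the header, so the first image's label is on row 2.
labels = [label_for_row(csv_path, row) for row in (2, 3)]
print(labels)
os.remove(csv_path)
```

If the CSV has no header row, the first label would be on row 1 instead, and the starting row in make_tfrecord would have to change accordingly.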
After generating a TFRecord file, you will need batches of examples to feed into your network.
First, for simplicity, let's define a parser function that reads and parses one example from a TFRecord file.
def read_and_decode(file_name):
    # file_name is a list of paths in the format [file1, file2],
    # as sometimes you have multiple TFRecord files
    filename_queue = tf.train.string_input_producer(file_name)
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(serialized_example, features={
        # the label was written as a float_list, so it must be parsed as tf.float32
        'label': tf.FixedLenFeature([], tf.float32),
        'image': tf.FixedLenFeature([], tf.string),
    })
    image = tf.decode_raw(features['image'], tf.uint8)
    # [480, 752] because it is a gray-scale image; if the image was in RGB format
    # we should use [480, 752, 3] instead
    image = tf.reshape(image, [480, 752])
    image = tf.cast(image, tf.float32)
    label = tf.cast(features['label'], tf.int32)
    print(image)
    print(label)
    return image, label
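The two print calls show the static shape and dtype of each tensor. For a dataset of 224×224 RGB images (that is, with the reshape set to [224, 224, 3]), the output is:
Tensor("Cast:0", shape=(224, 224, 3), dtype=float32)
Tensor("Cast_1:0", shape=(), dtype=int32)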
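The parser has to be told the image shape because the stored bytes carry no shape information. The following sketch illustrates that round trip with NumPy standing in for both PIL and TensorFlow; the 2×3 array is a made-up example, and the comments only point out which TensorFlow call each step is analogous to.

```python
import numpy as np

# Hypothetical 2x3 gray-scale "image" with 8-bit pixels.
original = np.arange(6, dtype=np.uint8).reshape(2, 3)

# Serialising keeps only the pixel values, not the shape.
raw = original.tobytes()

# Analogous to tf.decode_raw(features['image'], tf.uint8): a flat vector.
flat = np.frombuffer(raw, dtype=np.uint8)

# Analogous to tf.reshape(image, [2, 3]): the shape must be supplied by us.
restored = flat.reshape(2, 3)

print(len(raw), flat.shape, np.array_equal(original, restored))
```

A reshape to dimensions whose product differs from the number of stored bytes fails, which is usually the first symptom of mixing up gray-scale and RGB images.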
But what if we want to actually see the image? If the decoded image matches the original, we know the TFRecord file was generated successfully.
In the last step we got a parsed image and label. To visualize the image, we need to run them in a session:
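with tf.Session() as sess:
    coord = tf.train.Coordinator()
    # the queue runners feed the pipeline; without them sess.run blocks
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    for i in range(5):
        img, l = sess.run([image, label])
        img = Image.fromarray(img.astype(np.uint8))
        # if the image is in RGB format
        # img = Image.fromarray(img.astype(np.uint8), mode='RGB')
        # save the image to where you want
        img.save('/home/' + str(i) + '_Label_' + str(l) + '.jpg')
    coord.request_stop()
    coord.join(threads)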
If you follow these steps, the images will be saved to the path you chose, with the label in each file name.
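A lighter check than inspecting saved images is to count the records in the file and compare that with the number of images. The sketch below walks the TFRecord framing directly (an 8-byte little-endian length, a 4-byte checksum of the length, the payload, and a 4-byte checksum of the payload) without needing TensorFlow. The helpers `count_tfrecords` and `frame` are hypothetical names, the checksums are skipped rather than verified, and the demo file is hand-built with dummy checksums.

```python
import os
import struct
import tempfile

def count_tfrecords(path):
    """Count records by walking the TFRecord framing (CRCs are not verified)."""
    count = 0
    with open(path, 'rb') as f:
        while True:
            header = f.read(8)
            if not header:
                break  # clean end of file
            if len(header) < 8:
                raise ValueError("truncated length header")
            (length,) = struct.unpack('<Q', header)
            f.seek(4, os.SEEK_CUR)   # skip the checksum of the length
            payload = f.read(length)
            footer = f.read(4)       # checksum of the payload, ignored here
            if len(payload) < length or len(footer) < 4:
                raise ValueError("truncated record")
            count += 1
    return count

def frame(payload):
    # Dummy zero checksums: enough for this counter, but a reader that
    # verifies checksums would be expected to reject them.
    return struct.pack('<Q', len(payload)) + b'\x00' * 4 + payload + b'\x00' * 4

# Hand-built demo file with three records.
with tempfile.NamedTemporaryFile(suffix='.tfrecord', delete=False) as f:
    f.write(frame(b'first') + frame(b'second record') + frame(b''))
    demo_path = f.name

n_records = count_tfrecords(demo_path)
print(n_records)
os.remove(demo_path)
```

On a file written by make_tfrecord, the count should equal the number of images in the folder; a mismatch or a "truncated record" error suggests the writer was not closed properly.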
To generate a shuffled batch of examples, we use tf.train.shuffle_batch:
example_batch, label_batch = tf.train.shuffle_batch(
    [image, label], batch_size=32, capacity=1000 + 64, min_after_dequeue=1000)
print(example_batch)
print(label_batch)
The two print calls output the following (again for 224×224 RGB images); note the leading batch dimension of 32:
Tensor("shuffle_batch:0", shape=(32, 224, 224, 3), dtype=float32)
Tensor("shuffle_batch:1", shape=(32,), dtype=int32)
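To make the role of min_after_dequeue more concrete, here is a simplified pure-Python model of a shuffling buffer. It is only a sketch of the idea, not TensorFlow's implementation: the function name `shuffle_batches` is hypothetical, capacity and threading are ignored, and unlike the queue-based pipeline above it stops when the input runs out and emits a smaller last batch.

```python
import random

def shuffle_batches(examples, batch_size, min_after_dequeue, seed=0):
    """Yield shuffled batches drawn from a bounded buffer (simplified model)."""
    rng = random.Random(seed)
    buffer, batch = [], []
    source = iter(examples)
    exhausted = False
    while True:
        # Refill until one element can be removed while still leaving
        # min_after_dequeue elements behind for mixing.
        while not exhausted and len(buffer) <= min_after_dequeue:
            try:
                buffer.append(next(source))
            except StopIteration:
                exhausted = True
        if not buffer:
            break
        # Dequeue a uniformly random element from the buffer.
        batch.append(buffer.pop(rng.randrange(len(buffer))))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # smaller final batch

batches = list(shuffle_batches(range(10), batch_size=4, min_after_dequeue=3))
print([len(b) for b in batches])
```

A larger min_after_dequeue mixes examples from further apart in the file, at the cost of more memory and a longer wait before the first batch is ready.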
Sometimes we don't want to shuffle the examples. In that case we can use tf.train.batch:
example_batch, label_batch = tf.train.batch(
    [image, label], batch_size=32, capacity=1000 + 64)
Now we can feed the batches to a predefined model.
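Unless otherwise noted, articles on this site are original. If you repost this article, please credit the source: Tensorflow : Sumary on TFrecord (how to create, use, test, and display a TFRecord) - Python技术站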