Converting images to numpy files

To save time during training, it can be useful to convert a dataset of images to numpy arrays, pre-process them (scaling, normalization, etc.) and then save them as one or more binary numpy files.

Each image should be read with the OpenCV imread function, which converts an image file (JPEG, PNG, etc.) into a numpy array. We will also reorder the color planes from the default BGR used by OpenCV to RGB:

import os
import cv2

# open image to numpy array and switch to RGB from BGR
img = cv2.imread(os.path.join(image_dir, image_name))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
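To make the channel reordering concrete, here is a minimal sketch that uses a small synthetic array in place of a decoded image (an assumption, so that it runs without an image file or OpenCV). For a 3-channel image, reversing the last axis gives the same channel order as the BGR to RGB conversion.

```python
import numpy as np

# Synthetic 2x2 "image" in BGR order, standing in for the output of cv2.imread.
# Every pixel is pure blue: B=255, G=0, R=0.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255

# Reverse the channel axis: BGR -> RGB.
rgb = bgr[..., ::-1]

print(bgr[0, 0].tolist())  # [255, 0, 0]  blue comes first in BGR
print(rgb[0, 0].tolist())  # [0, 0, 255]  blue comes last in RGB
```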

The image can then be pre-processed as required (resizing, normalization, mean subtraction, etc.) before being saved into a numpy (.npy) file:

np.save('image.npy', img)
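As an illustration of pre-processing before saving, the sketch below scales a synthetic image to the range 0.0 to 1.0, subtracts the per-channel mean, and checks that the array survives a save and load round trip. The random image and the temporary file path are assumptions made so the example runs on its own; the exact pre-processing steps depend on the model being trained.

```python
import os
import tempfile

import numpy as np

# Synthetic 4x4 RGB image standing in for a decoded image file (assumption).
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

# Scale pixels to the range 0.0 to 1.0, then subtract the per-channel mean.
scaled = img.astype(np.float32) / 255.0
centered = scaled - scaled.mean(axis=(0, 1), keepdims=True)

# Save to a .npy file and read it back; shape and dtype are preserved.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "image.npy")
    np.save(path, centered)
    restored = np.load(path)

print(restored.shape, restored.dtype)
```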

During training, the numpy file is then loaded like this:

# train_image is a numpy array
train_image = np.load('image.npy')

Converting every image into its own numpy file is not practical for a large dataset, especially since an image saved in numpy file format is considerably larger than its compressed JPEG equivalent. Instead, we need to pack as many images as possible into a single numpy file.
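A .npy file stores the raw array bytes plus a small header, so its size is easy to estimate from the array shape and dtype. The sketch below works this out for a single 224 x 224 x 3 image (the default dimensions used later in this document); the all-zero array is only a placeholder, since the content does not affect the uncompressed size.

```python
import io

import numpy as np

# One 224x224 RGB image as uint8 and as float32.
img_u8 = np.zeros((224, 224, 3), dtype=np.uint8)
img_f32 = img_u8.astype(np.float32)

print(img_u8.nbytes)   # 150528  (224 * 224 * 3 bytes)
print(img_f32.nbytes)  # 602112  (four bytes per value)

# Writing to an in-memory buffer shows the on-disk size: data plus a short header.
buf = io.BytesIO()
np.save(buf, img_f32)
file_size = buf.getbuffer().nbytes
print(file_size - img_f32.nbytes)  # header overhead in bytes
```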

We first create a placeholder array which will hold all of our images. Next, we create a loop that runs through all of the images in a folder, pre-processes them, and inserts them into the placeholder array. The placeholder array is then saved to a numpy file.
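The steps just described can be sketched as follows. The function load_and_preprocess is a hypothetical stand-in for reading and pre-processing one image, and the tiny image dimensions are chosen only to keep the example fast.

```python
import os
import tempfile

import numpy as np

num_images, height, width, channels = 8, 4, 4, 3  # illustrative sizes


def load_and_preprocess(index):
    """Hypothetical stand-in for reading and pre-processing one image."""
    img = np.full((height, width, channels), index, dtype=np.uint8)
    return img.astype(np.float32) / 255.0


# Placeholder array with one slot per image.
x = np.empty((num_images, height, width, channels), dtype=np.float32)

# Fill the placeholder one image at a time.
for i in range(num_images):
    x[i] = load_and_preprocess(i)

# Save the whole dataset as a single .npy file and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dataset.npy")
    np.save(path, x)
    x_loaded = np.load(path)

print(x_loaded.shape)  # (8, 4, 4, 3)
```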

To use the file during training, load it and then index into the resulting numpy array:

# np.load returns a numpy array
x_train = np.load('dataset.npy')

# fetch batches from training dataset
for i in range(num_of_batches):
    x_batch = x_train[i*batchsize:(i+1)*batchsize]
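When the number of images is not an exact multiple of the batch size, the last batch is simply shorter. The sketch below shows one way to compute num_of_batches so that no images are dropped, and also uses the mmap_mode argument of np.load so that the file is mapped read-only rather than read fully into memory. The small synthetic dataset and the temporary path are assumptions for illustration.

```python
import math
import os
import tempfile

import numpy as np

batchsize = 4

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write a small synthetic dataset of 10 "images" (assumption for illustration).
    path = os.path.join(tmp_dir, "dataset.npy")
    data = np.arange(10 * 2 * 2 * 3, dtype=np.float32).reshape(10, 2, 2, 3)
    np.save(path, data)

    # mmap_mode="r" maps the file read-only; slices are read from disk on demand.
    x_train = np.load(path, mmap_mode="r")

    # Round up so that the final, shorter batch is included.
    num_of_batches = math.ceil(len(x_train) / batchsize)

    batch_sizes = []
    for i in range(num_of_batches):
        # np.array copies the slice into an ordinary in-memory array.
        x_batch = np.array(x_train[i * batchsize:(i + 1) * batchsize])
        batch_sizes.append(len(x_batch))

    # Drop the memory map before the temporary directory is removed.
    del x_train

print(batch_sizes)  # [4, 4, 2]
```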

With np.save, we can write only one array to a numpy file. The np.savez function packs multiple arrays into a single file, which is very useful for storing both the training data and the labels together:

# placeholder arrays for data and labels
# data is float32, labels are integers
x = np.empty((len(imageList), height, width, channels), dtype=np.float32)
y = np.empty((len(imageList),), dtype=np.int32)

# loop through all images
for i in range(len(imageList)):
    # open image to numpy array
    img = cv2.imread(imageList[i])

    # do all the pre-processing…
    img = pre_process(img)

    # insert into placeholder arrays
    # (labelList is assumed to hold one integer label per image)
    x[i] = img
    y[i] = labelList[i]

# write placeholder arrays into a binary npz file
np.savez('dataset.npz', x=x, y=y)

For .npz files, the np.load function does not return a numpy array directly. It returns a dictionary-like object, and the arrays are unpacked by the names given to np.savez:

train_f = np.load('dataset.npz')
x_train = train_f['x']
y_train = train_f['y']
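The object returned by np.load for an .npz file lists the stored array names in its files attribute and can be used as a context manager so that the underlying file is closed afterwards. A minimal sketch, using a small synthetic dataset and a temporary path as assumptions:

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write a small synthetic dataset with data and labels.
    path = os.path.join(tmp_dir, "dataset.npz")
    np.savez(
        path,
        x=np.zeros((5, 2, 2, 3), dtype=np.float32),
        y=np.arange(5, dtype=np.int32),
    )

    # The file is closed automatically when the with-block exits.
    with np.load(path) as train_f:
        names = sorted(train_f.files)  # names passed as keywords to np.savez
        x_train = train_f["x"]
        y_train = train_f["y"]

print(names)  # ['x', 'y']
print(x_train.shape, y_train.shape)  # (5, 2, 2, 3) (5,)
```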

The resulting numpy arrays (x_train and y_train) are then used as shown earlier. The numpy files can also be saved in compressed format using the np.savez_compressed function:

np.savez_compressed('dataset.npz', x=x, y=y)

They need to be unpacked in the same way as the non-compressed .npz files.
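To see the effect of compression, the sketch below writes the same arrays with np.savez and np.savez_compressed and compares the file sizes. The all-zero data is an assumption chosen because it compresses extremely well; real image data will usually shrink far less.

```python
import os
import tempfile

import numpy as np

# Highly compressible synthetic data (assumption for illustration).
x = np.zeros((16, 32, 32, 3), dtype=np.float32)
y = np.arange(16, dtype=np.int32)

with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, "plain.npz")
    packed_path = os.path.join(tmp_dir, "packed.npz")

    np.savez(plain_path, x=x, y=y)
    np.savez_compressed(packed_path, x=x, y=y)

    plain_size = os.path.getsize(plain_path)
    packed_size = os.path.getsize(packed_path)

    # Compressed files are unpacked exactly like uncompressed ones.
    with np.load(packed_path) as f:
        same = np.array_equal(f["x"], x) and np.array_equal(f["y"], y)

print(plain_size, packed_size, same)
```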

The img_to_npy.py script in this repository shows how to read a list of files and labels (a shortened version of the ImageNet validation dataset index file), then pre-process the images by resizing, center-cropping and normalizing to the range 0.0 to 1.0. The labels can be one-hot encoded if required. The output file contains both data and labels.
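For readers unfamiliar with these steps, here is a minimal numpy-only sketch of center-cropping, normalizing and one-hot encoding. It is not the code from img_to_npy.py: the helper names center_crop and one_hot are hypothetical, the resizing step is omitted (it would typically be done with an image library before cropping), and the crop assumes the image is at least as large as the target size.

```python
import numpy as np


def center_crop(img, out_h, out_w):
    """Cut an out_h x out_w window from the middle of an H x W x C image."""
    h, w = img.shape[:2]
    top = (h - out_h) // 2
    left = (w - out_w) // 2
    return img[top:top + out_h, left:left + out_w]


def one_hot(labels, classes):
    """Turn integer labels into rows of a classes-wide identity matrix."""
    return np.eye(classes, dtype=np.float32)[labels]


# Synthetic 6x8 white image standing in for a decoded file (assumption).
img = np.full((6, 8, 3), 255, dtype=np.uint8)

cropped = center_crop(img, 4, 4)
normalized = cropped.astype(np.float32) / 255.0  # range 0.0 to 1.0
encoded = one_hot(np.array([0, 2]), classes=3)

print(cropped.shape)     # (4, 4, 3)
print(encoded.tolist())  # [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
```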

The complete list of command line arguments of img_to_npy.py is as follows:

Argument	Default	Description
--image_dir	image_dir	Path to folder containing images
--label_file	val.txt	Path to text file that matches labels to images
--classes	1000	Total number of classes
--resize	False	Resize & center-crop all images to input_height x input_width
--normalize	False	Normalize pixels to range 0.0 to 1.0
--one_hot	False	One-hot encode the labels if set
--compress	False	Compress the output file if set
--output_file	dataset.npz	Path to output file
--input_height	224	See note 2 below
--input_width	224	See note 2 below
--input_chans	3	Number of channels in input image

Notes:

1. If --normalize is specified, the x array will be of float32 type, otherwise uint8 is used.
2. If --resize is specified, the images will be resized and center-cropped to input_height x input_width. If --resize is not specified, the images must all be of dimensions input_height x input_width before running the script.
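The sketch below shows how flags like those in the table might be declared with argparse. It is an illustration, not the actual code of img_to_npy.py: treating the boolean options as store_true switches is an assumption based on their False defaults and the "if set" wording, and the build_parser helper is hypothetical.

```python
import argparse


def build_parser():
    """Declare flags mirroring the table above (a sketch, not the script itself)."""
    p = argparse.ArgumentParser(description="Convert images to a numpy file")
    p.add_argument("--image_dir", default="image_dir", help="Path to folder containing images")
    p.add_argument("--label_file", default="val.txt", help="Text file that matches labels to images")
    p.add_argument("--classes", type=int, default=1000, help="Total number of classes")
    p.add_argument("--resize", action="store_true", help="Resize and center-crop all images")
    p.add_argument("--normalize", action="store_true", help="Normalize pixels to range 0.0 to 1.0")
    p.add_argument("--one_hot", action="store_true", help="One-hot encode the labels")
    p.add_argument("--compress", action="store_true", help="Compress the output file")
    p.add_argument("--output_file", default="dataset.npz", help="Path to output file")
    p.add_argument("--input_height", type=int, default=224, help="Input image height")
    p.add_argument("--input_width", type=int, default=224, help="Input image width")
    p.add_argument("--input_chans", type=int, default=3, help="Number of channels in input image")
    return p


# Parse an example command line instead of sys.argv.
args = build_parser().parse_args(["--resize", "--normalize"])

# Per note 1: float32 when normalizing, otherwise uint8.
x_dtype = "float32" if args.normalize else "uint8"

print(args.input_height, args.input_width, x_dtype)  # 224 224 float32
```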
