This repository has been archived by the owner on Jul 5, 2021. It is now read-only.

Batch size changes output with same images #238

RorryB opened this issue May 7, 2020 · 0 comments

Bug report

Information

Please specify the following information when submitting an issue:

  • What are your command line arguments?:
    CUDA_VISIBLE_DEVICES=0 python -m pdb train.py --num_epochs 301 --continue_training false --dataset dataset --crop_height 352 --crop_width 480 --batch_size 4 --num_val_images 100 --model DeepLabV3_plus --frontend ResNet50

  • Have you written any custom code?:
    I disabled data augmentation by adding "return input_image, output_image" at the very beginning of the function, and I deleted an empty line so that later line numbers (used for breakpoints) stay unchanged. I also tried both is_training=False and is_training=True.

  • What have you done to try and solve this issue?:
    Googled why this might happen. Tried other models.

  • TensorFlow version?:
    '1.13.1'
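For reference, the no-op augmentation override described above amounts to the following (a sketch of the early return; the real data_augmentation in train.py takes the same two arguments, with the crop/flip logic left unreachable below it):

```python
def data_augmentation(input_image, output_image):
    # Early return added at the top of the function: skip all
    # augmentation so every run sees byte-identical inputs.
    return input_image, output_image
```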

Describe the problem

When calling sess.run, the network produces different outputs for the same image depending on the size of the batch that image was included in.
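One way such behaviour can arise (an illustrative sketch, not a diagnosis of this repository): any layer that normalizes over the batch axis in training mode makes an image's output depend on its batch-mates. A minimal NumPy analogue of batch norm with is_training=True:

```python
import numpy as np

def batch_norm_train(x, eps=1e-5):
    # Normalize with statistics computed over the batch axis,
    # as batch normalization does in training mode.
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
sample = rng.normal(size=(1, 4))   # the image under test
others = rng.normal(size=(3, 4))   # the rest of the batch

alone = batch_norm_train(sample)                                  # batch of 1
in_batch = batch_norm_train(np.concatenate([sample, others]))[:1]  # batch of 4

# Same input row, different output: the batch statistics differ.
print(np.abs(alone - in_batch).max())
```

In a batch of one, the sample equals its own mean, so the normalized output collapses to zero; inside a larger batch it does not.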

Source code / logs

Running under pdb, the problem can be reproduced from a fresh checkout. I originally found it while implementing batch inference in predict.py, but doing it in train.py is the quickest way for you to reproduce it.
(Pdb) break train.py:197
...
(Pdb) output_image_last = sess.run(network,feed_dict={net_input:np.expand_dims(input_image, axis=0)})
(Pdb) output_images = sess.run(network,feed_dict={net_input:input_image_batch})
(Pdb) (input_image - input_image_batch[3]).max()
0.0
(Pdb) (output_image_last - output_images[3]).max()
1.0644385
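A difference of 1.06 is far above floating-point noise. For comparisons like the one above, a tolerance-aware check (a small helper sketch, not part of this repository) separates genuine divergence from rounding error:

```python
import numpy as np

def report_difference(a, b, atol=1e-5):
    # Return the worst absolute difference between two outputs and
    # whether it is small enough to be mere floating-point noise.
    diff = float(np.abs(np.asarray(a) - np.asarray(b)).max())
    return diff, diff <= atol
```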

The following is another set of commands, also run from the breakpoint at line 197, that you can copy-paste quickly; for these you must remove data augmentation. They build batches of size 2 and 4 inside pdb and verify that the same input images produce different outputs depending on batch size.

output_image_last_alone = sess.run(network,feed_dict={net_input:np.expand_dims(input_image, axis=0)})
output_images_orig4 = sess.run(network,feed_dict={net_input:input_image_batch})

# Build a manual batch of size 2
input_image_batch_manual2 = []

index = i * args.batch_size + j-1
id = id_list[index]
input_image2 = utils.load_image(train_input_names[id])
output_image2 = utils.load_image(train_output_names[id])

index = i * args.batch_size + j
id = id_list[index]
input_image3 = utils.load_image(train_input_names[id])
output_image3 = utils.load_image(train_output_names[id])
input_image2, output_image2 = data_augmentation(input_image2, output_image2)
input_image3, output_image3 = data_augmentation(input_image3, output_image3)
input_image2 = np.float32(input_image2) / 255.0
input_image3 = np.float32(input_image3) / 255.0
input_image_batch_manual2.append(np.expand_dims(input_image2, axis=0))
input_image_batch_manual2.append(np.expand_dims(input_image3, axis=0))
input_image_batch_manual2 = np.squeeze(np.stack(input_image_batch_manual2, axis=1))
output_images_batch2 = sess.run(network,feed_dict={net_input:input_image_batch_manual2})

# Build a manual batch of size 4
input_image_batch_manual4 = []
index = i * args.batch_size + j-3
id = id_list[index]
input_image0 = utils.load_image(train_input_names[id])
output_image0 = utils.load_image(train_output_names[id])

index = i * args.batch_size + j-2
id = id_list[index]
input_image1 = utils.load_image(train_input_names[id])
output_image1 = utils.load_image(train_output_names[id])
input_image0, output_image0 = data_augmentation(input_image0, output_image0)
input_image1, output_image1 = data_augmentation(input_image1, output_image1)
input_image0 = np.float32(input_image0) / 255.0
input_image1 = np.float32(input_image1) / 255.0
input_image_batch_manual4.append(np.expand_dims(input_image0, axis=0))
input_image_batch_manual4.append(np.expand_dims(input_image1, axis=0))
index = i * args.batch_size + j-1
id = id_list[index]
input_image2 = utils.load_image(train_input_names[id])
output_image2 = utils.load_image(train_output_names[id])

index = i * args.batch_size + j
id = id_list[index]
input_image3 = utils.load_image(train_input_names[id])
output_image3 = utils.load_image(train_output_names[id])
input_image2, output_image2 = data_augmentation(input_image2, output_image2)
input_image3, output_image3 = data_augmentation(input_image3, output_image3)
input_image2 = np.float32(input_image2) / 255.0
input_image3 = np.float32(input_image3) / 255.0
input_image_batch_manual4.append(np.expand_dims(input_image2, axis=0))
input_image_batch_manual4.append(np.expand_dims(input_image3, axis=0))
input_image_batch_manual4 = np.squeeze(np.stack(input_image_batch_manual4, axis=1))
output_images_batch4 = sess.run(network,feed_dict={net_input:input_image_batch_manual4})

(input_image - input_image_batch[3]).max() #input image is the 4th image in the batch
(input_image - input_image_batch_manual2[1]).max() #input image is the 2nd image in this manually loaded batch loaded in pdb
(input_image - input_image_batch_manual4[3]).max() #input image is the 4th image in this manually loaded batch loaded in pdb

(output_image_last_alone - output_images_orig4[3]).max() #the single batch run produces a different output
(output_image_last_alone - output_images_batch2[1]).max() #the single batch run produces a different output
(output_image_last_alone - output_images_batch4[3]).max() #the single batch run produces a different output
(output_images_batch2[1] - output_images_batch4[3]).max() #batch size 2 produces different output than batch size 4

(output_images_orig4 - output_images_batch4).max() #the manually loaded batch produces the same output as the original batch
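The checks above can be folded into a generic helper (a sketch; here `predict` stands in for a wrapper around sess.run with the appropriate feed_dict, e.g. `lambda x: sess.run(network, feed_dict={net_input: x})`):

```python
import numpy as np

def max_batch_discrepancy(predict, images):
    """Compare each image's output when run alone vs. inside the full
    batch; return the worst absolute difference across the batch."""
    batched = predict(images)
    return max(
        float(np.abs(predict(img[None])[0] - batched[i]).max())
        for i, img in enumerate(images)
    )
```

For a batch-invariant network (e.g. inference with frozen statistics) this should be ~0 up to float noise; a value like the 1.06 above means the output depends on batch composition.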
