random crop function #5258

Open
1 task done
DAVID-Hown opened this issue Dec 28, 2023 · 10 comments

DAVID-Hown commented Dec 28, 2023

Describe the question.

eii = ExternalInputIterator(batch_size)
pipe = Pipeline(batch_size=batch_size, num_threads=2, device_id=3)
with pipe:
    jpegs = fn.external_source(source=eii, num_outputs=1, dtype=types.UINT8)
    decode = fn.decoders.image(jpegs, device="mixed", output_type=types.RGB)
    pipe.set_outputs(crop_patch)

How do I add a random crop to 224x224 between the decode step and the output?

Check for duplicates

  • I have searched the open bugs/issues and have found no duplicates for this bug report
DAVID-Hown added the question label Dec 28, 2023
Contributor

JanuszL commented Dec 28, 2023

Hi @DAVID-Hown,

Thank you for reaching out.
You can just:

crop_patch= fn.crop_mirror_normalize(decode,
                                     crop_h=224,
                                     crop_w=224,
                                     crop_pos_x=fn.random.uniform(range=(0., 1.)),
                                     crop_pos_y=fn.random.uniform(range=(0., 1.)))
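
As a rough illustration of what the normalized crop_pos_x / crop_pos_y values mean: according to the DALI documentation, the top-left corner of the crop window is placed at crop_pos * (image_size - crop_size). The sketch below reproduces only that arithmetic in plain Python. The helper name `crop_window` and the 480x640 image size are assumptions made up for this example, and DALI's exact rounding may differ.

```python
import random


def crop_window(img_h, img_w, crop_h, crop_w, pos_y, pos_x):
    """Map normalized positions in [0, 1] to a pixel crop window.

    Hypothetical helper, not part of DALI. It mirrors the documented
    formula: offset = pos * (image_size - crop_size).
    """
    if crop_h > img_h or crop_w > img_w:
        raise ValueError("crop is larger than the image")
    top = int(pos_y * (img_h - crop_h))
    left = int(pos_x * (img_w - crop_w))
    # Return (top, left, bottom, right) in pixels.
    return top, left, top + crop_h, left + crop_w


# Assumed 480x640 image; positions drawn uniformly from [0, 1),
# similar in spirit to fn.random.uniform(range=(0., 1.)).
rng = random.Random(0)
window = crop_window(480, 640, 224, 224, rng.random(), rng.random())
print(window)
```

A position of 0.0 pins the window to the top or left edge and 1.0 pins it to the bottom or right edge, so any value in between keeps the 224x224 window fully inside the image.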

JanuszL self-assigned this Dec 28, 2023
Author

DAVID-Hown commented Dec 28, 2023

@JanuszL Thank you, but I get an error: TypeError: Invalid shape (3, 224, 224) for image data

img = batch_cpu.at(0)
print(img.shape)  # (3, 224, 224)
plt.imshow(img)
pylab.show()

Maybe the format is not right. This is my code:

import collections
import numpy as np
from random import shuffle
from nvidia.dali.pipeline import Pipeline
import nvidia.dali.fn as fn
import nvidia.dali.types as types
import matplotlib.pyplot as plt
import pylab

batch_size = 16


class ExternalInputIterator(object):
    def __init__(self, batch_size, image_path):
        self.image_path = image_path
        self.batch_size = batch_size

    def __iter__(self):
        self.i = 0
        return self

    def __next__(self):
        batch = []
        for _ in range(self.batch_size):
            # Read the raw encoded JPEG bytes; DALI decodes them later.
            with open(self.image_path, 'rb') as f:
                batch.append(np.frombuffer(f.read(), dtype=np.uint8))
            self.i = (self.i + 1) % self.batch_size
        return (batch,)


img_path = "P22040562823834_mark_0.jpg"
eii = ExternalInputIterator(batch_size, img_path)
pipe = Pipeline(batch_size=batch_size, num_threads=2, device_id=3)
with pipe:
    jpegs = fn.external_source(source=eii, num_outputs=1, dtype=types.UINT8)
    decode = fn.decoders.image(jpegs, device="mixed", output_type=types.RGB)
    crop_patch = fn.crop_mirror_normalize(decode,
                                          crop_h=224,
                                          crop_w=224,
                                          crop_pos_x=fn.random.uniform(range=(0., 1.)),
                                          crop_pos_y=fn.random.uniform(range=(0., 1.)))
    pipe.set_outputs(crop_patch)

pipe.build()
pipe_out = pipe.run()
batch_cpu = pipe_out[0].as_cpu()

img = batch_cpu.at(0)

I plan to feed in a single image, use the DALI library to speed up I/O, and directly get a tensor with batch_size=16. Do you have any suggestions?

Contributor

JanuszL commented Dec 28, 2023

If you want to display it, you need to change the layout to HWC. You can do it like this:

crop_patch= fn.crop_mirror_normalize(decode,
                                     crop_h=224,
                                     crop_w=224,
                                     crop_pos_x=fn.random.uniform(range=(0., 1.)),
                                     crop_pos_y=fn.random.uniform(range=(0., 1.)),
                                     output_layout="HWC")
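
For context, matplotlib's imshow expects a color image as (height, width, channels), while crop_mirror_normalize outputs channels-first (CHW) by default, which is why the (3, 224, 224) array was rejected. The plain-Python sketch below shows what the layout change does to a tiny 3x2x2 image stored as nested lists. `chw_to_hwc` is a hypothetical helper written for this example; with a NumPy array the equivalent would be `img.transpose(1, 2, 0)`.

```python
def chw_to_hwc(img):
    """Reorder a nested-list image from [channel][row][col] to [row][col][channel]."""
    channels, height, width = len(img), len(img[0]), len(img[0][0])
    return [
        [[img[c][y][x] for c in range(channels)] for x in range(width)]
        for y in range(height)
    ]


# Tiny made-up image: 3 channels, 2 rows, 2 columns.
chw = [
    [[1, 2], [3, 4]],        # R plane
    [[11, 12], [13, 14]],    # G plane
    [[21, 22], [23, 24]],    # B plane
]
hwc = chw_to_hwc(chw)

# Each innermost list is now one pixel's (R, G, B) triple.
print(hwc[0][0])
print(len(hwc), len(hwc[0]), len(hwc[0][0]))
```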

Author

DAVID-Hown commented Dec 28, 2023

@JanuszL
My code is the one above. At inference time I want to feed in one image and get one result. To keep the model accurate, I need to cut 16 random 224x224 patches from the image and feed them into the model; the model outputs a score for each patch, and I take the average of all the scores as the final result. I am not familiar with this library, so I would like to ask for your advice.
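
The procedure described here (sample 16 random patches, score each, average the scores) can be sketched independently of DALI. The following is a minimal plain-Python outline under stated assumptions: `random_crop_offsets`, `average_patch_score`, and `dummy_score` are hypothetical names, the 480x640 image size is made up, and `dummy_score` is only a stand-in for a real model.

```python
import random
from statistics import mean


def random_crop_offsets(img_h, img_w, crop, num_patches, seed=None):
    """Pick top-left corners for square crops that fit inside the image."""
    rng = random.Random(seed)
    return [
        (rng.randint(0, img_h - crop), rng.randint(0, img_w - crop))
        for _ in range(num_patches)
    ]


def average_patch_score(offsets, score_patch):
    """Score every patch and return the mean of the per-patch scores."""
    return mean(score_patch(top, left) for top, left in offsets)


def dummy_score(top, left):
    # Stand-in for a model call on the patch at (top, left); always 1.0.
    return 1.0


offsets = random_crop_offsets(480, 640, crop=224, num_patches=16, seed=0)
final_score = average_patch_score(offsets, dummy_score)
print(len(offsets), final_score)
```

In a real pipeline the cropping would happen on the GPU and the scoring function would run the model on the cropped patch; only the sample-then-average structure carries over.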

Author

DAVID-Hown commented Dec 28, 2023

@JanuszL Hi, with batch size = 16 I get a TensorListGPU with 16 samples. How do I stack them into a 16x224x224x3 tensor that PyTorch supports?

Contributor

JanuszL commented Dec 28, 2023

Hi,

You can call crop/slice operator 16 times and stack the results using cat/stack operators.

Author

DAVID-Hown commented Dec 28, 2023

@JanuszL How can I convert pipe_out directly into a 16x224x224x3 tensor, in parallel?

Contributor

JanuszL commented Dec 28, 2023

How about:

decode = fn.decoders.image(jpegs, device="mixed", output_type=types.RGB)
crop_patch = []
for _ in range(16):
    crop_patch.append(fn.crop_mirror_normalize(decode,
                                               crop_h=224,
                                               crop_w=224,
                                               crop_pos_x=fn.random.uniform(range=(0., 1.)),
                                               crop_pos_y=fn.random.uniform(range=(0., 1.)),
                                               output_layout="HWC"))
# Stack once, after the loop, so each sample becomes 16x224x224x3.
crop_patchs = fn.stack(*crop_patch)
pipe.set_outputs(crop_patchs)

Author

DAVID-Hown commented Dec 28, 2023

@JanuszL

crop_patch = fn.crop_mirror_normalize(decode,
                                          device="gpu",
                                          crop_h=224, crop_w=224,
                                          crop_pos_x=fn.random.uniform(range=(0., 1.)),
                                          crop_pos_y=fn.random.uniform(range=(0., 1.)),
                                          dtype=types.FLOAT,
                                          output_layout="CHW",
                                          mean=[0.485, 0.456, 0.406],
                                          std=[0.229, 0.224, 0.225])

I found that the function did not normalize my input

Contributor

JanuszL commented Dec 28, 2023

The decoded image values are in the 0-255 range, so if your mean and std assume a 0-1 range, please scale them by 255:

        mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
        std=[0.229 * 255, 0.224 * 255, 0.225 * 255],
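
To see why scaling by 255 works: subtracting mean * 255 and dividing by std * 255 on a 0-255 pixel gives the same result as first dividing the pixel by 255 and then applying the unscaled mean and std. The plain-Python check below is a small sketch of that arithmetic; the function names and the sample pixel values are made up for illustration.

```python
import math

MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_unit_range(pixel, channel):
    """Scale a 0-255 pixel to 0-1 first, then apply the unscaled mean/std."""
    return (pixel / 255.0 - MEAN[channel]) / STD[channel]


def normalize_uint8_range(pixel, channel):
    """Apply mean/std that were pre-multiplied by 255 to the raw 0-255 pixel."""
    return (pixel - MEAN[channel] * 255) / (STD[channel] * 255)


# Compare both routes for a few sample pixel values on every channel.
results_match = all(
    math.isclose(normalize_unit_range(p, c), normalize_uint8_range(p, c))
    for p in (0, 128, 255)
    for c in range(3)
)
print(results_match)
```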
