This repository has been archived by the owner on Nov 21, 2023. It is now read-only.

About SCALE and MAX_SIZE #584

Closed
lilichu opened this issue Jul 23, 2018 · 3 comments

Comments

lilichu commented Jul 23, 2018

All images in my dataset are 720 * 1280 or 1280 * 720 (height * width). In the yaml:

  SCALES: (800,)
  MAX_SIZE: 1333

Does it mean the images will be resized to 800 * 1280 and 1280 * 800?

@ppwwyyxx
Contributor

No, it will be resized to 750x1333 (or 1333x750 for the 1280 * 720 images). See

def prep_im_for_blob(im, pixel_means, target_size, max_size):
    """Prepare an image for use as a network input blob. Specially:
      - Subtract per-channel pixel mean
      - Convert to float32
      - Rescale to each of the specified target size (capped at max_size)
    Returns a list of transformed images, one for each target size. Also returns
    the scale factors that were used to compute each returned image.
    """
    im = im.astype(np.float32, copy=False)
    im -= pixel_means
    im_shape = im.shape
    im_size_min = np.min(im_shape[0:2])
    im_size_max = np.max(im_shape[0:2])
    im_scale = float(target_size) / float(im_size_min)
    # Prevent the biggest axis from being more than max_size
    if np.round(im_scale * im_size_max) > max_size:
        im_scale = float(max_size) / float(im_size_max)
    im = cv2.resize(
        im,
        None,
        None,
        fx=im_scale,
        fy=im_scale,
        interpolation=cv2.INTER_LINEAR
    )
    return im, im_scale
for the implementation.
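
To check the numbers, the scale logic can be reproduced with plain arithmetic. This is only a minimal sketch: `resized_shape` is a hypothetical helper, not part of Detectron, and rounding the output dimensions to the nearest integer is an assumption about how the resize picks its output size.

```python
def resized_shape(height, width, target_size, max_size):
    """Return (new_height, new_width, scale) under the SCALES / MAX_SIZE rule.

    Hypothetical helper that mirrors the scale computation shown above.
    """
    size_min = min(height, width)
    size_max = max(height, width)
    # First try to bring the shorter side to target_size.
    scale = float(target_size) / float(size_min)
    # If that pushes the longer side past max_size, cap on the longer side instead.
    if round(scale * size_max) > max_size:
        scale = float(max_size) / float(size_max)
    # Assumption: output dimensions are rounded to the nearest integer.
    return int(round(height * scale)), int(round(width * scale)), scale


print(resized_shape(720, 1280, 800, 1333)[:2])  # (750, 1333)
print(resized_shape(1280, 720, 800, 1333)[:2])  # (1333, 750)
```

With `SCALES: (800,)` the shorter side would become 800 and the longer side about 1422, which exceeds `MAX_SIZE: 1333`. The cap on the longer side therefore wins, and the shorter side ends up at 750 rather than 800.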

ir413 closed this as completed Jul 23, 2018
@chelixuan

Hi! The images in MS COCO have different aspect ratios. If we rescale them so that the shorter side is 800, the longer sides still differ, so the training images end up with different sizes. But the input images should all be the same size. How can I solve this problem? Thanks! @ppwwyyxx

@ppwwyyxx
Contributor

def im_list_to_blob(ims):
    """Convert a list of images into a network input. Assumes images were
    prepared using prep_im_for_blob or equivalent: i.e.
      - BGR channel order
      - pixel means subtracted
      - resized to the desired input size
      - float32 numpy ndarray format
    Output is a 4D NCHW tensor of the images concatenated along axis 0.
    """
    if not isinstance(ims, list):
        ims = [ims]
    max_shape = np.array([im.shape for im in ims]).max(axis=0)
    # Pad the image so they can be divisible by a stride
    if cfg.FPN.FPN_ON:
        stride = float(cfg.FPN.COARSEST_STRIDE)
        max_shape[0] = int(np.ceil(max_shape[0] / stride) * stride)
        max_shape[1] = int(np.ceil(max_shape[1] / stride) * stride)
    num_images = len(ims)
    blob = np.zeros(
        (num_images, max_shape[0], max_shape[1], 3), dtype=np.float32
    )
    for i in range(num_images):
        im = ims[i]
        blob[i, 0:im.shape[0], 0:im.shape[1], :] = im
    # Move channels (axis 3) to axis 1
    # Axis order will become: (batch elem, channel, height, width)
    channel_swap = (0, 3, 1, 2)
    blob = blob.transpose(channel_swap)
    return blob
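
In other words, images in a batch do not need to have the same size beforehand: the function above zero-pads each one at the bottom and right up to the largest height and width in the batch. A small self-contained sketch of that padding follows. `pad_to_batch` is a hypothetical stand-in that takes the stride as an argument instead of reading `cfg`, and the stride of 32 is an assumed value for illustration.

```python
import numpy as np


def pad_to_batch(ims, stride=32):
    """Zero-pad HxWx3 images to a common shape and stack them as NCHW.

    Hypothetical stand-in for the padding logic above; stride=32 is assumed.
    """
    max_shape = np.array([im.shape for im in ims]).max(axis=0)
    # Round the common height and width up to a multiple of the stride.
    max_h = int(np.ceil(max_shape[0] / float(stride)) * stride)
    max_w = int(np.ceil(max_shape[1] / float(stride)) * stride)
    blob = np.zeros((len(ims), max_h, max_w, 3), dtype=np.float32)
    for i, im in enumerate(ims):
        # Each image sits in the top-left corner; the rest stays zero.
        blob[i, :im.shape[0], :im.shape[1], :] = im
    # (batch, height, width, channel) -> (batch, channel, height, width)
    return blob.transpose(0, 3, 1, 2)


# One landscape and one portrait image, already resized to 750x1333 / 1333x750.
ims = [
    np.ones((750, 1333, 3), dtype=np.float32),
    np.ones((1333, 750, 3), dtype=np.float32),
]
blob = pad_to_batch(ims)
print(blob.shape)  # (2, 3, 1344, 1344)
```

Both images land in a single 1344x1344 canvas (1333 rounded up to the next multiple of 32). The region outside each original image is filled with zeros.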
