
Add ImageFolder dataloader #600

Draft · wants to merge 10 commits into master
Conversation

@Reep236 (Contributor) commented Aug 2, 2021

Typed and untyped ImageFolder dataloaders. These are very similar to the MNIST loader for now: labels are based on subdirectories, and images are loaded as tensors. No transforms or other features from torchvision.datasets.ImageFolder yet.

Aside from that,

  1. Various image-processing functions were added to Torch.Typed.Vision. Many of these are only marginally nicer than using withTensor and toDynamic; given that very little can actually be done at load time, writing at least verifies the DType and shape.
  2. The untyped functional module received the version of UpsampleNearest2d used by the typed counterpart, as a prime variant.
  3. Permute was added to typed functional.
  4. A version of stack which accepts a standard list of tensors was added to typed functional as a prime variant.
  5. Typed Upsample2D was switched from NCWH to NCHW.
  6. Interpolate and Resize functions were added. These were originally created as part of the reverse-engineering process from PyTorch, but eventually became useful in their own right, since they accept full input and output shapes, not just H and W.
  7. contiguous, isContiguous, and withTensorPtr were added for typed tensors. Once again, this mainly avoids a use of toDynamic.
  8. A spec was added for ImageFolder in VisionSpec, comparing MNIST's asTensor output with a saved MNIST image.
  9. DocTests were added for typed stack' and permute.

I would greatly appreciate any feedback. Unlike with convolutions, I didn't have much of a skeleton for a lot of this, so I wouldn't be surprised if I broke some conventions unknowingly.
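For illustration, the new typed helpers mentioned above (permute and the list-accepting stack') might be used roughly like this. This is a hedged sketch: the shapes, type-application order, and exact signatures are assumptions, not taken from the diff.

```haskell
-- Sketch only: assumes permute takes the dimension ordering as a
-- type-level list, and stack' takes the stacking dim, element shape,
-- and result shape as type applications.
chwToHwc ::
  Tensor device 'D.Float '[3, 32, 32] ->
  Tensor device 'D.Float '[32, 32, 3]
chwToHwc = permute @'[1, 2, 0]

batchOf4 ::
  -- the caller must supply exactly 4 tensors; nothing checks this
  [Tensor device 'D.Float '[3, 32, 32]] ->
  Tensor device 'D.Float '[4, 3, 32, 32]
batchOf4 = stack' @0 @'[3, 32, 32] @'[4, 3, 32, 32]
```

The unchecked list length in batchOf4 is exactly the safety concern raised in the review below.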

@tscholak (Member) commented Aug 2, 2021

Excellent work! I hope we can bring this to the gradually typed API as well 😊

[Tensor device dtype preShape] ->
-- | output
Tensor device dtype shape
stack' tensors = case someNatVal (fromIntegral $ length tensors) of
Member:

Maybe I'm missing something here, but it seems that the output shape is not computed from the input shape at all, is it? So it's up to the user to know exactly how many elements the list has and do the shape computation by hand?
I understand that there can be a desire to do this, but it is a very unsafe thing to do. A better way would be to use vecStack from below. You can first convert your list to a sized vector -- which can fail: https://hackage.haskell.org/package/vector-sized-1.4.4/docs/Data-Vector-Generic-Sized.html#v:fromList -- and then use vecStack on it safely.
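The suggested safe pattern might look like this. A sketch: stackSafe is a hypothetical name, and whether vecStack takes the stacking dimension as a type application is an assumption here.

```haskell
import qualified Data.Vector.Sized as VS

-- Sketch: VS.fromList returns Nothing when the list length does not
-- match the type-level n, so the length claim is checked at runtime
-- rather than simply asserted by the caller.
stackSafe ::
  forall n device dtype shape.
  KnownNat n =>
  [Tensor device dtype shape] ->
  Maybe (Tensor device dtype (n ': shape))
stackSafe ts = vecStack @0 <$> VS.fromList @n ts
```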

Tensor device 'D.Float shape ->
Tensor device 'D.Float newShape
interpolate2d alignCorners tensor
| symbolVal (Proxy @mode) == "nearest" = unsafePerformIO $ (ATen.cast2 ATen.Managed.upsample_nearest2d_tl) tensor ([h, w] :: [Int])
Member:

Did you forget to compute newShape here? It seems to be completely undetermined.

Can I also ask you not to use a Symbol for case splitting here? There are two alternatives that I find more appropriate:

  • split the function into two and name them appropriately;
  • use a sum type, e.g. Interpolate2dMode = Interpolate2dNearest | Interpolate2dBilinear, and add it to the arguments.

Can you also add a sum type for alignCorners? E.g. Interpolate2dAlignCorners = Interpolate2dWithAlignedCorners | Interpolate2dWithoutAlignedCorners.
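The sum-type alternative could be sketched like this (constructor names as proposed in the comment; how interpolate2d would consume the values is purely illustrative):

```haskell
-- Case splitting then happens on ordinary values instead of a
-- type-level Symbol, so an invalid mode string cannot be written.
data Interpolate2dMode
  = Interpolate2dNearest
  | Interpolate2dBilinear

data Interpolate2dAlignCorners
  = Interpolate2dWithAlignedCorners
  | Interpolate2dWithoutAlignedCorners

-- Illustrative call shape:
-- interpolate2d Interpolate2dNearest Interpolate2dWithoutAlignedCorners t
```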

Bool ->
Tensor device dtype shape ->
Tensor device dtype newShape
resize2d alignCorners = toDType @dtype @'D.Float . interpolate2d @newShape @mode @w @h alignCorners . toDType @'D.Float @dtype
Member:

Similar comments as for interpolate2d. I think you may have intended interpolate2d as an internal helper? If so, please move it into a where clause within the scope of resize2d.
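A sketch of that refactor, with the signature and constraints elided (they come from the surrounding code):

```haskell
-- Sketch: the public function keeps the dtype round-trip, and the
-- interpolation step lives in a where clause instead of being exported.
resize2d alignCorners =
  toDType @dtype @'D.Float . interpolate . toDType @'D.Float @dtype
  where
    -- formerly the top-level interpolate2d
    interpolate = interpolate2d @newShape @mode @w @h alignCorners
```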

indexes = [ix * n .. (ix+1) * n - 1]
(imgs, labels) = case someShape indexes of
SomeShape (Proxy :: Proxy idxs) ->
(getFolderImages @idxs folderData,
Member:

I'm not sure how this type checks. SomeShape (Proxy :: Proxy idxs) is an existentially quantified shape without constraints. Yet you are using getFolderImages @idxs with the '[n, 3, w, h] ~ idxs constraint here 🤔

@Reep236 (Contributor, author):

The shape is still derived by GHC from the instance declaration and the type signature of getItem, I believe. I originally had all the type applications in place, which would have made this clearer, but I cleaned it up before committing. idxs represents the actual indices of the image folder in this case, not the output shape of the tensor.

That's one thing that's probably a holdover from bodging code together for myself. I treat SomeShape and shapeVal as a general SomeNatList and natListVal, since they function fine as such. It's the same with permute: even though the permute dims aren't exactly a shape but an ordering of dimensions, I still use KnownShape and shapeVal. That's not great practice in an actual repository; I'll fix it for clarity.

FolderData ->
Tensor device dtype shape
getFolderImages folder = squeezeDim @0
. stack' @1 @'[1, 3, w, h] @(1 ': shape)
Member:

Ok, I see now why you want stack'. Alternative suggestion:
use vecStack . fromJust . Data.Vector.Sized.fromList here. At least this way it's explicit that it can crash.
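Concretely, the suggestion amounts to something like this (a sketch; the surrounding type applications are omitted, and the pipeline's exact types are assumed from context):

```haskell
import Data.Maybe (fromJust)
import qualified Data.Vector.Sized as VS

-- Sketch: the partiality is now visible at the call site -- fromJust
-- will crash if the list length does not match the type-level size --
-- instead of being hidden inside stack'.
stackImages = vecStack @0 . fromJust . VS.fromList
```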

getFolderImages folder = squeezeDim @0
. stack' @1 @'[1, 3, w, h] @(1 ': shape)
. concat
$ zipWith3 (\path names idxs -> map (unsafePerformIO . getFolderImage @'[1, 3, w, h] path names) idxs)
Member:

please don't use unsafePerformIO when you're doing file IO.
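One way to keep the file IO in IO is to traverse over the indices and stack afterwards. A sketch: the names mirror the snippet above, and the signature is elided.

```haskell
-- Sketch: getFolderImage stays in IO; no unsafePerformIO needed.
loadFolderImages path names idxs =
  traverse (getFolderImage @'[1, 3, w, h] path names) idxs
  -- :: IO [Tensor device dtype '[1, 3, w, h]]
  -- the resulting list can then be stacked in IO by the caller
```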

SomeShape2d

someShape2d :: [Int] -> SomeShape2d
someShape2d [] = error "Invalid"
Member:

you should probably return a Maybe SomeShape2d
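That is, a total version along these lines (a sketch; the non-empty case keeps whatever construction the current code does, abbreviated here as a hypothetical mkSomeShape2d):

```haskell
-- Sketch: return Nothing instead of calling error on an empty list.
someShape2d :: [Int] -> Maybe SomeShape2d
someShape2d []   = Nothing
someShape2d dims = Just (mkSomeShape2d dims)
  -- mkSomeShape2d stands for the existing construction logic
```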

whc = w * h * c * dtype_size
F.withForeignPtr fptr $ \ptr2 -> do
BSI.memcpy (F.castPtr ptr1) (F.castPtr ptr2) whc
print t
Member:

please remove

for(int i=0;i<$(int whc);i++){
dst[i] = src[i];
}
} |]
Member:

@junjihashimoto can you review this with respect to memory safety? I'm not sure this is garbage collected.

Member:

although there are no allocations here, so probably it's fine

Member:

In this case, @dtype' should be D.Int64, because dst points to an array of int64_t.

This is the same as the one in Torch.Vision.
If you can merge the createTensorU32to64 of both, please do:
https://github.com/hasktorch/hasktorch/blob/master/hasktorch/src/Torch/Vision.hs#L563-L578

@tscholak (Member) left a comment:

I had a closer look 👀! Overall it looks good and I'm very excited to merge this in, but I've got a few suggestions for improvements.

Right tensor -> return tensor


getFolderImages ::
Member:

I don't know what PyTorch does here, but I find it odd that you are reading in a whole folder of images as a single Tensor. Won't this be huge in practice?


-- ImageFolder dataset

newtype ImageFolder (m :: Type -> Type) (device :: (D.DeviceType, Nat) ) (finalShape :: [Nat]) = ImageFolder { folderData :: FolderData }
Member:

Is the assumption that all images in an image folder are of the same kind/format?
If so, then I think your code can be drastically simplified by adding a type variable, say, imageFormat :: ImageFormat, to the ImageFolder data type. That would make it possible to define individual type class instances for each expected ImageFormat kind. You wouldn't need to use existentials.
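A sketch of that design. ImageFormat and its constructors are hypothetical names; the kind signatures follow the style of the newtype above.

```haskell
-- A promoted kind enumerating the supported image formats.
data ImageFormat = RGB8 | Grayscale8

newtype ImageFolder
  (m :: Type -> Type)
  (device :: (D.DeviceType, Nat))
  (format :: ImageFormat)
  (finalShape :: [Nat])
  = ImageFolder { folderData :: FolderData }

-- Instances can then be written per format, so no existentials
-- are needed to pick the decoding path, e.g.:
-- instance ... => Dataset m (ImageFolder m device 'RGB8 shape) ...
```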

@Reep236 (Contributor, author) commented Aug 3, 2021

Thanks so much for reviewing all of this! I'll get right to fixing it. For a lot of these I already had an intuition, but I brushed it off to stick to existing skeletons. I guess I was a bit timid, and it came back to bite me. At least now I can check these off the list and be a bit more refined in the next PR.

@Reep236 (Contributor, author) commented Aug 11, 2021

Alright, I have something better somewhat working, combining the advice given to me by @tscholak and @junjihashimoto. However, it's currently a bit of a mess and incomplete; generalizing over image and pixel formats led to overhauling more than expected. I ran into some hiccups along the way and am going to be away for a few days, so I'm moving this to draft for now. I'm certainly still working on it, and I think it will come back with a lot more utility and general improvements to the image-loading system (both beneath and within the dataloader itself).
