Get a single batch from DataLoader without iterating #1917

Closed
narendasan opened this issue Jun 26, 2017 · 23 comments

Comments

@narendasan
Contributor

Is it possible to get a single batch from a DataLoader? Currently, I set up a for loop and return a batch manually.
If there isn't a way to do this with the DataLoader currently, I would be happy to work on adding the functionality.

@colesbury
Member

next(iter(data_loader)) ?
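For anyone landing here later, a minimal self-contained sketch of that one-liner (the toy TensorDataset and its shapes are just placeholders, not from this issue):

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # toy dataset: 100 samples with 10 features each, plus integer labels
    dataset = TensorDataset(torch.randn(100, 10), torch.randint(0, 2, (100,)))
    loader = DataLoader(dataset, batch_size=8, shuffle=True)

    # grab a single batch without writing an explicit for loop
    inputs, labels = next(iter(loader))
    print(inputs.shape, labels.shape)  # torch.Size([8, 10]) torch.Size([8])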

@narendasan
Contributor Author

Cool, that's a lot better than what I had been using.
Thanks!

@hyperfraise
Collaborator

That answer causes a memory leak in my training: RAM usage grows linearly, whereas it stays constant with a regular for loop (and exactly the same code inside the loop) :/

@samarthbhargav

👍 to @hyperfraise. This creates a memory leak.

@srossi93

Same memory-leak problem with the following (slightly different) code:

dataloader_iterator = iter(dataloader)
for i in range(iterations):
    try:
        X, Y = next(dataloader_iterator)
    except StopIteration:
        dataloader_iterator = iter(dataloader)
        X, Y = next(dataloader_iterator)
    do_backprop(X, Y)

Memory usage continuously increases during the for loop. I might open a new issue with more information (if one hasn't been opened yet).

@apaszke
Contributor

apaszke commented Jun 25, 2018

This might not be a memory leak but simply the fact that your loop is extremely busy, spawning processes faster than we can even terminate them. DataLoader iterators are not meant to be very short-lived objects.
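To illustrate the point, a sketch of the two patterns (the loader, train_step, and num_steps here are made-up placeholders): creating the iterator once keeps one set of worker processes alive for the whole loop, whereas calling iter() every step spawns and tears down num_workers processes each time.

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    def train_step(batch):
        pass  # placeholder for the real optimization step

    if __name__ == "__main__":
        loader = DataLoader(TensorDataset(torch.arange(32)), batch_size=4,
                            shuffle=True, num_workers=2)
        num_steps = 20

        # Anti-pattern: every iter(loader) call spins up num_workers fresh
        # worker processes and throws the previous ones away.
        #   for step in range(num_steps):
        #       train_step(next(iter(loader)))

        # Preferred: create the iterator once and keep it alive across steps.
        loader_iter = iter(loader)
        for step in range(num_steps):
            try:
                batch = next(loader_iter)
            except StopIteration:          # one pass finished, start another
                loader_iter = iter(loader)
                batch = next(loader_iter)
            train_step(batch)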

@samarthbhargav

My previous comment was incorrect. I discovered that the leak was elsewhere in the code (for those who are curious: I was holding on to variables without detaching them).

@SystemErrorWang

get "BrokenPipeError: [Errno 32] Broken pipe" when trying next(iter(dataloader))

@shaibagon
Contributor

I used this method to retrieve batches for training in a loop:

    for i in range(n):
        batch = next(iter(data_loader))

I noticed I keep getting the same batch, as if the underlying __getitem__ of the dataset keeps receiving the same item index.
Is this normal?

@srossi93

@shaibagon
It's not documented very well, but when you call iter(dataloader) you create an object of class _DataLoaderIter; in your loop you create that same object n times and always retrieve only the first batch.
A workaround is to create a _DataLoaderIter outside the loop and iterate over it. The problem is that once all batches have been retrieved, _DataLoaderIter raises StopIteration.

To avoid problems, what I'm currently doing is the following:

    dataloader_iterator = iter(dataloader)
    for i in range(iterations):
        try:
            data, target = next(dataloader_iterator)
        except StopIteration:
            dataloader_iterator = iter(dataloader)
            data, target = next(dataloader_iterator)
        do_something()

It's very ugly but it works just fine.

@shaibagon
Contributor

A similar issue can be found here; I hope the solutions proposed in this thread can help people there as well.

@davidtvs

@srossi93 nice solution. Sometimes I get an ignored exception when the iteration cycle ends: ConnectionResetError: [Errno 104] Connection reset by peer.

It appears to be caused by multiprocessing. Setting num_workers=0 on the DataLoader makes the error disappear. Any other solutions?
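One thing worth trying, assuming you are on a PyTorch version recent enough that DataLoader accepts persistent_workers (an aside, not something mentioned above): keep the worker processes alive across passes instead of letting them be torn down when each iterator is exhausted, which may avoid some of the noisy teardown errors.

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    dataset = TensorDataset(torch.randn(64, 3))   # stand-in for the real dataset

    loader = DataLoader(
        dataset,
        batch_size=8,
        shuffle=True,
        num_workers=4,
        persistent_workers=True,   # keep worker processes alive between passes
    )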

@cuixing158

thx @srossi93

@eelxpeng

eelxpeng commented Apr 3, 2019

Maybe this code is a little better?

def inf_train_gen():
    while True:
        for images, targets in dataloader:
            yield images, targets

gen = inf_train_gen()
for it in range(num_iters):
    images, targets = next(gen)

@Yamin05114

Yamin05114 commented Apr 8, 2019

Will the dataset be shuffled if I use the code provided above?

    dataloader_iterator = iter(dataloader)
    for i in range(iterations):
        try:
            X, Y = next(dataloader_iterator)
        except StopIteration:
            dataloader_iterator = iter(dataloader)
            X, Y = next(dataloader_iterator)
        do_backprop(X, Y)

@brando90

Why is this not a generator/iterator already?

@KevLuo

KevLuo commented Nov 6, 2019

@Yamin05114 I ran a small example to see whether the result of iter(dataloader) is shuffled every time it is reset. If you run the little script below, the small set of print statements confirms that the order is indeed shuffled. This is not a proof in general, but it is convincing evidence that the data is shuffled every time we call iter(dataloader).

import torch
from torch.utils.data import Dataset, DataLoader

dataset = torch.tensor([0, 1, 2, 3, 4, 5, 6, 7])
dataloader = DataLoader(dataset, batch_size=2, shuffle=True, num_workers=3)
iterloader = iter(dataloader)

for i in range(0, 12):

    try:
        batch = next(iterloader)
    except StopIteration:
        iterloader = iter(dataloader)
        batch = next(iterloader)

    print("iteration" + str(i))
    print(batch)

In addition, I could not reproduce the error @shaibagon reported: the code below seems to produce distinct batches (using the same variables as defined above), so I am not sure what happened there. (With shuffle=True, each call to iter(dataloader) draws a new random ordering, so repeated next(iter(dataloader)) calls will usually return different batches; with shuffle=False you would get the same first batch every time.)

for i in range(0, 12):
    batch = next(iter(dataloader))
    print("iteration: " + str(i))
    print(batch)

@AlexTS1980

AlexTS1980 commented Jan 14, 2020

If you have a dataset object that inherits from PyTorch's data.Dataset, it must override the __getitem__ method, which takes an index idx as an argument. Therefore you can access items directly:

# some dataset instance called `data`; Dataset stands for your own subclass
data = Dataset(**kwargs)
for i in range(10):
    data[i]

or

for i in range(10):
    data.__getitem__(i)
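To make that concrete, a minimal sketch of such a subclass (ToyDataset and its fields are invented for illustration). Note that indexing the dataset directly returns a single sample, not a collated batch, so this is not quite the same as pulling a batch from a DataLoader:

    import torch
    from torch.utils.data import Dataset

    class ToyDataset(Dataset):
        """A tiny map-style dataset: __getitem__ receives an integer index."""

        def __init__(self, n=10):
            self.features = torch.randn(n, 3)
            self.labels = torch.arange(n)

        def __len__(self):
            return len(self.labels)

        def __getitem__(self, idx):
            return self.features[idx], self.labels[idx]

    data = ToyDataset()
    for i in range(10):
        x, y = data[i]          # equivalent to data.__getitem__(i)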

@brando90

next(iter(data_loader)) ?

Why is this not an iterator already?

@SiddharthSingi

The enumerate function calls iter() as well, so this seemed to work for me:

horse_loader = DataLoader(horse_dataset, batch_size=4, shuffle=True)

# To get a single batch from the DataLoader, use:
horses = next(iter(horse_loader))

# Use this while iterating over the entire dataset for training:
for epoch in range(5):
    for batch_no, horses in enumerate(horse_loader):
        print(f'horse_img shape: {horses[0].shape}')
        # Train the network over here



@etetteh

etetteh commented Apr 8, 2021

@KevLuo
Rewriting your code as follows:

import torch
from torch.utils.data import Dataset, DataLoader

dataset = torch.tensor([0, 1, 2, 3, 4, 5, 6, 7])
dataloader = DataLoader(dataset, batch_size=2, shuffle=True, num_workers=3)
iterloader = iter(dataloader)

for i in range(0, 35):
    batch = next(iterloader)
    print("iteration" + str(i))
    print(batch)

Outputs:

iteration0
tensor([4, 7])
iteration1
tensor([3, 0])
iteration2
tensor([2, 1])
iteration3
tensor([5, 6])
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-1-d1597c639a9f> in <module>
      7 
      8 for i in range(0, 35):
----> 9     batch = next(iterloader)
     10     print("iteration" + str(i))
     11     print(batch)

~/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py in __next__(self)
    515             if self._sampler_iter is None:
    516                 self._reset()
--> 517             data = self._next_data()
    518             self._num_yielded += 1
    519             if self._dataset_kind == _DatasetKind.Iterable and \

~/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py in _next_data(self)
   1170                 if not self._persistent_workers:
   1171                     self._shutdown_workers()
-> 1172                 raise StopIteration
   1173 
   1174             # Now `self._rcvd_idx` is the batch index we want to fetch

StopIteration: 

Intuitively, this is what we expect to happen: the dataset has 8 elements and batch_size=2, so the iterator is exhausted after 4 batches and the fifth next() call raises StopIteration.

@mamuncseru

next(iter(data_loader)) ?

It worked for me. You saved my day!

@JiangM-C

@hyperfraise
How did you solve this issue? I have run into the same problem too.

pytorchmergebot pushed a commit that referenced this issue Sep 8, 2022
Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Codegen changes include:

- codegen improvement:
i. improved view support on pointwise and transpose scheduler
ii. grouped grid welford added for better outer-norm grid persistence in normalization

- misc:
i. new composite ops added: variance_mean , arange, 
ii. fixes misaligned address for transpose scheduler
iii. refactor on separation of compilation API from execution API to prepare us for async compilation
iv. double type support on expression evaluator
v. PYTORCH_NVFUSER_DUMP refactor to save PTX and CUBIN

Commits that's in this PR from the devel branch:
```
89330aa Tensor factories must set the output shape as its input (#1939)
b2fd01e arange support (#1933)
56c00fd Double support on all expression evaluators (#1937)
371f282 Improve trivial reduction merge support (#1931)
1d0c267 Test `rand` in a fusion with zero tensor input (#1932)
0dab160 Fix softmax bwd sizes. (#1890)
ef98f36 Fix a bug (#1936)
63132a0 Propagate permissive mapping information into indexing pass (#1929)
b4ac2c8 Map IterationDomains through view operations. (#1919)
c0a187a do not use deprecated functions (#1935)
88de85e Upstream cherry pick fixes 0811 (#1934)
b247dcf Separate kernel compilation API from kernel execution API (#1914)
b34e3b9 Fix `ir_utils::hasBlockSync` + misc fixes in transpose scheduler (#1924)
14a53e6 Nullary RNGOp (#1892)
3c3c89e Misc fixes/tuning for transpose scheduler (#1912)
20cf109 Grouped grid welford (#1921)
6cf7eb0 Transpose scheduler small dim sizes better support (#1910)
9341ea9 Disabled ViewPersistentShmoo sizes that results in NAN (#1922)
057237f Fix CUDA driver error: misaligned address for transpose scheduler  (#1918)
3fb3d80 Add variance_mean function using Welford (#1907)
98febf6 Remove DisableOption::UnrollWithRng (#1913)
ee8ef33 Minor fix for the debug interface of using PTX directly (#1917)
6e8f953 Add PYTORCH_NVFUSER_DUMP options to save PTX and CUBIN (#1916)
5eefa9a dopt is only available since nvrtc 11.7 (#1915)
2ec8fc7 Kill computeAtBetween (#1911)
d0d106a Improve view support on pointwise and transpose scheduler (#1906)
e71e1ec Fix name clash of RNG with shared memory (#1904)
3381793 Fix mutator and sameAs for expanded IterDomain (#1902)
```

RUN_TORCHBENCH: nvfuser

Differential Revision: [D39324552](https://our.internmc.facebook.com/intern/diff/D39324552)

[ghstack-poisoned]