```python
def run(self) -> None:
    """Launch training."""
    self.runner.call_hook('before_train')
    # In iteration-based training loop, we treat the whole training process
    # as a big epoch and execute the corresponding hook.
    self.runner.call_hook('before_train_epoch')
    if self._iter > 0:
        print_log(
            f'Advance dataloader {self._iter} steps to skip data '
            'that has already been trained',
            logger='current',
            level=logging.WARNING)
        # mock
        old_getitem = self.dataloader_iterator.dataset.__getitem__
        self.dataloader_iterator.dataset.__getitem__ = a_new_getitem_method
        for _ in range(self._iter):
            next(self.dataloader_iterator)
        self.dataloader_iterator.dataset.__getitem__ = old_getitem
```
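A minimal runnable sketch of the `__getitem__`-stubbing idea above (the dataset and the stub are illustrative, not mmengine code). One caveat: the subscript operator `dataset[idx]` looks `__getitem__` up on the type, so the stub has to be patched at the class level rather than on the instance; and with `num_workers > 0` the workers hold their own dataset copies, so a patch applied after the iterator is created would not reach them:

```python
import torch
from torch.utils.data import DataLoader, Dataset

class Squares(Dataset):
    """Toy dataset standing in for an expensive-to-load one."""
    def __getitem__(self, i):
        return i * i
    def __len__(self):
        return 10

loader = DataLoader(Squares(), batch_size=2, num_workers=0)
it = iter(loader)

# Fast-forward 2 batches with a cheap stub __getitem__, then restore it.
old_getitem = Squares.__getitem__
Squares.__getitem__ = lambda self, i: 0  # stub: no real loading
for _ in range(2):
    next(it)
Squares.__getitem__ = old_getitem

batch = next(it)  # first real batch after skipping: tensor([16, 25])
```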
I believe this PR is the cause of the issue: #1471.
While it fixed the resume-iteration problem, it also made resuming slow. A better solution would be to call the `_next_index()` method of the DataLoader's built-in iterator, which skips a batch without actually reading the data.
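A minimal sketch of that idea (the dataset and numbers are illustrative; `_next_index()` is a private method of `torch.utils.data.dataloader._BaseDataLoaderIter`, so it may change between PyTorch versions). With `num_workers=0`, each call consumes one batch of sampler indices without fetching any data; with multiple workers the iterator prefetches indices at creation time, so this alone is not sufficient:

```python
import torch
from torch.utils.data import DataLoader, Dataset

class Squares(Dataset):
    """Toy dataset standing in for an expensive-to-load one."""
    def __getitem__(self, i):
        return i * i
    def __len__(self):
        return 10

loader = DataLoader(Squares(), batch_size=2, num_workers=0)
it = iter(loader)

# Skip the first 2 batches by consuming their indices only
# (private API: advances the sampler, performs no data loading).
for _ in range(2):
    it._next_index()

batch = next(it)  # first batch actually loaded: tensor([16, 25])
```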
What is the feature?
mmengine/mmengine/runner/loops.py
Line 281 in 2c4516c
The current resume logic iterates the dataloader for n steps. When n is large this is very slow, because the actual data loading and processing logic is executed. Is there a good way to advance only the indices without running the real data-loading pipeline?

I tried iterating the batch_sampler directly: it works correctly with num_workers=0, but with multiple workers the restored data order is wrong. I'd like to know whether there is a good solution.
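One workaround that is independent of `num_workers` is to skip at the batch-sampler level *before* the `DataLoader` (and its worker processes) is constructed, so every worker already sees the fast-forwarded sampler. A sketch under stated assumptions: the wrapper class below is hypothetical, not a PyTorch or mmengine API, and for a shuffled sampler the RNG seed must match the original run so the skipped order is reproduced:

```python
import itertools
import torch
from torch.utils.data import BatchSampler, DataLoader, Dataset, SequentialSampler

class Squares(Dataset):
    """Toy dataset standing in for an expensive-to-load one."""
    def __getitem__(self, i):
        return i * i
    def __len__(self):
        return 10

class SkipBatchSampler:
    """Hypothetical wrapper: yield the wrapped sampler's batches,
    skipping the first `skip` of them without loading any data."""
    def __init__(self, batch_sampler, skip):
        self.batch_sampler = batch_sampler
        self.skip = skip
    def __iter__(self):
        return itertools.islice(iter(self.batch_sampler), self.skip, None)
    def __len__(self):
        return max(len(self.batch_sampler) - self.skip, 0)

base = BatchSampler(SequentialSampler(range(10)), batch_size=2, drop_last=False)
loader = DataLoader(Squares(),
                    batch_sampler=SkipBatchSampler(base, skip=2),
                    num_workers=0)  # also works with num_workers > 0
first = next(iter(loader))  # first yielded batch: tensor([16, 25])
```

Because `DataLoader` accepts any iterable of index lists as `batch_sampler`, the workers simply never receive the skipped batches, so there is no ordering mismatch to repair.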
Any other context?
https://discuss.pytorch.org/t/is-there-any-way-to-skip-steps-in-a-dataloader/123201
https://pytorch.org/data/main/dataloader2.html
Snapshot the state of data-preprocessing pipeline (WIP)