Error related to raw_sample_filter in _create_data_loader #271

Open · mmuckley opened this issue Oct 3, 2022 · Discussed in #263 · 0 comments
mmuckley (Contributor) commented Oct 3, 2022

Discussed in #263

Creating an issue for this: it looks like some aspects of sample filtering were broken by recent changes.

Originally posted by mouryarahul on August 24, 2022:
Hi,
I'm trying to run:
python train_unet_demo.py \
--mode test \
--test_split test \
--challenge singlecoil \
--data_path ../../../FastMRI_DATASET/knee_singlecoil_train/ \
--resume_from_checkpoint unet/unet_demo/checkpoints/epoch=1-step=69484.ckpt

where ../../../FastMRI_DATASET/knee_singlecoil_train/ contains all three folders: singlecoil_test, singlecoil_train, and singlecoil_val.

However, I'm getting an error related to raw_sample_filter when using the test dataset. Maybe I'm missing something or doing something silly. Can someone please point out the mistake? Thanks!
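(Editorial aside, not part of the original report: a raw sample filter in this pipeline is a predicate over per-slice metadata that the dataset calls to decide whether to keep a candidate sample. The sketch below illustrates the idea with hypothetical names — RawSample and make_slice_filter are illustrative, not the fastMRI API.)

```python
from typing import Callable, NamedTuple

class RawSample(NamedTuple):
    # Hypothetical stand-in for the per-slice metadata record that the
    # dataset would pass to its raw_sample_filter callable.
    fname: str
    slice_ind: int

def make_slice_filter(max_slice: int) -> Callable[[RawSample], bool]:
    # Keep slices below max_slice and drop the rest; a real filter might
    # inspect acquisition metadata instead.
    def raw_sample_filter(raw_sample: RawSample) -> bool:
        return raw_sample.slice_ind < max_slice
    return raw_sample_filter

# The dataset would call the filter once per candidate sample:
keep = make_slice_filter(max_slice=10)
assert keep(RawSample("file1.h5", 3))
assert not keep(RawSample("file1.h5", 12))
```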

Info about my environment:
PyTorch version: 1.12.0+cu116
Is debug build: False
CUDA used to build PyTorch: 11.6
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04 LTS (x86_64)
GCC version: (Ubuntu 11.2.0-19ubuntu1) 11.2.0
Clang version: Could not collect
CMake version: version 3.22.1
Libc version: glibc-2.35

Python version: 3.10.4 (main, Mar 31 2022, 08:41:55) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.15.0-46-generic-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1070
Nvidia driver version: 515.65.01
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.22.3
[pip3] pytorch-lightning==1.7.2
[pip3] torch==1.12.0+cu116
[pip3] torchaudio==0.12.0+cu116
[pip3] torchmetrics==0.9.2
[pip3] torchvision==0.13.0+cu116
[conda] blas 1.0 mkl
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py310h7f8727e_0
[conda] mkl_fft 1.3.1 py310hd6ae3a3_0
[conda] mkl_random 1.2.2 py310h00e6091_0
[conda] numpy 1.22.3 py310hfa59a62_0
[conda] numpy-base 1.22.3 py310h9585f30_0
[conda] pytorch-lightning 1.7.2 pypi_0 pypi
[conda] torch 1.12.0+cu116 pypi_0 pypi
[conda] torchaudio 0.12.0+cu116 pypi_0 pypi
[conda] torchmetrics 0.9.2 pypi_0 pypi
[conda] torchvision 0.13.0+cu116 pypi_0 pypi

The full error message:

Global seed set to 42
/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/torchmetrics/utilities/prints.py:36: UserWarning: Torchmetrics v0.9 introduced a new argument class property called full_state_update that has not been set for this class (DistributedMetricSum). The property determines if update by default needs access to the full metric state. If this is not the case, significant speedups can be achieved and we recommend setting this to False. We provide an checking function from torchmetrics.utilities import check_forward_no_full_state that can be used to check if the full_state_update=True (old and potential slower behaviour, default for now) or if full_state_update=False can be used safely.
warnings.warn(*args, **kwargs)
/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:446: LightningDeprecationWarning: Setting Trainer(gpus=1) is deprecated in v1.7 and will be removed in v2.0. Please use Trainer(accelerator='gpu', devices=1) instead.
rank_zero_deprecation(
/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py:52: LightningDeprecationWarning: Setting Trainer(resume_from_checkpoint=) is deprecated in v1.5 and will be removed in v1.7. Please pass Trainer.fit(ckpt_path=) directly instead.
rank_zero_deprecation(
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Global seed set to 42
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1

distributed_backend=nccl
All distributed processes registered. Starting with 1 processes

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Traceback (most recent call last):
  File "/media/rahul/DATA/WorkSpace/Multimodal-Data-Processing/Projects/fastMRI/fastmri_examples/unet/train_unet_demo.py", line 191, in <module>
    run_cli()
  File "/media/rahul/DATA/WorkSpace/Multimodal-Data-Processing/Projects/fastMRI/fastmri_examples/unet/train_unet_demo.py", line 187, in run_cli
    cli_main(args)
  File "/media/rahul/DATA/WorkSpace/Multimodal-Data-Processing/Projects/fastMRI/fastmri_examples/unet/train_unet_demo.py", line 75, in cli_main
    trainer.test(model, datamodule=data_module)
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 864, in test
    return self._call_and_handle_interrupt(self._test_impl, model, dataloaders, ckpt_path, verbose, datamodule)
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 648, in _call_and_handle_interrupt
    return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch
    return function(*args, **kwargs)
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 911, in _test_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1168, in _run
    results = self._run_stage()
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1251, in _run_stage
    return self._run_evaluate()
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1291, in _run_evaluate
    self._evaluation_loop._reload_evaluation_dataloaders()
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 234, in _reload_evaluation_dataloaders
    self.trainer.reset_test_dataloader()
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1944, in reset_test_dataloader
    self.num_test_batches, self.test_dataloaders = self._data_connector._reset_eval_dataloader(
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 348, in _reset_eval_dataloader
    dataloaders = self._request_dataloader(mode)
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 436, in _request_dataloader
    dataloader = source.dataloader()
  File "/home/rahul/anaconda3/envs/pytorch/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 513, in dataloader
    return method()
  File "/media/rahul/DATA/WorkSpace/Multimodal-Data-Processing/Projects/fastMRI/fastmri/pl_modules/data_module.py", line 325, in test_dataloader
    return self._create_data_loader(
  File "/media/rahul/DATA/WorkSpace/Multimodal-Data-Processing/Projects/fastMRI/fastmri/pl_modules/data_module.py", line 262, in _create_data_loader
    raw_sample_filter=raw_sample_filter,
UnboundLocalError: local variable 'raw_sample_filter' referenced before assignment
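The last frame is consistent with a classic conditional-assignment bug: raw_sample_filter gets bound inside some branches of _create_data_loader but not on the path the test split takes, so Python raises UnboundLocalError at the point of use. Below is a minimal sketch of that failure pattern and one defensive fix; select_raw_sample_filter and its parameters are hypothetical names for illustration, not the actual fastMRI source.

```python
from typing import Callable, Optional

def select_raw_sample_filter(
    data_partition: str,
    train_filter: Optional[Callable] = None,
    val_filter: Optional[Callable] = None,
    test_filter: Optional[Callable] = None,
) -> Optional[Callable]:
    # Buggy shape: if only the branches below assign raw_sample_filter and
    # data_partition matches none of them (say, a custom test_split name),
    # the later reference raises UnboundLocalError, as in the traceback.
    # Binding a default up front closes that hole:
    raw_sample_filter: Optional[Callable] = None
    if data_partition == "train":
        raw_sample_filter = train_filter
    elif data_partition == "val":
        raw_sample_filter = val_filter
    elif data_partition in ("test", "challenge"):
        raw_sample_filter = test_filter
    return raw_sample_filter
```

Initializing the name before the branches (or adding an explicit else that raises a descriptive error) turns a confusing UnboundLocalError into either working default behavior or an actionable message.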
