[Bug] Error when training model - TypeError: BaseDataset.__init__() got an unexpected keyword argument 'split' #1858

Open
TNodeCode opened this issue Dec 30, 2023 · 5 comments

Comments

@TNodeCode

Branch

main branch (mmpretrain version)

Describe the bug

I have tried to train a model on a custom dataset using the mmpretrain library.

First I cloned the repository, then I created a dataset folder with the following structure:

  • data
    -- custom_dataset
    --- train
    --- test
    --- val

Next I followed the documentation (https://mmpretrain.readthedocs.io/en/latest/user_guides/train.html) on how to train a classification model on a custom dataset.

I created a new configuration file:

configs/mobilenet_v2/mobilenet-v2_finetune.py

_base_ = [
    '../_base_/models/mobilenet_v2_1x.py',
    '../_base_/datasets/imagenet_bs32_pil_resize.py',
    '../_base_/schedules/imagenet_bs256_epochstep.py',
    '../_base_/default_runtime.py'
]


# model settings
model = dict(
    backbone=dict(
        frozen_stages=2,
        init_cfg=dict(
            type='Pretrained',
            checkpoint='https://download.openmmlab.com/mmclassification/v0/mobilenet_v2/mobilenet_v2_batch256_imagenet_20200708-3b2dc3af.pth',
            prefix='backbone',
        )),
    head=dict(num_classes=10),
)

# data settings
data_root = 'data/custom_dataset'
train_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',       # We assume you are using the sub-folder format without ann_file
        data_prefix='train',
    ))
val_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',       # We assume you are using the sub-folder format without ann_file
        data_prefix='val',
    ))
test_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',       # We assume you are using the sub-folder format without ann_file
        data_prefix='test',
    ))

# schedule settings
optim_wrapper = dict(
    optimizer=dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001))
param_scheduler = dict(
    type='MultiStepLR', by_epoch=True, milestones=[15], gamma=0.1)

I then tried to train the model on my custom dataset with the following command:

python ./tools/train.py ./configs/mobilenet_v2/mobilenet-v2_finetune.py

This produced the following error:

C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\cuda\__init__.py:107: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at ..\c10\cuda\CUDAFunctions.cpp:109.)
  return torch._C._cuda_getDeviceCount() > 0
12/30 17:42:16 - mmengine - INFO -
------------------------------------------------------------
System environment:
    sys.platform: win32
    Python: 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
    CUDA available: False
    numpy_random_seed: 1691281147
    MSVC: Microsoft (R) C/C++-Optimierungscompiler Version 19.26.28806 für x64
    GCC: n/a
    PyTorch: 2.0.1+cu117
    PyTorch compiling details: PyTorch built with:
  - C++ Version: 199711
  - MSVC 193431937
  - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
  - OpenMP 2019
  - LAPACK is enabled (usually provided by MKL)
  - CPU capability usage: AVX2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj /FS -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=OFF, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF,

    TorchVision: 0.15.2+cu117
    OpenCV: 4.7.0
    MMEngine: 0.10.2

Runtime environment:
    cudnn_benchmark: False
    mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
    dist_cfg: {'backend': 'nccl'}
    seed: 1691281147
    deterministic: False
    Distributed launcher: none
    Distributed training: False
    GPU number: 1
------------------------------------------------------------

12/30 17:42:16 - mmengine - INFO - Config:
auto_scale_lr = dict(base_batch_size=256)
data_preprocessor = dict(
    mean=[
        123.675,
        116.28,
        103.53,
    ],
    num_classes=1000,
    std=[
        58.395,
        57.12,
        57.375,
    ],
    to_rgb=True)
data_root = 'data/custom_dataset'
dataset_type = 'ImageNet'
default_hooks = dict(
    checkpoint=dict(interval=1, type='CheckpointHook'),
    logger=dict(interval=100, type='LoggerHook'),
    param_scheduler=dict(type='ParamSchedulerHook'),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    timer=dict(type='IterTimerHook'),
    visualization=dict(enable=False, type='VisualizationHook'))
default_scope = 'mmpretrain'
env_cfg = dict(
    cudnn_benchmark=False,
    dist_cfg=dict(backend='nccl'),
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0))
launcher = 'none'
load_from = None
log_level = 'INFO'
model = dict(
    backbone=dict(
        frozen_stages=2,
        init_cfg=dict(
            checkpoint=
            'https://download.openmmlab.com/mmclassification/v0/mobilenet_v2/mobilenet_v2_batch256_imagenet_20200708-3b2dc3af.pth',
            prefix='backbone',
            type='Pretrained'),
        type='MobileNetV2',
        widen_factor=1.0),
    head=dict(
        in_channels=1280,
        loss=dict(loss_weight=1.0, type='CrossEntropyLoss'),
        num_classes=10,
        topk=(
            1,
            5,
        ),
        type='LinearClsHead'),
    neck=dict(type='GlobalAveragePooling'),
    type='ImageClassifier')
optim_wrapper = dict(
    optimizer=dict(lr=0.01, momentum=0.9, type='SGD', weight_decay=0.0001))
param_scheduler = dict(
    by_epoch=True,
    gamma=0.1,
    milestones=[
        15,
    ],
    step_size=1,
    type='MultiStepLR')
randomness = dict(deterministic=False, seed=None)
resume = False
test_cfg = dict()
test_dataloader = dict(
    batch_size=32,
    collate_fn=dict(type='default_collate'),
    dataset=dict(
        ann_file='',
        data_prefix='test',
        data_root='data/custom_dataset',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(backend='pillow', edge='short', scale=256, type='ResizeEdge'),
            dict(crop_size=224, type='CenterCrop'),
            dict(type='PackInputs'),
        ],
        split='val',
        type='CustomDataset'),
    num_workers=5,
    persistent_workers=True,
    pin_memory=True,
    sampler=dict(shuffle=False, type='DefaultSampler'))
test_evaluator = dict(
    topk=(
        1,
        5,
    ), type='Accuracy')
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(backend='pillow', edge='short', scale=256, type='ResizeEdge'),
    dict(crop_size=224, type='CenterCrop'),
    dict(type='PackInputs'),
]
train_cfg = dict(by_epoch=True, max_epochs=300, val_interval=1)
train_dataloader = dict(
    batch_size=32,
    collate_fn=dict(type='default_collate'),
    dataset=dict(
        ann_file='',
        data_prefix='train',
        data_root='data/custom_dataset',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(backend='pillow', scale=224, type='RandomResizedCrop'),
            dict(direction='horizontal', prob=0.5, type='RandomFlip'),
            dict(type='PackInputs'),
        ],
        split='train',
        type='CustomDataset'),
    num_workers=5,
    persistent_workers=True,
    pin_memory=True,
    sampler=dict(shuffle=True, type='DefaultSampler'))
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(backend='pillow', scale=224, type='RandomResizedCrop'),
    dict(direction='horizontal', prob=0.5, type='RandomFlip'),
    dict(type='PackInputs'),
]
val_cfg = dict()
val_dataloader = dict(
    batch_size=32,
    collate_fn=dict(type='default_collate'),
    dataset=dict(
        ann_file='',
        data_prefix='val',
        data_root='data/custom_dataset',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(backend='pillow', edge='short', scale=256, type='ResizeEdge'),
            dict(crop_size=224, type='CenterCrop'),
            dict(type='PackInputs'),
        ],
        split='val',
        type='CustomDataset'),
    num_workers=5,
    persistent_workers=True,
    pin_memory=True,
    sampler=dict(shuffle=False, type='DefaultSampler'))
val_evaluator = dict(
    topk=(
        1,
        5,
    ), type='Accuracy')
vis_backends = [
    dict(type='LocalVisBackend'),
]
visualizer = dict(
    type='UniversalVisualizer', vis_backends=[
        dict(type='LocalVisBackend'),
    ])
work_dir = './work_dirs\\mobilenet-v2_finetune'

12/30 17:42:21 - mmengine - INFO - Distributed training is not used, all SyncBatchNorm (SyncBN) layers in the model will be automatically reverted to BatchNormXd layers if they are used.
12/30 17:42:21 - mmengine - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH   ) RuntimeInfoHook
(BELOW_NORMAL) LoggerHook
 --------------------
before_train:
(VERY_HIGH   ) RuntimeInfoHook
(NORMAL      ) IterTimerHook
(VERY_LOW    ) CheckpointHook
 --------------------
before_train_epoch:
(VERY_HIGH   ) RuntimeInfoHook
(NORMAL      ) IterTimerHook
(NORMAL      ) DistSamplerSeedHook
 --------------------
before_train_iter:
(VERY_HIGH   ) RuntimeInfoHook
(NORMAL      ) IterTimerHook
 --------------------
after_train_iter:
(VERY_HIGH   ) RuntimeInfoHook
(NORMAL      ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(LOW         ) ParamSchedulerHook
(VERY_LOW    ) CheckpointHook
 --------------------
after_train_epoch:
(NORMAL      ) IterTimerHook
(LOW         ) ParamSchedulerHook
(VERY_LOW    ) CheckpointHook
 --------------------
before_val:
(VERY_HIGH   ) RuntimeInfoHook
 --------------------
before_val_epoch:
(NORMAL      ) IterTimerHook
 --------------------
before_val_iter:
(NORMAL      ) IterTimerHook
 --------------------
after_val_iter:
(NORMAL      ) IterTimerHook
(NORMAL      ) VisualizationHook
(BELOW_NORMAL) LoggerHook
 --------------------
after_val_epoch:
(VERY_HIGH   ) RuntimeInfoHook
(NORMAL      ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(LOW         ) ParamSchedulerHook
(VERY_LOW    ) CheckpointHook
 --------------------
after_val:
(VERY_HIGH   ) RuntimeInfoHook
 --------------------
after_train:
(VERY_HIGH   ) RuntimeInfoHook
(VERY_LOW    ) CheckpointHook
 --------------------
before_test:
(VERY_HIGH   ) RuntimeInfoHook
 --------------------
before_test_epoch:
(NORMAL      ) IterTimerHook
 --------------------
before_test_iter:
(NORMAL      ) IterTimerHook
 --------------------
after_test_iter:
(NORMAL      ) IterTimerHook
(NORMAL      ) VisualizationHook
(BELOW_NORMAL) LoggerHook
 --------------------
after_test_epoch:
(VERY_HIGH   ) RuntimeInfoHook
(NORMAL      ) IterTimerHook
(BELOW_NORMAL) LoggerHook
 --------------------
after_test:
(VERY_HIGH   ) RuntimeInfoHook
 --------------------
after_run:
(BELOW_NORMAL) LoggerHook
 --------------------
Traceback (most recent call last):
  File "C:\Users\tilof\PycharmProjects\DeepLearningProjects\OpenMMLab\mmpretrain\tools\train.py", line 162, in <module>
    main()
  File "C:\Users\tilof\PycharmProjects\DeepLearningProjects\OpenMMLab\mmpretrain\tools\train.py", line 158, in main
    runner.train()
  File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmengine\runner\runner.py", line 1728, in train
    self._train_loop = self.build_train_loop(
  File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmengine\runner\runner.py", line 1527, in build_train_loop
    loop = EpochBasedTrainLoop(
  File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmengine\runner\loops.py", line 44, in __init__
    super().__init__(runner, dataloader)
  File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmengine\runner\base_loop.py", line 26, in __init__
    self.dataloader = runner.build_dataloader(
  File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmengine\runner\runner.py", line 1370, in build_dataloader
    dataset = DATASETS.build(dataset_cfg)
  File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmengine\registry\registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmengine\registry\build_functions.py", line 121, in build_from_cfg
    obj = obj_cls(**args)  # type: ignore
  File "C:\Users\tilof\AppData\Local\Programs\Python\Python310\lib\site-packages\mmpretrain\datasets\custom.py", line 207, in __init__
    super().__init__(
TypeError: BaseDataset.__init__() got an unexpected keyword argument 'split'

Environment

{'sys.platform': 'win32',
'Python': '3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 '
'64 bit (AMD64)]',
'CUDA available': False,
'numpy_random_seed': 2147483648,
'MSVC': 'Microsoft (R) C/C++-Optimierungscompiler Version 19.26.28806 für x64',
'GCC': 'n/a',
'PyTorch': '2.0.1+cu117',
'TorchVision': '0.15.2+cu117',
'OpenCV': '4.7.0',
'MMEngine': '0.10.2',
'MMCV': '2.1.0',
'MMPreTrain': '1.1.1+e95d9ac'}

Other information

No response

@leon-costa

leon-costa commented Jan 1, 2024

I had the same problem following the guide How to Pretrain with Custom Dataset.

The problem is that the base dataset config you are overriding sets a split argument (_base_/datasets/imagenet_bs32_pil_resize.py#L32). That key is inherited into the merged config, and CustomDataset does not accept it.

The solution I found was to copy all the arguments and add an extra _delete_=True (see "Ignore some fields in the base configs" in the config guide). Something like this, repeated for the other dataloaders:

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='RandomResizedCrop', scale=224, backend='pillow'),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    dict(type='PackInputs'),
]

train_dataloader = dict(
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',       # We assume you are using the sub-folder format without ann_file
        data_prefix='train',
        pipeline=train_pipeline,
        _delete_=True,
    ))
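
To illustrate why the plain override keeps split and why _delete_=True drops it, here is a minimal pure-Python sketch of dict-style config inheritance. It is a simplified imitation written for this thread, not mmengine's actual merge code: the merge function, the DELETE_KEY constant, and the abbreviated base dict are all hypothetical stand-ins.

```python
from copy import deepcopy

DELETE_KEY = '_delete_'  # hypothetical constant; mirrors the `_delete_` key used in configs


def merge(base: dict, override: dict) -> dict:
    """Recursively merge `override` into a copy of `base`.

    Nested dicts are merged key by key. If a dict in `override` carries
    `_delete_=True`, the corresponding base dict is discarded first, so
    nothing is inherited at that level.
    """
    override = deepcopy(override)
    if override.pop(DELETE_KEY, False):
        base = {}
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict):
            base_value = merged.get(key)
            merged[key] = merge(base_value if isinstance(base_value, dict) else {}, value)
        else:
            merged[key] = value
    return merged


# Abbreviated stand-in for the inherited dataset config (only the relevant keys).
base = dict(
    train_dataloader=dict(
        batch_size=32,
        dataset=dict(type='ImageNet', split='train', pipeline=['LoadImageFromFile']),
    ))

# Override as written in the original issue: `split` is inherited silently.
plain_override = dict(
    train_dataloader=dict(
        dataset=dict(type='CustomDataset', data_root='data/custom_dataset',
                     ann_file='', data_prefix='train')))

# Override with `_delete_=True`: the inherited dataset dict is dropped entirely,
# which is also why the pipeline has to be repeated in the override.
delete_override = deepcopy(plain_override)
delete_override['train_dataloader']['dataset'].update(
    pipeline=['LoadImageFromFile'], _delete_=True)

plain = merge(base, plain_override)
fixed = merge(base, delete_override)

print(sorted(plain['train_dataloader']['dataset']))
print(sorted(fixed['train_dataloader']['dataset']))
```

In this sketch, _delete_ sits on the dataset dict only, so sibling keys such as batch_size are still inherited from the base.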

@Huy-Thai

Huy-Thai commented Jan 8, 2024

Hi @leon-costa,

I tried this, but it is not working.
Is there any other way to fix the problem above?

@smoothumut

Hi everyone, any update?
I am also having the exact same problem with CustomDataset.

@smoothumut

I got it working.

@leon-costa's solution and the link he gave
https://mmpretrain.readthedocs.io/en/latest/user_guides/config.html#ignore-some-fields-in-the-base-configs
helped me understand the problem better.

In my case, I removed

'../_base_/datasets/imagenet_bs32_pil_resize.py'

from my config's _base_ list, then added the required dataset dict settings (without split, of course) directly to my config. Then it worked.
Thanks, all, for the guidance.
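
For anyone trying the same route, here is a rough sketch of what the self-contained data section could look like once the dataset entry is gone from the _base_ list. The values are copied from the config dump in the original post; the one deliberate change, num_classes=10 in data_preprocessor, is my assumption to match head.num_classes, so adjust it to your setup.

```python
# Sketch of the data section for configs/mobilenet_v2/mobilenet-v2_finetune.py
# after removing '../_base_/datasets/imagenet_bs32_pil_resize.py' from _base_.
# Nothing in this section sets `split`.
data_root = 'data/custom_dataset'

data_preprocessor = dict(
    # Assumption: 10 to match head.num_classes (the dump shows the inherited 1000).
    num_classes=10,
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
    to_rgb=True,
)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='RandomResizedCrop', scale=224, backend='pillow'),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    dict(type='PackInputs'),
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='ResizeEdge', scale=256, edge='short', backend='pillow'),
    dict(type='CenterCrop', crop_size=224),
    dict(type='PackInputs'),
]

train_dataloader = dict(
    batch_size=32,
    num_workers=5,
    persistent_workers=True,
    pin_memory=True,
    collate_fn=dict(type='default_collate'),
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',  # sub-folder format without an annotation file
        data_prefix='train',
        pipeline=train_pipeline,
    ),
    sampler=dict(type='DefaultSampler', shuffle=True),
)

val_dataloader = dict(
    batch_size=32,
    num_workers=5,
    persistent_workers=True,
    pin_memory=True,
    collate_fn=dict(type='default_collate'),
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',
        data_prefix='val',
        pipeline=test_pipeline,
    ),
    sampler=dict(type='DefaultSampler', shuffle=False),
)

test_dataloader = dict(
    batch_size=32,
    num_workers=5,
    persistent_workers=True,
    pin_memory=True,
    collate_fn=dict(type='default_collate'),
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        ann_file='',
        data_prefix='test',
        pipeline=test_pipeline,
    ),
    sampler=dict(type='DefaultSampler', shuffle=False),
)

val_evaluator = dict(type='Accuracy', topk=(1, 5))
test_evaluator = val_evaluator
```

The model, schedule, and runtime entries stay in _base_ as before; only the dataset entry is replaced by the section above.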

@gjustin40

@TNodeCode
Just remove the split argument from each dataloader config.

train_dataloader = dict(
    batch_size=32,
    collate_fn=dict(type='default_collate'),
    dataset=dict(
        ann_file='',
        data_prefix='train',
        data_root='data/custom_dataset',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(backend='pillow', scale=224, type='RandomResizedCrop'),
            dict(direction='horizontal', prob=0.5, type='RandomFlip'),
            dict(type='PackInputs'),
        ],
        split='train',  # <-- remove this line (same for val_dataloader and test_dataloader)
        type='CustomDataset'),
    num_workers=5,
    persistent_workers=True,
    pin_memory=True,
    sampler=dict(shuffle=True, type='DefaultSampler'))

The split option is only used by datasets that implement the split feature. If your custom dataset has not been set up to support it, the option can be removed.

A prominent dataset that uses this feature is ImageNet.
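
As a rough illustration of where the TypeError in the traceback comes from, here is a tiny stand-alone reproduction. The two classes, the REGISTRY dict, and the build function are hypothetical stand-ins written for this thread, not the real mmengine or mmpretrain code. They only assume what the traceback shows: the dataset is built with cls(**args), and CustomDataset passes leftover keyword arguments on to BaseDataset.

```python
class BaseDataset:
    """Stand-in for the parent class: its __init__ has no `split` parameter."""

    def __init__(self, ann_file='', data_root='', data_prefix='', pipeline=()):
        self.ann_file = ann_file
        self.data_root = data_root
        self.data_prefix = data_prefix
        self.pipeline = list(pipeline)


class CustomDataset(BaseDataset):
    """Stand-in that forwards any extra keyword arguments to its parent."""

    def __init__(self, data_root='', data_prefix='', ann_file='', **kwargs):
        super().__init__(ann_file=ann_file, data_root=data_root,
                         data_prefix=data_prefix, **kwargs)


# Hypothetical registry: maps the `type` string to a class.
REGISTRY = {'CustomDataset': CustomDataset}


def build(cfg: dict):
    """Build an object from a config dict by calling cls(**args)."""
    args = dict(cfg)
    cls = REGISTRY[args.pop('type')]
    return cls(**args)


# Merged dataset config as shown in the dump: `split` was inherited from the base.
merged_cfg = dict(type='CustomDataset', data_root='data/custom_dataset',
                  ann_file='', data_prefix='train', split='train')

try:
    build(merged_cfg)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Dropping the key lets the same config build without error.
cleaned_cfg = {key: value for key, value in merged_cfg.items() if key != 'split'}
dataset = build(cleaned_cfg)
print(dataset.data_prefix)
```

The point of the sketch is that the extra key is not ignored: it travels through **kwargs until it reaches a constructor that does not declare it.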
