Failed to import dataset 'imagenet' #2524

Open
leo-smi opened this issue Sep 27, 2023 · 9 comments

@leo-smi
leo-smi commented Sep 27, 2023

I'm running the tutorial:

otx build MobileNetV2-ATSS --train-data-roots data/wgisd

But I get the error:

 raise DatasetImportError(f"Failed to import dataset '{format}' at '{path}'.") from e
datumaro.components.errors.DatasetImportError: 
Failed to import dataset 'imagenet' at 
'E:\Meu Laptop\codigos python\openvino_projects\training_extensions\training_extensions\data\wgisd'.

My dataset folder looks like this:

image

@sungmanc
Contributor

Could you follow the exact dataset format shown below? As it stands, OTX cannot properly detect the dataset format.
[image: expected dataset format]

@sungmanc sungmanc self-assigned this Sep 27, 2023
@leo-smi
Author

leo-smi commented Sep 29, 2023

Ok, now my path tree for training data is:

wgsid/
├── annotations/
│   ├── instances_train.json
│   └── instances_val.json
├── images/
│   ├── CDY_2015.jpg
│   ├── CDY_2016.jpg
│   ├── CDY_2017.jpg
│   ├── CDY_2018.jpg
│   ├── CDY_20180427_152724818_BURST000_COVER_TOP.jpg
│   ├── CDY_20180427_152823935_BURST000_COVER_TOP.jpg
│   ├── CDY_20180427_152937457_BURST000_COVER_TOP.jpg
│   ├── CDY_20180427_153021423_BURST001.jpg
│   ├── CDY_20180427_153126820_BURST000_COVER_TOP.jpg
│   ├── CDY_20180427_153126820_BURST001.jpg
│   ├── CDY_20180427_153144437_BURST000_COVER_TOP.jpg
│   ├── CDY_20180427_153152483_BURST001.jpg
│   ├── CDY_20180427_153201477_BURST000_COVER_TOP.jpg
│   ├── ...
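
A layout like the tree above can be checked programmatically before running otx build. The following is a minimal sketch, not OTX's or Datumaro's own detection logic: check_coco_layout is a hypothetical helper, and the names it looks for (annotations/, instances_*.json, images/) are taken only from the tree in this comment.

```python
from pathlib import Path


def check_coco_layout(root):
    """List layout problems for a COCO-style dataset root (empty list means none found)."""
    root = Path(root)
    problems = []
    ann_dir = root / "annotations"
    img_dir = root / "images"

    # Assumption: annotation files follow the instances_<split>.json naming seen above.
    if not ann_dir.is_dir():
        problems.append("missing annotations/ directory")
    elif not list(ann_dir.glob("instances_*.json")):
        problems.append("no instances_*.json file in annotations/")

    # Images may sit directly in images/ or in a subfolder, so search recursively.
    if not img_dir.is_dir():
        problems.append("missing images/ directory")
    elif not any(p.is_file() for p in img_dir.rglob("*")):
        problems.append("images/ contains no files")

    return problems
```

Calling check_coco_layout("wgisd") from the repository root returns an empty list when nothing obvious is missing. An empty list does not guarantee that OTX will accept the dataset.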

After running otx build MobileNetV2-ATSS --train-data-roots wgisd I got:

(.otx) E:\Meu Laptop\codigos python\openvino_projects\training_extensions>otx build MobileNetV2-ATSS --train-data-roots wgisd
[*] Workspace Path: otx-workspace-DETECTION
[*] Load Model Template ID: Custom_Object_Detection_Gen3_ATSS
[*] Load Model Name: MobileNetV2-ATSS
E:\Ambientes_Python\OpenVino_Envs\.otx\lib\site-packages\mmcv\__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  warnings.warn(
2023-09-28 23:26:38,133 | WARNING : Duplicate key is detected among bases [{'model'}]
2023-09-28 23:26:38,133 | WARNING : Duplicate key is detected among bases [{'model', 'load_from', 'task', 'resume_from', 'checkpoint_config'}]
In the CLI, Update ignore to false in model configuration.
[*]     - Updated: otx-workspace-DETECTION\model.py
[*]     - Updated: otx-workspace-DETECTION\data_pipeline.py
[*]     - Updated: otx-workspace-DETECTION\tile_pipeline.py
[*]     - Updated: otx-workspace-DETECTION\deployment.py
[*]     - Updated: otx-workspace-DETECTION\hpo_config.yaml
[*]     - Updated: otx-workspace-DETECTION\compression_config.json
[*] Update data configuration file to: otx-workspace-DETECTION\data.yaml

My data.yaml is:

data:
  train:
    ann-files: null
    data-roots: E:\Meu Laptop\codigos python\openvino_projects\training_extensions\wgisd
  val:
    ann-files: null
    data-roots: E:\Meu Laptop\codigos python\openvino_projects\training_extensions\otx-workspace-DETECTION\splitted_dataset\val
  test:
    ann-files: null
    data-roots: null
  unlabeled:
    file-list: null
    data-roots: null

But after running otx train --output ../outputs --workspace ../outputs/logs --gpus 1 I got:

(.otx) E:\Meu Laptop\codigos python\openvino_projects\training_extensions\otx-workspace-DETECTION>otx train  --output ../outputs --workspace ../outputs/logs --gpus 1
Traceback (most recent call last):
  File "C:\Users\leand\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\leand\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "E:\Ambientes_Python\OpenVino_Envs\.otx\Scripts\otx.exe\__main__.py", line 7, in <module>
  File "E:\Meu Laptop\codigos python\openvino_projects\training_extensions\src\otx\cli\tools\cli.py", line 77, in main
    results = globals()[f"otx_{name}"]()
  File "E:\Meu Laptop\codigos python\openvino_projects\training_extensions\src\otx\cli\tools\train.py", line 192, in main
    return train(exit_stack)
  File "E:\Meu Laptop\codigos python\openvino_projects\training_extensions\src\otx\cli\tools\train.py", line 203, in train
    config_manager.configure_template()
  File "E:\Meu Laptop\codigos python\openvino_projects\training_extensions\src\otx\cli\manager\config_manager.py", line 173, in configure_template
    self.train_type = self._get_train_type()
  File "E:\Meu Laptop\codigos python\openvino_projects\training_extensions\src\otx\cli\manager\config_manager.py", line 238, in _get_train_type
    self._configure_train_type()
  File "E:\Meu Laptop\codigos python\openvino_projects\training_extensions\src\otx\cli\manager\config_manager.py", line 318, in _configure_train_type
    raise ValueError(
ValueError: train-data-roots isn't a directory, it doesn't exist or it is empty. Please, check command line and directory path.

If the path is the problem, is it because of the spaces in it? If I change the path, I lose all the environment configuration and have to start again. Is there a way to fix this in the train file so that it handles whitespace in the training path? I think the path should be:

E:\Meu_Laptop\codigos_python\openvino_projects\training_extensions
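
The ValueError above folds three different conditions into one message (not a directory, does not exist, or empty). A small sketch like the one below can show which of them applies to a given path. diagnose_data_root is a hypothetical helper written for this thread; it only mirrors the wording of the error message and is not the check that config_manager.py performs.

```python
from pathlib import Path


def diagnose_data_root(path_str):
    """Return a short reason why a data root looks unusable, or None if it looks fine."""
    path = Path(path_str)
    if not path.exists():
        return f"does not exist: {path}"
    if not path.is_dir():
        return f"exists but is not a directory: {path}"
    # any() stops at the first entry, so large image folders are cheap to check.
    if not any(path.iterdir()):
        return f"directory is empty: {path}"
    return None


if __name__ == "__main__":
    # Paths copied from the data.yaml above; adjust them to your own machine.
    roots = {
        "train": r"E:\Meu Laptop\codigos python\openvino_projects\training_extensions\wgisd",
        "val": r"E:\Meu Laptop\codigos python\openvino_projects\training_extensions\otx-workspace-DETECTION\splitted_dataset\val",
    }
    for split, root in roots.items():
        print(split, "->", diagnose_data_root(root) or "looks usable")
```

If every path is reported as usable here, the spaces in the path are unlikely to be what Python itself trips over, and the cause is more likely in how the value reaches the check.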

@leo-smi
Author

leo-smi commented Sep 29, 2023

OK, I changed the path and installed the environments again. Now my data.yaml is:

data:
  train:
    ann-files: null
    data-roots: E:\sandbox\training_extensions\otx-workspace-DETECTION\splitted_dataset\train
  val:
    ann-files: null
    data-roots: E:\sandbox\training_extensions\otx-workspace-DETECTION\splitted_dataset\val
  test:
    ann-files: null
    data-roots: null
  unlabeled:
    file-list: null
    data-roots: null

Running the commands...:

(otx) ...$ cd otx-workspace-DETECTION/
(otx) ...$ otx train  --output ../outputs --workspace ../outputs/logs --gpus 1

...I got the error:

(.otx) E:\sandbox\training_extensions\otx-workspace-DETECTION>otx train  --output ../outputs --workspace ../outputs/logs --gpus 1
Traceback (most recent call last):
  File "C:\Users\leand\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\leand\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "E:\Ambientes_Python\OpenVino_Envs\.otx\Scripts\otx.exe\__main__.py", line 7, in <module>
  File "E:\sandbox\training_extensions\src\otx\cli\tools\cli.py", line 77, in main
    results = globals()[f"otx_{name}"]()
  File "E:\sandbox\training_extensions\src\otx\cli\tools\train.py", line 192, in main
    return train(exit_stack)
  File "E:\sandbox\training_extensions\src\otx\cli\tools\train.py", line 203, in train
    config_manager.configure_template()
  File "E:\sandbox\training_extensions\src\otx\cli\manager\config_manager.py", line 173, in configure_template
    self.train_type = self._get_train_type()
  File "E:\sandbox\training_extensions\src\otx\cli\manager\config_manager.py", line 238, in _get_train_type
    self._configure_train_type()
  File "E:\sandbox\training_extensions\src\otx\cli\manager\config_manager.py", line 318, in _configure_train_type
    raise ValueError(
ValueError: train-data-roots isn't a directory, it doesn't exist or it is empty. Please, check command line and directory path.

My E:\sandbox\training_extensions\otx-workspace-DETECTION folder tree is:

E:\sandbox\training_extensions\otx-workspace-DETECTION/
├── compression_config.json
├── configuration.yaml
├── data.yaml
├── data_pipeline.py
├── deployment.py
├── hpo_config.yaml
├── model.py
├── path_tree.py
├── splitted_dataset/
│   ├── train/
│   │   ├── annotations/
│   │   │   ├── image_info_train.json
│   │   │   ├── instances_train.json
│   │   │   └── stuff_train.json
│   │   └── images/
│   │       └── train/
│   │           ├── CDY_2016.jpg
│   │           ├── CDY_2017.jpg
│   │           ├── CDY_2018.jpg
│   │           ├── ...
│   │           ├── SYH_2017-04-27_1340.jpg
│   │           ├── SYH_2017-04-27_1342.jpg
│   │           └── SYH_2017-04-27_1344.jpg
│   └── val/
│       ├── annotations/
│       │   ├── image_info_val.json
│       │   ├── instances_val.json
│       │   └── stuff_val.json
│       └── images/
│           └── val/
│               ├── CDY_2015.jpg
│               ├── CDY_20180427_152724818_BURST000_COVER_TOP.jpg
│               ├── CDY_20180427_153615626_BURST000_COVER_TOP.jpg
│               ├── SVB_20180427_152010406_HDR.jpg
│               ├── ...
│               ├── SYH_2017-04-27_1320.jpg
│               ├── SYH_2017-04-27_1322.jpg
│               └── SYH_2017-04-27_1333.jpg
├── template.yaml
└── tile_pipeline.py

Partial conclusions:

  • The whitespace was not the problem
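
With the path ruled out, another thing that can be checked is whether the annotation file and the image folder agree with each other. The sketch below assumes the standard COCO structure, where the JSON has an "images" list whose entries carry a "file_name" key. missing_images is a hypothetical helper, and whether OTX treats such a mismatch as an "empty" data root is not confirmed anywhere in this thread.

```python
import json
from pathlib import Path


def missing_images(ann_file, image_dir):
    """Return file_name entries from a COCO annotation file that are not found in image_dir."""
    with open(ann_file, encoding="utf-8") as f:
        coco = json.load(f)
    image_dir = Path(image_dir)
    # Each entry of coco["images"] is expected to carry a "file_name" key.
    return [
        img["file_name"]
        for img in coco.get("images", [])
        if not (image_dir / img["file_name"]).is_file()
    ]
```

For the tree above, a call would look like missing_images("splitted_dataset/train/annotations/instances_train.json", "splitted_dataset/train/images/train"), run from inside the workspace folder.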

@sungmanc
Contributor

sungmanc commented Oct 4, 2023

Could you please use the CLI command below? It seems that the auto-split dataset didn't work.

otx build MobileNetV2-ATSS --train-data-roots wgisd --val-data-roots wgisd

@leo-smi
Author

leo-smi commented Oct 4, 2023

Running otx build MobileNetV2-ATSS --train-data-roots wgisd --val-data-roots wgisd:

(.otx) E:\sandbox\training_extensions>otx build MobileNetV2-ATSS --train-data-roots wgisd --val-data-roots wgisd
[*] Workspace Path: otx-workspace-DETECTION
[*] Load Model Template ID: Custom_Object_Detection_Gen3_ATSS
[*] Load Model Name: MobileNetV2-ATSS
E:\Ambientes_Python\OpenVino_Envs\.otx\lib\site-packages\mmcv\__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  warnings.warn(
2023-10-04 10:32:57,277 | WARNING : Duplicate key is detected among bases [{'model'}]
2023-10-04 10:32:57,277 | WARNING : Duplicate key is detected among bases [{'resume_from', 'load_from', 'model', 'task', 'checkpoint_config'}]
In the CLI, Update ignore to false in model configuration.
[*]     - Updated: otx-workspace-DETECTION\model.py
[*]     - Updated: otx-workspace-DETECTION\data_pipeline.py
[*]     - Updated: otx-workspace-DETECTION\tile_pipeline.py
[*]     - Updated: otx-workspace-DETECTION\deployment.py
[*]     - Updated: otx-workspace-DETECTION\hpo_config.yaml
[*]     - Updated: otx-workspace-DETECTION\compression_config.json
[*] Update data configuration file to: otx-workspace-DETECTION\data.yaml

The otx-workspace-DETECTION path tree. Notice that the folder splitted_dataset was not created because of the flag --val-data-roots wgisd:

otx-workspace-DETECTION/
├── compression_config.json
├── configuration.yaml
├── data.yaml
├── data_pipeline.py
├── deployment.py
├── hpo_config.yaml
├── model.py
├── outputs/
│   └── logs/
├── path_tree.py
├── template.yaml
└── tile_pipeline.py

Training with otx train --output ./outputs --workspace ./outputs/logs --gpus 1:

(.otx) E:\sandbox\training_extensions\otx-workspace-DETECTION>otx train  --output ./outputs --workspace ./outputs/logs --gpus 1
Traceback (most recent call last):
  File "C:\Users\leand\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\leand\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "E:\Ambientes_Python\OpenVino_Envs\.otx\Scripts\otx.exe\__main__.py", line 7, in <module>
  File "E:\sandbox\training_extensions\src\otx\cli\tools\cli.py", line 77, in main
    results = globals()[f"otx_{name}"]()
  File "E:\sandbox\training_extensions\src\otx\cli\tools\train.py", line 192, in main
    return train(exit_stack)
  File "E:\sandbox\training_extensions\src\otx\cli\tools\train.py", line 203, in train
    config_manager.configure_template()
  File "E:\sandbox\training_extensions\src\otx\cli\manager\config_manager.py", line 173, in configure_template
    self.train_type = self._get_train_type()
  File "E:\sandbox\training_extensions\src\otx\cli\manager\config_manager.py", line 238, in _get_train_type
    self._configure_train_type()
  File "E:\sandbox\training_extensions\src\otx\cli\manager\config_manager.py", line 318, in _configure_train_type
    raise ValueError(
ValueError: train-data-roots isn't a directory, it doesn't exist or it is empty. Please, check command line and directory path.

The data.yaml file:

data:
  train:
    ann-files: null
    data-roots: E:\sandbox\training_extensions\wgisd
  val:
    ann-files: null
    data-roots: E:\sandbox\training_extensions\wgisd
  test:
    ann-files: null
    data-roots: null
  unlabeled:
    file-list: null
    data-roots: null

@sungmanc
Contributor

It seems that your wgisd dataset format is a little bit different. Could you run the OTX CLI with tests/assets/car_tree_bug for your detection task?

otx build MobileNetV2-ATSS --train-data-roots tests/assets/car_tree_bug --val-data-roots tests/assets/car_tree_bug
If it works, then you need to correct the dataset format.

@leo-smi
Author

leo-smi commented Oct 13, 2023

otx build MobileNetV2-ATSS --train-data-roots tests/assets/car_tree_bug --val-data-roots tests/assets/car_tree_bug
(.otx) E:\sandbox\training_extensions>otx build MobileNetV2-ATSS --train-data-roots tests/assets/car_tree_bug --val-data-roots tests/assets/car_tree_bug
[*] Workspace Path: otx-workspace-DETECTION
[*] Load Model Template ID: Custom_Object_Detection_Gen3_ATSS
[*] Load Model Name: MobileNetV2-ATSS
E:\Ambientes_Python\OpenVino_Envs\.otx\lib\site-packages\mmcv\__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  warnings.warn(
2023-10-12 22:36:57,422 | WARNING : Duplicate key is detected among bases [{'model'}]
2023-10-12 22:36:57,422 | WARNING : Duplicate key is detected among bases [{'checkpoint_config', 'load_from', 'resume_from', 'task', 'model'}]
In the CLI, Update ignore to false in model configuration.
[*]     - Updated: otx-workspace-DETECTION\model.py
[*]     - Updated: otx-workspace-DETECTION\data_pipeline.py
[*]     - Updated: otx-workspace-DETECTION\tile_pipeline.py
[*]     - Updated: otx-workspace-DETECTION\deployment.py
[*]     - Updated: otx-workspace-DETECTION\hpo_config.yaml
[*]     - Updated: otx-workspace-DETECTION\compression_config.json
[*] Update data configuration file to: otx-workspace-DETECTION\data.yaml

The data.yaml file:

data:
  train:
    ann-files: null
    data-roots: E:\sandbox\training_extensions\tests\assets\car_tree_bug
  val:
    ann-files: null
    data-roots: E:\sandbox\training_extensions\tests\assets\car_tree_bug
  test:
    ann-files: null
    data-roots: null
  unlabeled:
    file-list: null
    data-roots: null


(otx) ...$ cd otx-workspace-DETECTION/
(otx) ...$ otx train  --output ../outputs --workspace ../outputs/logs --gpus 1
(.otx) E:\sandbox\training_extensions\otx-workspace-DETECTION>otx train  --output ../outputs --workspace ../outputs/logs --gpus 1
Traceback (most recent call last):
  File "C:\Users\leand\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\leand\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "E:\Ambientes_Python\OpenVino_Envs\.otx\Scripts\otx.exe\__main__.py", line 7, in <module>
  File "E:\sandbox\training_extensions\src\otx\cli\tools\cli.py", line 77, in main
    results = globals()[f"otx_{name}"]()
  File "E:\sandbox\training_extensions\src\otx\cli\tools\train.py", line 192, in main
    return train(exit_stack)
  File "E:\sandbox\training_extensions\src\otx\cli\tools\train.py", line 203, in train
    config_manager.configure_template()
  File "E:\sandbox\training_extensions\src\otx\cli\manager\config_manager.py", line 173, in configure_template
    self.train_type = self._get_train_type()
  File "E:\sandbox\training_extensions\src\otx\cli\manager\config_manager.py", line 238, in _get_train_type
    self._configure_train_type()
  File "E:\sandbox\training_extensions\src\otx\cli\manager\config_manager.py", line 318, in _configure_train_type
    raise ValueError(
ValueError: train-data-roots isn't a directory, it doesn't exist or it is empty. Please, check command line and directory path.

@RadaevKirill

I had the same problem. To solve it, simply pass the --train-data-roots and --val-data-roots flags.
For the example described in the documentation, the training command then becomes:
otx train --output ../outputs --workspace ../outputs/logs --train-data-roots /splitted_dataset/train --val-data-roots /splitted_dataset/val

@leo-smi
Author

leo-smi commented Oct 13, 2023

Unfortunately, it didn't work :(

(.otx) E:\sandbox\training_extensions\otx-workspace-DETECTION>otx train --output ../outputs --workspace ../outputs/logs --train-data-roots /splitted_dataset/train --val-data-roots /splitted_dataset/val
Traceback (most recent call last):
  File "C:\Users\leand\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\leand\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "E:\Ambientes_Python\OpenVino_Envs\.otx\Scripts\otx.exe\__main__.py", line 7, in <module>
  File "E:\sandbox\training_extensions\src\otx\cli\tools\cli.py", line 77, in main
    results = globals()[f"otx_{name}"]()
  File "E:\sandbox\training_extensions\src\otx\cli\tools\train.py", line 192, in main
    return train(exit_stack)
  File "E:\sandbox\training_extensions\src\otx\cli\tools\train.py", line 203, in train
    config_manager.configure_template()
  File "E:\sandbox\training_extensions\src\otx\cli\manager\config_manager.py", line 173, in configure_template
    self.train_type = self._get_train_type()
  File "E:\sandbox\training_extensions\src\otx\cli\manager\config_manager.py", line 238, in _get_train_type
    self._configure_train_type()
  File "E:\sandbox\training_extensions\src\otx\cli\manager\config_manager.py", line 318, in _configure_train_type
    raise ValueError(
ValueError: train-data-roots isn't a directory, it doesn't exist or it is empty. Please, check command line and directory path.

Maybe it's because of the ONNX training support for Windows.
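
One more thing worth ruling out for the command above is the leading slash in /splitted_dataset/train. On Windows, a path that starts with a slash is rooted at the current drive, so it does not point inside the workspace folder. The sketch below uses pathlib's PureWindowsPath to show the difference without touching the disk. Whether OTX resolves the argument in exactly this way is an assumption.

```python
from pathlib import PureWindowsPath

# Workspace folder taken from the prompt in the traceback above.
workspace = PureWindowsPath(r"E:\sandbox\training_extensions\otx-workspace-DETECTION")

# A leading slash makes the path rooted: only the drive of the workspace is kept.
rooted = workspace / "/splitted_dataset/train"

# Without the leading slash the path stays inside the workspace.
relative = workspace / "splitted_dataset/train"

print(rooted)    # E:\splitted_dataset\train
print(relative)  # E:\sandbox\training_extensions\otx-workspace-DETECTION\splitted_dataset\train
```

If that is what happens here, passing splitted_dataset/train and splitted_dataset/val without the leading slash would be the variant to try, provided the splitted_dataset folder exists in the workspace.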
