[Bug] Error Encountered with mmengine Dependency Involving JSON and Time Modules #1523

Open
2 tasks done
Duguce opened this issue Apr 1, 2024 · 0 comments
Labels
bug Something isn't working

Duguce commented Apr 1, 2024

Prerequisite

Environment

OrderedDict([('sys.platform', 'linux'), ('Python', '3.10.14 (main, Mar 21 2024, 16:24:04) [GCC 11.2.0]'), ('CUDA available', True), ('MUSA available', False), ('numpy_random_seed', 2147483648), ('GPU 0,1,2,3,4,5,6,7', 'NVIDIA A800-SXM4-40GB'), ('CUDA_HOME', '/usr/local/cuda-11.8'), ('NVCC', 'Cuda compilation tools, release 11.8, V11.8.89'), ('GCC', 'gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)'), ('PyTorch', '2.1.2+cu121'), ('PyTorch compiling details', 'PyTorch built with:\n - GCC 9.3\n - C++ Version: 201703\n - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications\n - Intel(R) MKL-DNN v3.1.1 (Git Hash 64f6bcbcbab628e96f33a62c3e975f8535a7bde4)\n - OpenMP 201511 (a.k.a. OpenMP 4.5)\n - LAPACK is enabled (usually provided by MKL)\n - NNPACK is enabled\n - CPU capability usage: AVX512\n - CUDA Runtime 12.1\n - NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90\n - CuDNN 8.9.2\n - Magma 2.6.1\n - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-invalid-partial-specialization 
-Wno-unused-private-field -Wno-aligned-allocation-unavailable -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.1.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, \n'), ('TorchVision', '0.16.2+cu121'), ('OpenCV', '4.9.0'), ('MMEngine', '0.10.3')])

Reproduces the problem - code sample

import torch
from datasets import load_dataset
from mmengine.dataset import DefaultSampler
from mmengine.hooks import (CheckpointHook, DistSamplerSeedHook, IterTimerHook,
                            LoggerHook, ParamSchedulerHook)
from mmengine.optim import AmpOptimWrapper, CosineAnnealingLR, LinearLR
from peft import LoraConfig
from torch.optim import AdamW
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig)

from xtuner.dataset import process_hf_dataset
from xtuner.dataset.collate_fns import default_collate_fn
from xtuner.dataset.map_fns import alpaca_map_fn, template_map_fn_factory
from xtuner.engine.hooks import (DatasetInfoHook, EvaluateChatHook,
                                 VarlenAttnArgsToMessageHubHook)
from xtuner.engine.runner import TrainLoop
from xtuner.model import SupervisedFinetune
from xtuner.utils import PROMPT_TEMPLATE, SYSTEM_TEMPLATE

import time
from mmengine.visualization.vis_backend import WandbVisBackend
from mmengine.visualization.visualizer import Visualizer

Reproduces the problem - command or script

CUDA_VISIBLE_DEVICES=7 xtuner train /mnt/data61/qingchen/codes/OpenJudge/yqc/ft/qwen1_5_0_5b_chat_qlora/qwen1_5_0_5b_chat_qlora_alpaca_e3_copy.py --deepspeed deepspeed_zero2

Reproduces the problem - error message

[2024-04-01 11:00:04,308] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-04-01 11:00:11,840] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
  File "/mnt/data61/qingchen/envs/xtuner-env/lib/python3.10/site-packages/xtuner/tools/train.py", line 307, in <module>
    main()
  File "/mnt/data61/qingchen/envs/xtuner-env/lib/python3.10/site-packages/xtuner/tools/train.py", line 300, in main
    runner = RUNNERS.build(cfg)
  File "/mnt/data61/qingchen/envs/xtuner-env/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/mnt/data61/qingchen/envs/xtuner-env/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 196, in build_runner_from_cfg
    runner = runner_cls.from_cfg(args)  # type: ignore
  File "/mnt/data61/qingchen/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 422, in from_cfg
    cfg = copy.deepcopy(cfg)
  File "/mnt/data61/qingchen/envs/xtuner-env/lib/python3.10/copy.py", line 153, in deepcopy
    y = copier(memo)
  File "/mnt/data61/qingchen/envs/xtuner-env/lib/python3.10/site-packages/mmengine/config/config.py", line 1527, in __deepcopy__
    super(Config, other).__setattr__(key, copy.deepcopy(value, memo))
  File "/mnt/data61/qingchen/envs/xtuner-env/lib/python3.10/copy.py", line 153, in deepcopy
    y = copier(memo)
  File "/mnt/data61/qingchen/envs/xtuner-env/lib/python3.10/site-packages/mmengine/config/config.py", line 144, in __deepcopy__
    other[copy.deepcopy(key, memo)] = copy.deepcopy(value, memo)
  File "/mnt/data61/qingchen/envs/xtuner-env/lib/python3.10/copy.py", line 161, in deepcopy
    rv = reductor(4)
TypeError: cannot pickle 'module' object
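
The last frames of this traceback can be reproduced with the standard library alone. Below is a minimal sketch, assuming the config ends up holding a reference to an imported module. The plain dict `fake_cfg` and the helper `try_deepcopy` are hypothetical stand-ins for illustration, not mmengine's `Config` or any mmengine API:

```python
import copy
import time

# Hypothetical stand-in for a config namespace that captured `import time`.
fake_cfg = {"max_epochs": 3, "time": time}


def try_deepcopy(obj):
    """Return a deep copy of obj, or the TypeError message if copying fails."""
    try:
        return copy.deepcopy(obj)
    except TypeError as exc:
        return f"TypeError: {exc}"


# Module objects cannot be deep-copied, so the dict holding one fails too.
result = try_deepcopy(fake_cfg)
print(result)

# A dict without module values copies normally.
plain_copy = try_deepcopy({"max_epochs": 3})
print(plain_copy)
```

If the real config behaves like this dict, any top-level `import some_module` that stays in the config namespace would be enough to trigger the same `TypeError` during `copy.deepcopy(cfg)`.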

Additional information

When I use xtuner, training fails inside mmengine while the config is being deep-copied (the copy.deepcopy(cfg) call in the traceback above). The failure seems to be triggered by importing built-in modules such as json and time. For example, when I comment out import time in the code above, the error no longer occurs.
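
One way to narrow this down, independent of mmengine, is to look for module objects among the top-level values of the config. The following is only a sketch, assuming the config can be viewed as a plain dict. `find_module_keys` and the `namespace` dict are hypothetical names for illustration and are not part of mmengine or xtuner:

```python
import copy
import json
import time
import types


def find_module_keys(namespace):
    """Return the sorted keys whose values are module objects."""
    return sorted(
        key for key, value in namespace.items()
        if isinstance(value, types.ModuleType)
    )


# Hypothetical config namespace containing two imported modules.
namespace = {"lr": 2e-4, "time": time, "json": json, "max_epochs": 3}

module_keys = find_module_keys(namespace)
print(module_keys)

# Dropping the module entries leaves a dict that deep-copies without error.
cleaned = {k: v for k, v in namespace.items() if k not in module_keys}
cleaned_copy = copy.deepcopy(cleaned)
print(cleaned_copy)
```

Whether removing such entries is appropriate for a real config depends on how the config uses them, so this is a diagnostic aid rather than a fix.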

@Duguce Duguce added the bug Something isn't working label Apr 1, 2024
@Duguce Duguce changed the title [Bug] [Bug] Error Encountered with mmengine Dependency Involving JSON and Time Modules Apr 1, 2024