GenerallyCovetous changed the title on Mar 20, 2024
from: [Bug] RuntimeError: xxx_impl: implementation for device xla:1 not found.
to: [Bug] NPU environment: RuntimeError: xxx_impl: implementation for device xla:1 not found.
Prerequisite
Environment
OrderedDict([('sys.platform', 'linux'), ('Python', '3.7.10 | packaged by conda-forge | (default, Oct 13 2021, 22:05:51) [GCC 9.4.0]'), ('CUDA available', False), ('MUSA available', False), ('numpy_random_seed', 2147483648), ('GCC', 'gcc (GCC) 7.3.0'), ('PyTorch', '1.11.0a0+gitbc2c6ed'), ('PyTorch compiling details', 'PyTorch built with:\n - GCC 7.5\n - C++ Version: 201402\n - OpenMP 201511 (a.k.a. OpenMP 4.5)\n - LAPACK is enabled (usually provided by MKL)\n - NNPACK is enabled\n - CPU capability usage: NO AVX\n - Build settings: BLAS_INFO=generic, BUILD_TYPE=Release, CXX_COMPILER=/opt/buildtools/gcc-7.5.0/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -DMISSING_ARM_VST1 -DMISSING_ARM_VLD1 -Wno-stringop-overflow, LAPACK_INFO=generic, TORCH_VERSION=1.11.0, USE_CUDA=OFF, USE_CUDNN=OFF, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, \n'), ('TorchVision', '0.12.0'), ('OpenCV', '4.3.0'), ('MMEngine', '0.10.3'), ('MMCV', '2.0.1'), ('MMCV Compiler', 'GCC 7.3'), ('MMCV CUDA Compiler', 'not available')])
Reproduces the problem - code sample
python tools/train.py configs/kie/sdmgr/sdmgr_unet16_60e_wildreceipt.py
Reproduces the problem - command or script
cd mmocr-1.0.1
python tools/train.py configs/kie/sdmgr/sdmgr_unet16_60e_wildreceipt.py
Reproduces the problem - error message
MMOCR built-in model sdmgr:
Traceback (most recent call last):
  File "tools/train-Copy1.py", line 118, in <module>
    main()
  File "tools/train-Copy1.py", line 114, in main
    runner.train()
  File "/home/ma-user/anaconda3/envs/PyTorch-1.11.0/lib/python3.7/site-packages/mmengine/runner/runner.py", line 1777, in train
    model = self.train_loop.run() # type: ignore
  File "/home/ma-user/anaconda3/envs/PyTorch-1.11.0/lib/python3.7/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/home/ma-user/anaconda3/envs/PyTorch-1.11.0/lib/python3.7/site-packages/mmengine/runner/loops.py", line 112, in run_epoch
    self.run_iter(idx, data_batch)
  File "/home/ma-user/anaconda3/envs/PyTorch-1.11.0/lib/python3.7/site-packages/mmengine/runner/loops.py", line 129, in run_iter
    data_batch, optim_wrapper=self.runner.optim_wrapper)
  File "/home/ma-user/anaconda3/envs/PyTorch-1.11.0/lib/python3.7/site-packages/mmengine/model/base_model/base_model.py", line 114, in train_step
    losses = self._run_forward(data, mode='loss') # type: ignore
  File "/home/ma-user/anaconda3/envs/PyTorch-1.11.0/lib/python3.7/site-packages/mmengine/model/base_model/base_model.py", line 361, in _run_forward
    results = self(**data, mode=mode)
  File "/home/ma-user/anaconda3/envs/PyTorch-1.11.0/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ma-user/anaconda3/envs/PyTorch-1.11.0/lib/python3.7/site-packages/mmocr/models/kie/extractors/sdmgr.py", line 120, in forward
    return self.loss(inputs, data_samples, **kwargs)
  File "/home/ma-user/anaconda3/envs/PyTorch-1.11.0/lib/python3.7/site-packages/mmocr/models/kie/extractors/sdmgr.py", line 144, in loss
    [data_sample.gt_instances.bboxes for data_sample in data_samples])
  File "/home/ma-user/anaconda3/envs/PyTorch-1.11.0/lib/python3.7/site-packages/mmocr/models/kie/extractors/sdmgr.py", line 83, in extract_feat
    feats = self.maxpool(self.extractor([x], bbox2roi(gt_bboxes)))
  File "/home/ma-user/anaconda3/envs/PyTorch-1.11.0/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ma-user/anaconda3/envs/PyTorch-1.11.0/lib/python3.7/site-packages/mmdet/models/roi_heads/roi_extractors/single_level_roi_extractor.py", line 95, in forward
    return self.roi_layers[0](feats[0], rois)
  File "/home/ma-user/anaconda3/envs/PyTorch-1.11.0/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ma-user/anaconda3/envs/PyTorch-1.11.0/lib/python3.7/site-packages/mmcv/ops/roi_align.py", line 211, in forward
    self.sampling_ratio, self.pool_mode, self.aligned)
  File "/home/ma-user/anaconda3/envs/PyTorch-1.11.0/lib/python3.7/site-packages/mmcv/ops/roi_align.py", line 101, in forward
    aligned=ctx.aligned)
RuntimeError: roi_align_forward_impl: implementation for device xla:1 not found.
The drrg model fails similarly:
RuntimeError: roi_align_rotated_forward_impl: implementation for device xla:1 not found.
The mask-rcnn model:
RuntimeError: nms_impl: implementation for device xla:1 not found.
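The three failures share one message shape: an op name ending in `_impl`, followed by the device it was dispatched on. As a triage aid, here is a minimal sketch that extracts the op and device from such messages. It assumes the format stays exactly as in the errors above. `parse_missing_impl` is a hypothetical helper, not part of mmcv.

```python
import re

# Assumed message shape, taken from the three errors reported above:
#   "<op>_impl: implementation for device <type>:<index> not found."
_PATTERN = re.compile(
    r"(?P<op>\w+)_impl: implementation for device "
    r"(?P<device>\w+):(?P<index>\d+) not found"
)


def parse_missing_impl(message):
    """Return (op, device_type, device_index), or None if the text does not match."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return match.group("op"), match.group("device"), int(match.group("index"))


# The three error lines from this report.
errors = [
    "RuntimeError: roi_align_forward_impl: implementation for device xla:1 not found.",
    "RuntimeError: roi_align_rotated_forward_impl: implementation for device xla:1 not found.",
    "RuntimeError: nms_impl: implementation for device xla:1 not found.",
]

missing_ops = [parse_missing_impl(error) for error in errors]
for op, device, index in missing_ops:
    print(f"{op}: no implementation for {device}:{index}")
```

Collecting the op names this way gives a short list (here roi_align_forward, roi_align_rotated_forward, nms) that can be compared against the ops the installed mmcv build supports on the target device.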
Additional information
Models such as dpnet, dpnetpp, and master train normally, and import mmcv; import mmcv.ops raises no error, but training the models listed above fails with the errors shown.