[Bug] Training a det model on the dev-1.x branch: the _draw_border_map function raises ValueError: could not broadcast input array from shape into shape #2036

Open
2 tasks done
yilong2001 opened this issue Apr 12, 2024 · 0 comments
Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

1.x branch https://github.com/open-mmlab/mmocr/tree/dev-1.x

Environment

Environment info:
sys.platform: linux
Python: 3.8.19 (default, Mar 20 2024, 19:58:24) [GCC 11.2.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0: NVIDIA A10
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 12.2, V12.2.91
GCC: gcc (Debian 10.2.1-6) 10.2.1 20210110
PyTorch: 1.10.2
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX512
  • CUDA Runtime 11.3
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  • CuDNN 8.2
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.11.3
OpenCV: 4.9.0
MMEngine: 0.10.3
MMOCR: 1.0.1+1d3b1ca

Reproduces the problem - code sample

{
  "metainfo": {
    "category": [
      {
        "id": 0,
        "name": "text"
      }
    ],
    "dataset_type": "TextDetDataset",
    "task_name": "textdet"
  },
  "data_list": [
    {
      "sample_idx": 0,
      "img_path": "xxx.png",
      "height": 842,
      "width": 595,
      "seg_map": "gt-img-xxx.txt",
      "instances": [ (many entries) ]
    }
  ]
}
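
Because the failure is in border-map drawing, it may help to check whether any annotated polygon leaves the image area (842 x 595 in the sample above). The following is a minimal sketch, not part of MMOCR: `find_out_of_bounds` and `check_file` are hypothetical helpers, and the sketch assumes each instance stores a flat `polygon` list `[x1, y1, x2, y2, ...]`, which the elided `instances` field above does not confirm. Augmentations in the training pipeline can also move polygons, so a clean result here does not rule the cause out.

```python
import json


def find_out_of_bounds(ann):
    """Return (sample_idx, instance_index) pairs whose polygon leaves the image.

    Assumes (hypothetically) that every instance has a flat "polygon" list
    of the form [x1, y1, x2, y2, ...].
    """
    bad = []
    for sample in ann["data_list"]:
        width, height = sample["width"], sample["height"]
        for i, inst in enumerate(sample.get("instances", [])):
            coords = inst.get("polygon", [])
            xs, ys = coords[0::2], coords[1::2]
            if not xs or not ys:
                continue  # nothing to check for this instance
            if min(xs) < 0 or max(xs) > width or min(ys) < 0 or max(ys) > height:
                bad.append((sample.get("sample_idx"), i))
    return bad


def check_file(path):
    """Load an annotation file and run the check on it."""
    with open(path, encoding="utf-8") as f:
        return find_out_of_bounds(json.load(f))


# Toy data using the image size from the report; the polygons are made up.
example = {"data_list": [{
    "sample_idx": 0, "width": 595, "height": 842,
    "instances": [
        {"polygon": [10, 10, 50, 10, 50, 30, 10, 30]},
        {"polygon": [590, 800, 600, 800, 600, 850, 590, 850]},
    ],
}]}
print(find_out_of_bounds(example))  # [(0, 1)]
```

On the real data this could be run as `check_file("mmocrdet_anno.json")`, using the annotation file name from the dataset config in this report.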

Reproduces the problem - command or script

python tools/train.py configs/textdet/dbnetpp/mmocr_det_myconfig.py

Reproduces the problem - error message

_draw_border_map

Where the error occurs:
    canvas[y_min_valid:y_max_valid + 1,
           x_min_valid:x_max_valid + 1] = np.fmax(
        1 - distance_map[y_min_valid - y_min:y_max_valid - y_max +
                         height, x_min_valid - x_min:x_max_valid -
                         x_max + width],
        canvas[y_min_valid:y_max_valid + 1,
               x_min_valid:x_max_valid + 1])
Cause of the error:
y_min_valid - y_min is less than 0 and its absolute value is less than distance_map.shape[1]
x_min_valid - x_min is less than 0 and its absolute value is less than distance_map.shape[0]
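
The slice arithmetic can be modelled along a single axis without NumPy. The sketch below is only an illustration under stated assumptions: that `*_valid` is the box edge clipped to `[0, canvas_size - 1]`, and that `height`/`width` equal `max - min + 1`. Neither definition appears in this report, and `axis_lengths` and `overlaps_canvas` are hypothetical helpers. It shows one case where a negative start offset makes the source slice empty while the canvas slice still has one element, which is the kind of mismatch NumPy reports as a broadcast error.

```python
def clip(v, lo, hi):
    """Clamp v into [lo, hi]."""
    return min(max(v, lo), hi)


def slice_len(start, stop, size):
    """Number of elements selected by seq[start:stop] when len(seq) == size."""
    return len(range(*slice(start, stop).indices(size)))


def axis_lengths(v_min, v_max, canvas_size):
    """Return (canvas_len, source_len) for one axis of the assignment.

    Assumption: extent = v_max - v_min + 1 and the *_valid values are the
    box edges clipped to the canvas, mirroring the names in the snippet.
    """
    extent = v_max - v_min + 1
    v_min_valid = clip(v_min, 0, canvas_size - 1)
    v_max_valid = clip(v_max, 0, canvas_size - 1)
    canvas_len = slice_len(v_min_valid, v_max_valid + 1, canvas_size)
    source_len = slice_len(v_min_valid - v_min,
                           v_max_valid - v_max + extent, extent)
    return canvas_len, source_len


def overlaps_canvas(v_min, v_max, canvas_size):
    """True if the interval [v_min, v_max] intersects [0, canvas_size - 1]."""
    return v_max >= 0 and v_min <= canvas_size - 1


print(axis_lengths(10, 20, 100))    # (11, 11)  box inside the canvas
print(axis_lengths(-5, 5, 100))     # (6, 6)    box partly outside
print(axis_lengths(100, 105, 100))  # (1, 0)    box fully outside: mismatch
```

One possible workaround, again only a sketch, is to skip a polygon when `overlaps_canvas` is false for either axis before the assignment is attempted.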


After working around this problem, a new one appears:

  File "tools/train.py", line 114, in <module>
    main()
  File "tools/train.py", line 110, in main
    runner.train()
  File "/home/beeservice/.conda/envs/open-mmlab/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1777, in train
    model = self.train_loop.run() # type: ignore
  File "/home/beeservice/.conda/envs/open-mmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/home/beeservice/.conda/envs/open-mmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 112, in run_epoch
    self.run_iter(idx, data_batch)
  File "/home/beeservice/.conda/envs/open-mmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 128, in run_iter
    outputs = self.runner.model.train_step(
  File "/home/beeservice/.conda/envs/open-mmlab/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 114, in train_step
    losses = self._run_forward(data, mode='loss') # type: ignore
  File "/home/beeservice/.conda/envs/open-mmlab/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 361, in _run_forward
    results = self(**data, mode=mode)
  File "/home/beeservice/.conda/envs/open-mmlab/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/projects/mmocr/mmocr/models/textdet/detectors/base.py", line 72, in forward
    return self.loss(inputs, data_samples)
  File "/opt/projects/mmocr/mmocr/models/textdet/detectors/single_stage_text_detector.py", line 76, in loss
    return self.det_head.loss(inputs, data_samples)
  File "/opt/projects/mmocr/mmocr/models/textdet/heads/db_head.py", line 139, in loss
    losses = self.module_loss(outs, batch_data_samples)
  File "/home/beeservice/.conda/envs/open-mmlab/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/projects/mmocr/mmocr/models/textdet/module_losses/db_module_loss.py", line 79, in forward
    gt_shrinks, gt_shrink_masks, gt_thrs, gt_thr_masks = self.get_targets(
  File "/opt/projects/mmocr/mmocr/models/textdet/module_losses/db_module_loss.py", line 237, in get_targets
    gt_shrinks = torch.cat(gt_shrinks)
RuntimeError: Sizes of tensors must match except in dimension 0. Expected size 722 but got size 720 for tensor number 1 in the list.
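
`torch.cat` requires every dimension other than the concatenation dimension to agree, so the message above means that at least two target maps in the batch differ in size (722 vs 720). A small diagnostic sketch, independent of PyTorch, can show which entries disagree and on which axis. `cat_mismatches` is a hypothetical helper, and the example shapes are invented apart from the 722 and 720 taken from the log.

```python
def cat_mismatches(shapes, dim=0):
    """List (tensor_index, axis, expected, got) for shapes that would make
    concatenation along `dim` fail, using the first shape as reference.

    If the number of dimensions differs, axis is None and the last two
    fields hold the expected and actual number of dimensions.
    """
    ref = tuple(shapes[0])
    problems = []
    for idx, shape in enumerate(shapes[1:], start=1):
        shape = tuple(shape)
        if len(shape) != len(ref):
            problems.append((idx, None, len(ref), len(shape)))
            continue
        for axis, (expected, got) in enumerate(zip(ref, shape)):
            if axis != dim and expected != got:
                problems.append((idx, axis, expected, got))
    return problems


# Hypothetical shapes; only the sizes 722 and 720 come from the error message.
print(cat_mismatches([(1, 722, 512), (1, 720, 512)]))  # [(1, 1, 722, 720)]
```

Since tensor shapes behave like tuples, the same function could be applied to `[t.shape for t in gt_shrinks]` just before the `torch.cat(gt_shrinks)` call shown in the traceback.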

Additional information

data_textdet_train = dict(
    type="OCRDataset",
    data_root=data_root,
    ann_file="mmocrdet_anno.json",
    filter_cfg=dict(filter_empty_gt=True, min_size=32),
    pipeline=None,
)
