[Bug] MMOCRInferencer cropped image 0 size which breaks the recognition #2020

Open
2 tasks done
juvebogdan opened this issue Jan 17, 2024 · 0 comments

Prerequisite

Task

I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmocr

Environment

sys.platform: linux
Python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0: Tesla T4
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 12.2, V12.2.140
GCC: x86_64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
PyTorch: 2.0.0+cu117
PyTorch compiling details: PyTorch built with:

  • GCC 9.3
  • C++ Version: 201703
  • Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.7
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  • CuDNN 8.5
  • Magma 2.6.1
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.15.1+cu117
OpenCV: 4.8.0
MMEngine: 0.10.2
MMOCR: 1.0.1+

Reproduces the problem - code sample

!pip install torch==2.0.0 torchvision==0.15.1
!pip install -U openmim
!mim install "mmengine>=0.7.1,<1.1.0"
!mim install "mmcv>=2.0.0rc4,<2.1.0"
!mim install "mmdet>=3.0.0rc5,<3.2.0"
!mim install mmocr

!git clone https://github.com/open-mmlab/mmocr.git
%cd mmocr
!pip install -v -e .

from mmocr.apis import MMOCRInferencer
infer = MMOCRInferencer(det='psenet', rec='robustscanner')
result = infer('/content/drit_page-0005.jpg', save_vis=True, return_vis=True)
print(result['predictions'])

Reproduces the problem - error message


error Traceback (most recent call last)
in <cell line: 3>()
1 from mmocr.apis import MMOCRInferencer
2 infer = MMOCRInferencer(det='psenet', rec='robustscanner')
----> 3 result = infer('/content/drit_page-0005.jpg', save_vis=True, return_vis=True)
4 print(result['predictions'])

15 frames
/usr/local/lib/python3.10/dist-packages/mmcv/image/geometric.py in imresize(img, size, return_scale, interpolation, out, backend)
114 resized_img = np.array(pil_image)
115 else:
--> 116 resized_img = cv2.resize(
117 img, size, dst=out, interpolation=cv2_interp_codes[interpolation])
118 if not return_scale:

error: OpenCV(4.8.0) /io/opencv/modules/imgproc/src/resize.cpp:4062: error: (-215:Assertion failed) !ssize.empty() in function 'resize'

Additional information


Attached image: drit_page-0005

When I run inference on this image I get the error above. I believe the problem is in detection. When I print the shape of crop_img(img, quad) for each detected quad, I get:
(57, 239, 3)
(57, 198, 3)
(57, 251, 3)
(73, 137, 3)
(49, 75, 3)
(58, 168, 3)
(58, 109, 3)
(81, 267, 3)
(83, 304, 3)
(74, 260, 3)
(66, 221, 3)
(82, 153, 3)
(66, 279, 3)
(58, 139, 3)
(58, 251, 3)
(74, 295, 3)
(58, 150, 3)
(58, 227, 3)
(66, 232, 3)
(66, 144, 3)
(66, 268, 3)
(66, 214, 3)
(66, 239, 3)
(66, 185, 3)
(50, 152, 3)
(41, 58, 3)
(41, 64, 3)
(81, 272, 3)
(66, 173, 3)
(58, 151, 3)
(66, 297, 3)
(82, 259, 3)
(33, 48, 3)
(66, 250, 3)
(58, 222, 3)
(58, 269, 3)
(66, 197, 3)
(66, 221, 3)
(41, 65, 3)
(44, 73, 3)
(66, 273, 3)
(74, 284, 3)
(74, 243, 3)
(66, 262, 3)
(49, 147, 3)
(75, 172, 3)
(66, 344, 3)
(66, 221, 3)
(58, 228, 3)
(58, 168, 3)
(57, 157, 3)
(66, 273, 3)
(57, 209, 3)
(66, 173, 3)
(66, 226, 3)
(49, 158, 3)
(58, 198, 3)
(80, 209, 3)
(82, 372, 3)
(58, 151, 3)
(58, 239, 3)
(58, 216, 3)
(66, 191, 3)
(58, 174, 3)
(33, 60, 3)
(57, 156, 3)
(66, 167, 3)
(66, 209, 3)
(57, 127, 3)
(57, 210, 3)
(66, 197, 3)
(57, 222, 3)
(66, 203, 3)
(80, 249, 3)
(49, 25, 3)
(49, 134, 3)
(57, 216, 3)
(66, 215, 3)
(57, 151, 3)
(66, 273, 3)
(58, 240, 3)
(72, 136, 3)
(50, 75, 3)
(66, 215, 3)
(66, 179, 3)
(58, 115, 3)
(50, 122, 3)
(58, 115, 3)
(58, 151, 3)
(80, 319, 3)
(33, 17, 3)
(95, 213, 3)
(58, 140, 3)
(66, 191, 3)
(58, 204, 3)
(21, 33, 3)
(58, 162, 3)
(58, 145, 3)
(74, 290, 3)
(58, 174, 3)
(66, 126, 3)
(12, 0, 3)
(0, 0, 3)
(65, 132, 3)
The crops with shape (12, 0, 3) and (0, 0, 3) are empty, and they are what break recognition. I tried to filter them out before inference but did not manage to.

What can I do to prevent this?
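To make the request concrete, here is a minimal sketch of the kind of filter I have in mind. It only uses NumPy and is not part of MMOCR's API: the helper names `is_valid_crop` and `drop_empty_crops` and the `min_side` parameter are hypothetical, and where such a filter would be applied (for example, between the detection and recognition steps) is an assumption on my part.

```python
import numpy as np


def is_valid_crop(crop, min_side=1):
    """Return True if the crop has at least `min_side` pixels in height and width.

    Hypothetical helper, not an MMOCR function.
    """
    return crop.ndim >= 2 and crop.shape[0] >= min_side and crop.shape[1] >= min_side


def drop_empty_crops(crops, min_side=1):
    """Split crops into the usable ones and the indices of the rejected ones.

    Hypothetical helper, not an MMOCR function.
    """
    kept, rejected = [], []
    for i, crop in enumerate(crops):
        if is_valid_crop(crop, min_side):
            kept.append(crop)
        else:
            rejected.append(i)
    return kept, rejected


# Shapes taken from the end of the list above, including the two empty ones.
shapes = [(66, 126, 3), (12, 0, 3), (0, 0, 3), (65, 132, 3)]
crops = [np.zeros(s, dtype=np.uint8) for s in shapes]

kept, rejected = drop_empty_crops(crops)
print([c.shape for c in kept])  # [(66, 126, 3), (65, 132, 3)]
print(rejected)                 # [1, 2]
```

Returning the rejected indices as well would let the matching detection polygons be dropped, so that the recognition results stay aligned with the remaining boxes.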
