Error with pyclipper inhomogeneous expanded array #12108

Merged (6 commits) on May 18, 2024

zovelsanj
Contributor

With det_box_type='poly', np.array(offset.Execute(distance)) fails for some images because offset.Execute returns an inhomogeneous list of detection boxes, which cannot be cast to a NumPy array directly. This produces the following error:

  File "C:\Users\ekser\OneDrive\Documents\OCR\cwat-integrator\venv_new\lib\site-packages\paddleocr\paddleocr.py", line 670, in ocr
    dt_boxes, rec_res, _ = self.__call__(img, max_dt_boxes, cls)
  File "C:\Users\ekser\OneDrive\Documents\OCR\cwat-integrator\venv_new\lib\site-packages\paddleocr\tools\infer\predict_system.py", line 76, in __call__
    dt_boxes, elapse = self.text_detector(binarize_img(img))
  File "C:\Users\ekser\OneDrive\Documents\OCR\cwat-integrator\venv_new\lib\site-packages\paddleocr\tools\infer\predict_det.py", line 318, in __call__
    post_result = self.postprocess_op(preds, shape_list)
  File "C:\Users\ekser\OneDrive\Documents\OCR\cwat-integrator\venv_new\lib\site-packages\paddleocr\ppocr\postprocess\db_postprocess.py", line 239, in __call__
    boxes, scores = self.polygons_from_bitmap(pred[batch_index],
  File "C:\Users\ekser\OneDrive\Documents\OCR\cwat-integrator\venv_new\lib\site-packages\paddleocr\ppocr\postprocess\db_postprocess.py", line 84, in polygons_from_bitmap
    box = self.unclip(points, self.unclip_ratio)
  File "C:\Users\ekser\OneDrive\Documents\OCR\cwat-integrator\venv_new\lib\site-packages\paddleocr\ppocr\postprocess\db_postprocess.py", line 159, in unclip
    return np.array(expanded)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

This is because the expanded list from pyclipper's offset.Execute is an inhomogeneous list. For example, in the case of the following image:
image_8

the expanded list (before conversion to a NumPy array) is:
[[[47, 79], [45, 79], [46, 78]], [[56, 78], [58, 78], [59, 79], [55, 79]]]
It contains two paths with 3 and 4 points respectively, so it is inhomogeneous and cannot be cast directly to a NumPy array. Such expanded lists generally come from detections with a very small area, which resemble lines rather than polygons, so it is better to discard them.
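The shape mismatch can be checked without NumPy. The sketch below is only an illustration, not the code merged in this PR: `single_path_or_none` is a hypothetical helper name, and the rule of keeping only single-path results is an assumption based on the suggestion above to discard such detections.

```python
def single_path_or_none(expanded):
    """Return the only path in `expanded`, or None if there is not exactly one.

    `expanded` is assumed to be a list of paths, each path a list of [x, y]
    points, like the value quoted in this description.
    """
    if len(expanded) != 1:
        return None
    return expanded[0]


# The list quoted above: two paths with different numbers of points.
expanded = [
    [[47, 79], [45, 79], [46, 78]],
    [[56, 78], [58, 78], [59, 79], [55, 79]],
]

# A rectangular array needs every path to have the same number of points.
point_counts = [len(path) for path in expanded]
print(point_counts)                    # [3, 4]
print(single_path_or_none(expanded))   # None
```

A caller could skip the detection whenever the helper returns None, and only then convert the remaining single path to an array.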

For some images, `np.array(offset.Execute(distance))` fails because the detection box list is inhomogeneous and cannot be cast into a NumPy array directly.

paddle-bot bot commented May 13, 2024

Thanks for your contribution!

@GreatV
Collaborator

GreatV commented May 15, 2024

This does not work for me:

paddleocr --image_dir doc/imgs/1.jpg --det_box_type poly
[2024/05/15 15:16:12] ppocr DEBUG: Namespace(help='==SUPPRESS==', use_gpu=False, use_xpu=False, use_npu=False, use_mlu=False, ir_optim=True, use_tensorrt=False, min_subgraph_size=15, precision='fp32', gpu_mem=500, gpu_id=0, image_dir='doc/imgs/1.jpg', page_num=0, det_algorithm='DB', det_model_dir='/Users/wangxin/.paddleocr/whl/det/ch/ch_PP-OCRv4_det_infer', det_limit_side_len=960, det_limit_type='max', det_box_type='poly', det_db_thresh=0.3, det_db_box_thresh=0.6, det_db_unclip_ratio=1.5, max_batch_size=10, use_dilation=False, det_db_score_mode='fast', det_east_score_thresh=0.8, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_sast_score_thresh=0.5, det_sast_nms_thresh=0.2, det_pse_thresh=0, det_pse_box_thresh=0.85, det_pse_min_area=16, det_pse_scale=1, scales=[8, 16, 32], alpha=1.0, beta=1.0, fourier_degree=5, rec_algorithm='SVTR_LCNet', rec_model_dir='/Users/wangxin/.paddleocr/whl/rec/ch/ch_PP-OCRv4_rec_infer', rec_image_inverse=True, rec_image_shape='3, 48, 320', rec_batch_num=6, max_text_length=25, rec_char_dict_path='/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/ppocr/utils/ppocr_keys_v1.txt', use_space_char=True, vis_font_path='./doc/fonts/simfang.ttf', drop_score=0.5, e2e_algorithm='PGNet', e2e_model_dir=None, e2e_limit_side_len=768, e2e_limit_type='max', e2e_pgnet_score_thresh=0.5, e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_pgnet_valid_set='totaltext', e2e_pgnet_mode='fast', use_angle_cls=False, cls_model_dir='/Users/wangxin/.paddleocr/whl/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_image_shape='3, 48, 192', label_list=['0', '180'], cls_batch_num=6, cls_thresh=0.9, enable_mkldnn=False, cpu_threads=10, use_pdserving=False, warmup=False, sr_model_dir=None, sr_image_shape='3, 32, 128', sr_batch_num=1, draw_img_save_dir='./inference_results', save_crop_res=False, crop_res_save_dir='./output', use_mp=False, total_process_num=1, process_id=0, benchmark=False, save_log_path='./log_output/', 
show_log=True, use_onnx=False, return_word_box=False, output='./output', table_max_len=488, table_algorithm='TableAttn', table_model_dir=None, merge_no_span_structure=True, table_char_dict_path=None, layout_model_dir=None, layout_dict_path=None, layout_score_threshold=0.5, layout_nms_threshold=0.5, kie_algorithm='LayoutXLM', ser_model_dir=None, re_model_dir=None, use_visual_backbone=True, ser_dict_path='../train_data/XFUND/class_list_xfun.txt', ocr_order_method=None, mode='structure', image_orientation=False, layout=True, table=True, ocr=True, recovery=False, use_pdf2docx_api=False, invert=False, binarize=False, alphacolor=(255, 255, 255), lang='ch', det=True, rec=True, type='ocr', savefile=False, ocr_version='PP-OCRv4', structure_version='PP-StructureV2')
[2024/05/15 15:16:13] ppocr INFO: **********doc/imgs/1.jpg**********
Traceback (most recent call last):
  File "/Users/wangxin/miniconda3/envs/ppocr/bin/paddleocr", line 33, in <module>
    sys.exit(load_entry_point('paddleocr==2.8.0', 'console_scripts', 'paddleocr')())
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/paddleocr.py", line 880, in main
    result = engine.ocr(
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/paddleocr.py", line 727, in ocr
    dt_boxes, rec_res, _ = self.__call__(img, cls)
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/tools/infer/predict_system.py", line 83, in __call__
    dt_boxes, elapse = self.text_detector(img)
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/tools/infer/predict_det.py", line 379, in __call__
    dt_boxes, elapse = self.predict(img)
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/tools/infer/predict_det.py", line 268, in predict
    post_result = self.postprocess_op(preds, shape_list)
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/ppocr/postprocess/db_postprocess.py", line 240, in __call__
    boxes, scores = self.polygons_from_bitmap(
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/ppocr/postprocess/db_postprocess.py", line 92, in polygons_from_bitmap
    box = box.reshape(-1, 2)
AttributeError: 'list' object has no attribute 'reshape'

- the box reshape was mistakenly done at line 145; it is now done at line 92 of `db_postprocess.py`
- skip the box (`continue`) if it is empty
- reverted the mistaken `box.array(box)` back to `np.array(box)`
@zovelsanj
Contributor Author

Hi @GreatV, I have fixed the code now. The NumPy array conversion was missing at line 92. In case the image attached in the description doesn't reproduce the error, here is another one.
image.zip
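The `AttributeError` in the traceback above arises because `reshape` was called on a plain Python list rather than on a NumPy array. A minimal sketch of the difference follows; the values in `expanded` are made up for illustration and are not taken from the failing image.

```python
import numpy as np

# Hypothetical single-path result: a list containing one path of [x, y] points.
expanded = [[[0, 0], [4, 0], [4, 2], [0, 2]]]

try:
    expanded.reshape(-1, 2)   # a list has no reshape method
except AttributeError as exc:
    print(exc)

# Convert to an array first, then reshape to one row per point.
box = np.array(expanded).reshape(-1, 2)
print(box.shape)   # (4, 2)
```

The conversion only succeeds here because the list holds a single path; a multi-path list with differing point counts would still need to be filtered or padded beforehand.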

@GreatV
Collaborator

GreatV commented May 15, 2024

paddleocr --image_dir doc/imgs/1.jpg --det_box_type poly
[2024/05/15 15:45:48] ppocr DEBUG: Namespace(help='==SUPPRESS==', use_gpu=False, use_xpu=False, use_npu=False, use_mlu=False, ir_optim=True, use_tensorrt=False, min_subgraph_size=15, precision='fp32', gpu_mem=500, gpu_id=0, image_dir='doc/imgs/1.jpg', page_num=0, det_algorithm='DB', det_model_dir='/Users/wangxin/.paddleocr/whl/det/ch/ch_PP-OCRv4_det_infer', det_limit_side_len=960, det_limit_type='max', det_box_type='poly', det_db_thresh=0.3, det_db_box_thresh=0.6, det_db_unclip_ratio=1.5, max_batch_size=10, use_dilation=False, det_db_score_mode='fast', det_east_score_thresh=0.8, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_sast_score_thresh=0.5, det_sast_nms_thresh=0.2, det_pse_thresh=0, det_pse_box_thresh=0.85, det_pse_min_area=16, det_pse_scale=1, scales=[8, 16, 32], alpha=1.0, beta=1.0, fourier_degree=5, rec_algorithm='SVTR_LCNet', rec_model_dir='/Users/wangxin/.paddleocr/whl/rec/ch/ch_PP-OCRv4_rec_infer', rec_image_inverse=True, rec_image_shape='3, 48, 320', rec_batch_num=6, max_text_length=25, rec_char_dict_path='/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/ppocr/utils/ppocr_keys_v1.txt', use_space_char=True, vis_font_path='./doc/fonts/simfang.ttf', drop_score=0.5, e2e_algorithm='PGNet', e2e_model_dir=None, e2e_limit_side_len=768, e2e_limit_type='max', e2e_pgnet_score_thresh=0.5, e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_pgnet_valid_set='totaltext', e2e_pgnet_mode='fast', use_angle_cls=False, cls_model_dir='/Users/wangxin/.paddleocr/whl/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_image_shape='3, 48, 192', label_list=['0', '180'], cls_batch_num=6, cls_thresh=0.9, enable_mkldnn=False, cpu_threads=10, use_pdserving=False, warmup=False, sr_model_dir=None, sr_image_shape='3, 32, 128', sr_batch_num=1, draw_img_save_dir='./inference_results', save_crop_res=False, crop_res_save_dir='./output', use_mp=False, total_process_num=1, process_id=0, benchmark=False, save_log_path='./log_output/', 
show_log=True, use_onnx=False, return_word_box=False, output='./output', table_max_len=488, table_algorithm='TableAttn', table_model_dir=None, merge_no_span_structure=True, table_char_dict_path=None, layout_model_dir=None, layout_dict_path=None, layout_score_threshold=0.5, layout_nms_threshold=0.5, kie_algorithm='LayoutXLM', ser_model_dir=None, re_model_dir=None, use_visual_backbone=True, ser_dict_path='../train_data/XFUND/class_list_xfun.txt', ocr_order_method=None, mode='structure', image_orientation=False, layout=True, table=True, ocr=True, recovery=False, use_pdf2docx_api=False, invert=False, binarize=False, alphacolor=(255, 255, 255), lang='ch', det=True, rec=True, type='ocr', savefile=False, ocr_version='PP-OCRv4', structure_version='PP-StructureV2')
[2024/05/15 15:45:49] ppocr INFO: **********doc/imgs/1.jpg**********
Traceback (most recent call last):
  File "/Users/wangxin/miniconda3/envs/ppocr/bin/paddleocr", line 33, in <module>
    sys.exit(load_entry_point('paddleocr==2.8.0', 'console_scripts', 'paddleocr')())
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/paddleocr.py", line 880, in main
    result = engine.ocr(
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/paddleocr.py", line 727, in ocr
    dt_boxes, rec_res, _ = self.__call__(img, cls)
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/tools/infer/predict_system.py", line 83, in __call__
    dt_boxes, elapse = self.text_detector(img)
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/tools/infer/predict_det.py", line 379, in __call__
    dt_boxes, elapse = self.predict(img)
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/tools/infer/predict_det.py", line 272, in predict
    dt_boxes = self.filter_tag_det_res_only_clip(dt_boxes, ori_im.shape)
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/tools/infer/predict_det.py", line 212, in filter_tag_det_res_only_clip
    dt_boxes = np.array(dt_boxes_new)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

@GreatV
Collaborator

GreatV commented May 15, 2024

paddleocr --image_dir doc/imgs/1.jpg
[2024/05/15 15:45:36] ppocr DEBUG: Namespace(help='==SUPPRESS==', use_gpu=False, use_xpu=False, use_npu=False, use_mlu=False, ir_optim=True, use_tensorrt=False, min_subgraph_size=15, precision='fp32', gpu_mem=500, gpu_id=0, image_dir='doc/imgs/1.jpg', page_num=0, det_algorithm='DB', det_model_dir='/Users/wangxin/.paddleocr/whl/det/ch/ch_PP-OCRv4_det_infer', det_limit_side_len=960, det_limit_type='max', det_box_type='quad', det_db_thresh=0.3, det_db_box_thresh=0.6, det_db_unclip_ratio=1.5, max_batch_size=10, use_dilation=False, det_db_score_mode='fast', det_east_score_thresh=0.8, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_sast_score_thresh=0.5, det_sast_nms_thresh=0.2, det_pse_thresh=0, det_pse_box_thresh=0.85, det_pse_min_area=16, det_pse_scale=1, scales=[8, 16, 32], alpha=1.0, beta=1.0, fourier_degree=5, rec_algorithm='SVTR_LCNet', rec_model_dir='/Users/wangxin/.paddleocr/whl/rec/ch/ch_PP-OCRv4_rec_infer', rec_image_inverse=True, rec_image_shape='3, 48, 320', rec_batch_num=6, max_text_length=25, rec_char_dict_path='/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/ppocr/utils/ppocr_keys_v1.txt', use_space_char=True, vis_font_path='./doc/fonts/simfang.ttf', drop_score=0.5, e2e_algorithm='PGNet', e2e_model_dir=None, e2e_limit_side_len=768, e2e_limit_type='max', e2e_pgnet_score_thresh=0.5, e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_pgnet_valid_set='totaltext', e2e_pgnet_mode='fast', use_angle_cls=False, cls_model_dir='/Users/wangxin/.paddleocr/whl/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_image_shape='3, 48, 192', label_list=['0', '180'], cls_batch_num=6, cls_thresh=0.9, enable_mkldnn=False, cpu_threads=10, use_pdserving=False, warmup=False, sr_model_dir=None, sr_image_shape='3, 32, 128', sr_batch_num=1, draw_img_save_dir='./inference_results', save_crop_res=False, crop_res_save_dir='./output', use_mp=False, total_process_num=1, process_id=0, benchmark=False, save_log_path='./log_output/', 
show_log=True, use_onnx=False, return_word_box=False, output='./output', table_max_len=488, table_algorithm='TableAttn', table_model_dir=None, merge_no_span_structure=True, table_char_dict_path=None, layout_model_dir=None, layout_dict_path=None, layout_score_threshold=0.5, layout_nms_threshold=0.5, kie_algorithm='LayoutXLM', ser_model_dir=None, re_model_dir=None, use_visual_backbone=True, ser_dict_path='../train_data/XFUND/class_list_xfun.txt', ocr_order_method=None, mode='structure', image_orientation=False, layout=True, table=True, ocr=True, recovery=False, use_pdf2docx_api=False, invert=False, binarize=False, alphacolor=(255, 255, 255), lang='ch', det=True, rec=True, type='ocr', savefile=False, ocr_version='PP-OCRv4', structure_version='PP-StructureV2')
[2024/05/15 15:45:37] ppocr INFO: **********doc/imgs/1.jpg**********
Traceback (most recent call last):
  File "/Users/wangxin/miniconda3/envs/ppocr/bin/paddleocr", line 33, in <module>
    sys.exit(load_entry_point('paddleocr==2.8.0', 'console_scripts', 'paddleocr')())
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/paddleocr.py", line 880, in main
    result = engine.ocr(
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/paddleocr.py", line 727, in ocr
    dt_boxes, rec_res, _ = self.__call__(img, cls)
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/tools/infer/predict_system.py", line 83, in __call__
    dt_boxes, elapse = self.text_detector(img)
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/tools/infer/predict_det.py", line 379, in __call__
    dt_boxes, elapse = self.predict(img)
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/tools/infer/predict_det.py", line 268, in predict
    post_result = self.postprocess_op(preds, shape_list)
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/ppocr/postprocess/db_postprocess.py", line 246, in __call__
    boxes, scores = self.boxes_from_bitmap(
  File "/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/ppocr/postprocess/db_postprocess.py", line 143, in boxes_from_bitmap
    box = self.unclip(points, self.unclip_ratio).reshape(-1, 1, 2)
AttributeError: 'list' object has no attribute 'reshape'

@GreatV
Collaborator

GreatV commented May 15, 2024

hi @zovelsanj, please ensure it works well for both 'quad' and 'poly'.

For `--det_box_type poly`, pad the detected polygon arrays when their shapes differ so that all polygon arrays have the same shape
@zovelsanj
Contributor Author

Hi @GreatV, thanks. I only checked it for --det_box_type poly. I have now made the corresponding changes for --det_box_type quad as well. Also, I have added polygon padding to address the inhomogeneous part issue for the case of poly. These should fix all the issues you encountered.
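One way to pad polygons to a common length is to repeat each polygon's last point, which leaves its outline unchanged. The sketch below is an assumption about how such padding could look, not the code from this PR; `pad_polygon` and `pad_polygons` are hypothetical names, and every polygon is assumed to be non-empty.

```python
def pad_polygon(polygon, target_len):
    """Pad a list of [x, y] points to `target_len` by repeating the last point."""
    missing = target_len - len(polygon)
    if missing <= 0:
        return [list(point) for point in polygon]
    last = polygon[-1]
    return [list(point) for point in polygon] + [list(last) for _ in range(missing)]


def pad_polygons(polygons):
    """Pad every polygon to the length of the longest one."""
    if not polygons:
        return []
    target_len = max(len(polygon) for polygon in polygons)
    return [pad_polygon(polygon, target_len) for polygon in polygons]


# Two polygons with 3 and 4 points, as in the PR description.
polygons = [
    [[47, 79], [45, 79], [46, 78]],
    [[56, 78], [58, 78], [59, 79], [55, 79]],
]
padded = pad_polygons(polygons)
print([len(polygon) for polygon in padded])   # [4, 4]
print(padded[0])   # [[47, 79], [45, 79], [46, 78], [46, 78]]
```

After padding, every polygon has the same number of points, so the list as a whole is rectangular and can be converted to a single array.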

@GreatV
Collaborator

GreatV commented May 15, 2024

@zovelsanj, please install and configure pre-commit to pass the CI checks. I will run some tests to verify if it is working correctly.

@GreatV
Collaborator

GreatV commented May 18, 2024

It works.

paddleocr --image_dir doc/imgs/1.jpg
[2024/05/18 09:05:33] ppocr DEBUG: Namespace(help='==SUPPRESS==', use_gpu=False, use_xpu=False, use_npu=False, use_mlu=False, ir_optim=True, use_tensorrt=False, min_subgraph_size=15, precision='fp32', gpu_mem=500, gpu_id=0, image_dir='doc/imgs/1.jpg', page_num=0, det_algorithm='DB', det_model_dir='/Users/wangxin/.paddleocr/whl/det/ch/ch_PP-OCRv4_det_infer', det_limit_side_len=960, det_limit_type='max', det_box_type='quad', det_db_thresh=0.3, det_db_box_thresh=0.6, det_db_unclip_ratio=1.5, max_batch_size=10, use_dilation=False, det_db_score_mode='fast', det_east_score_thresh=0.8, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_sast_score_thresh=0.5, det_sast_nms_thresh=0.2, det_pse_thresh=0, det_pse_box_thresh=0.85, det_pse_min_area=16, det_pse_scale=1, scales=[8, 16, 32], alpha=1.0, beta=1.0, fourier_degree=5, rec_algorithm='SVTR_LCNet', rec_model_dir='/Users/wangxin/.paddleocr/whl/rec/ch/ch_PP-OCRv4_rec_infer', rec_image_inverse=True, rec_image_shape='3, 48, 320', rec_batch_num=6, max_text_length=25, rec_char_dict_path='/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/ppocr/utils/ppocr_keys_v1.txt', use_space_char=True, vis_font_path='./doc/fonts/simfang.ttf', drop_score=0.5, e2e_algorithm='PGNet', e2e_model_dir=None, e2e_limit_side_len=768, e2e_limit_type='max', e2e_pgnet_score_thresh=0.5, e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_pgnet_valid_set='totaltext', e2e_pgnet_mode='fast', use_angle_cls=False, cls_model_dir='/Users/wangxin/.paddleocr/whl/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_image_shape='3, 48, 192', label_list=['0', '180'], cls_batch_num=6, cls_thresh=0.9, enable_mkldnn=False, cpu_threads=10, use_pdserving=False, warmup=False, sr_model_dir=None, sr_image_shape='3, 32, 128', sr_batch_num=1, draw_img_save_dir='./inference_results', save_crop_res=False, crop_res_save_dir='./output', use_mp=False, total_process_num=1, process_id=0, benchmark=False, save_log_path='./log_output/', 
show_log=True, use_onnx=False, return_word_box=False, output='./output', table_max_len=488, table_algorithm='TableAttn', table_model_dir=None, merge_no_span_structure=True, table_char_dict_path=None, layout_model_dir=None, layout_dict_path=None, layout_score_threshold=0.5, layout_nms_threshold=0.5, kie_algorithm='LayoutXLM', ser_model_dir=None, re_model_dir=None, use_visual_backbone=True, ser_dict_path='../train_data/XFUND/class_list_xfun.txt', ocr_order_method=None, mode='structure', image_orientation=False, layout=True, table=True, ocr=True, recovery=False, use_pdf2docx_api=False, invert=False, binarize=False, alphacolor=(255, 255, 255), lang='ch', det=True, rec=True, type='ocr', savefile=False, ocr_version='PP-OCRv4', structure_version='PP-StructureV2')
[2024/05/18 09:05:34] ppocr INFO: **********doc/imgs/1.jpg**********
[2024/05/18 09:05:34] ppocr DEBUG: dt_boxes num : 2, elapsed : 0.3739910125732422
[2024/05/18 09:05:34] ppocr DEBUG: rec_res num  : 2, elapsed : 0.35309481620788574
[2024/05/18 09:05:34] ppocr INFO: [[[296.0, 299.0], [331.0, 298.0], [346.0, 849.0], [311.0, 850.0]], ('土地整治与土壤修复研究中心', 0.9767227172851562)]
[2024/05/18 09:05:34] ppocr INFO: [[[346.0, 300.0], [378.0, 299.0], [386.0, 663.0], [354.0, 664.0]], ('华南农业大学—东图', 0.8495075702667236)]
paddleocr --image_dir doc/imgs/1.jpg --det_box_type poly
[2024/05/18 09:05:51] ppocr DEBUG: Namespace(help='==SUPPRESS==', use_gpu=False, use_xpu=False, use_npu=False, use_mlu=False, ir_optim=True, use_tensorrt=False, min_subgraph_size=15, precision='fp32', gpu_mem=500, gpu_id=0, image_dir='doc/imgs/1.jpg', page_num=0, det_algorithm='DB', det_model_dir='/Users/wangxin/.paddleocr/whl/det/ch/ch_PP-OCRv4_det_infer', det_limit_side_len=960, det_limit_type='max', det_box_type='poly', det_db_thresh=0.3, det_db_box_thresh=0.6, det_db_unclip_ratio=1.5, max_batch_size=10, use_dilation=False, det_db_score_mode='fast', det_east_score_thresh=0.8, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_sast_score_thresh=0.5, det_sast_nms_thresh=0.2, det_pse_thresh=0, det_pse_box_thresh=0.85, det_pse_min_area=16, det_pse_scale=1, scales=[8, 16, 32], alpha=1.0, beta=1.0, fourier_degree=5, rec_algorithm='SVTR_LCNet', rec_model_dir='/Users/wangxin/.paddleocr/whl/rec/ch/ch_PP-OCRv4_rec_infer', rec_image_inverse=True, rec_image_shape='3, 48, 320', rec_batch_num=6, max_text_length=25, rec_char_dict_path='/Users/wangxin/miniconda3/envs/ppocr/lib/python3.10/site-packages/paddleocr-2.8.0-py3.10.egg/paddleocr/ppocr/utils/ppocr_keys_v1.txt', use_space_char=True, vis_font_path='./doc/fonts/simfang.ttf', drop_score=0.5, e2e_algorithm='PGNet', e2e_model_dir=None, e2e_limit_side_len=768, e2e_limit_type='max', e2e_pgnet_score_thresh=0.5, e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_pgnet_valid_set='totaltext', e2e_pgnet_mode='fast', use_angle_cls=False, cls_model_dir='/Users/wangxin/.paddleocr/whl/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_image_shape='3, 48, 192', label_list=['0', '180'], cls_batch_num=6, cls_thresh=0.9, enable_mkldnn=False, cpu_threads=10, use_pdserving=False, warmup=False, sr_model_dir=None, sr_image_shape='3, 32, 128', sr_batch_num=1, draw_img_save_dir='./inference_results', save_crop_res=False, crop_res_save_dir='./output', use_mp=False, total_process_num=1, process_id=0, benchmark=False, save_log_path='./log_output/', 
show_log=True, use_onnx=False, return_word_box=False, output='./output', table_max_len=488, table_algorithm='TableAttn', table_model_dir=None, merge_no_span_structure=True, table_char_dict_path=None, layout_model_dir=None, layout_dict_path=None, layout_score_threshold=0.5, layout_nms_threshold=0.5, kie_algorithm='LayoutXLM', ser_model_dir=None, re_model_dir=None, use_visual_backbone=True, ser_dict_path='../train_data/XFUND/class_list_xfun.txt', ocr_order_method=None, mode='structure', image_orientation=False, layout=True, table=True, ocr=True, recovery=False, use_pdf2docx_api=False, invert=False, binarize=False, alphacolor=(255, 255, 255), lang='ch', det=True, rec=True, type='ocr', savefile=False, ocr_version='PP-OCRv4', structure_version='PP-StructureV2')
[2024/05/18 09:05:51] ppocr INFO: **********doc/imgs/1.jpg**********
[2024/05/18 09:05:51] ppocr DEBUG: dt_boxes num : 2, elapsed : 0.35208606719970703
[2024/05/18 09:05:52] ppocr DEBUG: rec_res num  : 2, elapsed : 0.39710426330566406
[2024/05/18 09:05:52] ppocr INFO: [[[321, 303], [324, 305], [327, 309], [329, 319], [329, 321], [338, 684], [343, 810], [343, 811], [340, 840], [339, 843], [335, 847], [330, 847], [328, 846], [317, 841], [314, 839], [313, 834], [307, 486], [300, 339], [301, 313], [302, 309], [306, 304], [311, 302], [317, 302], [317, 302], [317, 302], [317, 302], [317, 302], [317, 302], [317, 302], [317, 302], [317, 302], [317, 302], [317, 302]], ('土地整治与土壤修复研究中心', 0.9865721464157104)]
[2024/05/18 09:05:52] ppocr INFO: [[[369, 307], [372, 309], [374, 313], [377, 356], [375, 416], [379, 456], [379, 513], [377, 523], [377, 545], [379, 561], [380, 648], [379, 652], [375, 658], [372, 660], [367, 660], [364, 659], [361, 656], [360, 654], [359, 649], [358, 648], [356, 540], [358, 540], [355, 532], [354, 531], [355, 474], [352, 418], [353, 355], [349, 340], [349, 339], [351, 313], [353, 308], [355, 305], [359, 303]], ('华南农业大学—东园', 0.868368923664093)]

Collaborator

@GreatV GreatV left a comment

LGTM

@GreatV GreatV merged commit 502e167 into PaddlePaddle:main May 18, 2024
3 checks passed