[Feature] Support Multi-Scale Training for Text Detection #1714

Mountchicken · 2023-02-10T07:56:14Z

Multi-scale training is an attractive trick for text detection because text scale varies widely between images.

Supporting multi-scale training is simple: we only need to change the text target generation to use data_sample.batch_input_shape instead of data_sample.img_shape. This change does not affect the existing detectors in MMOCR, because their input size is fixed, i.e. data_sample.img_shape == data_sample.batch_input_shape.
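To illustrate why the target has to follow the batch shape, here is a minimal NumPy sketch. It is not MMOCR code: `pad_batch` and `make_target` are hypothetical helpers, and the bottom/right padding is an assumption about how images of different sizes get batched. It only shows that a target built from the per-image shape no longer lines up with the padded input once sizes differ.

```python
import numpy as np


def pad_batch(images, pad_value=0):
    """Pad HxWxC images on the bottom/right to one common shape (hypothetical helper)."""
    batch_h = max(img.shape[0] for img in images)
    batch_w = max(img.shape[1] for img in images)
    channels = images[0].shape[2]
    batch = np.full((len(images), batch_h, batch_w, channels),
                    pad_value, dtype=images[0].dtype)
    for i, img in enumerate(images):
        h, w = img.shape[:2]
        batch[i, :h, :w] = img
    return batch, (batch_h, batch_w)


def make_target(boxes, target_shape):
    """Rasterize axis-aligned (x1, y1, x2, y2) boxes into a binary mask (hypothetical helper)."""
    mask = np.zeros(target_shape, dtype=np.uint8)
    for x1, y1, x2, y2 in boxes:
        mask[y1:y2, x1:x2] = 1
    return mask


# Two images resized to different heights, as multi-scale training would produce.
images = [np.ones((800, 1280, 3), np.uint8), np.ones((1024, 1280, 3), np.uint8)]
batch, batch_input_shape = pad_batch(images)
img_shape = images[0].shape[:2]  # per-image shape before padding

boxes = [(10, 10, 110, 60)]
target_ok = make_target(boxes, batch_input_shape)  # matches the padded input
target_bad = make_target(boxes, img_shape)         # too small for the padded input
```

With a fixed input size both shapes coincide, which is why using the batch shape leaves fixed-size detectors unchanged.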

To use multi-scale training, here is a simple config:

train_pipeline = [
    dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
    dict(
        type='LoadOCRAnnotations',
        with_bbox=True,
        with_polygon=True,
        with_label=True,
    ),
    dict(
        type='RandomResize',
        scale=[(1280, 800), (1280, 1024)],
        keep_ratio=True),
    dict(
        type='PackTextDetInputs',
        meta_keys=('img_path', 'ori_shape', 'img_shape'))
]
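As a rough sketch of what the `RandomResize` step does with the two scales above, the following plain-Python snippet samples each edge uniformly between the two given values and then rescales with the aspect ratio preserved. This reflects my understanding of the transform rather than its actual source; `sample_scale` and `rescale_keep_ratio` are hypothetical names, and the rounding rule is an assumption.

```python
import random


def sample_scale(scales, rng=random):
    """Sample one (w, h) scale between two given scales (assumed behavior)."""
    (w0, h0), (w1, h1) = scales
    w = rng.randint(min(w0, w1), max(w0, w1))
    h = rng.randint(min(h0, h1), max(h0, h1))
    return w, h


def rescale_keep_ratio(img_w, img_h, scale):
    """Fit the image inside `scale` without changing its aspect ratio."""
    max_long, max_short = max(scale), min(scale)
    factor = min(max_long / max(img_w, img_h), max_short / min(img_w, img_h))
    return int(img_w * factor + 0.5), int(img_h * factor + 0.5)


scale = sample_scale([(1280, 800), (1280, 1024)])
new_w, new_h = rescale_keep_ratio(1920, 1080, scale)
```

Because the sampled scale changes from one image to the next, images in the same batch can end up with different sizes, which is the case the batch_input_shape change handles.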

…/ms_train

support multi scale train

023c813

mm-assistant bot assigned gaotongxiao Feb 10, 2023

Mountchicken added 2 commits February 10, 2023 16:26

Merge branch 'dev-1.x' of https://github.com/open-mmlab/mmocr into jq…

efd4a6d

…/ms_train

fix lint

0e2bd75

gaotongxiao requested a review from Harold-lkk February 20, 2023 06:37

gaotongxiao added the 1.0.0 label Feb 27, 2023
