Allow using the differential binary-map of DB-Net as output. #1409

helpmefindaname · 2023-12-14T15:12:27Z

🚀 The feature

Add a parameter use_binary_map: bool = False to the DBNet class. When use_binary_map is False, the model should behave exactly as before. When it is True, the model should compute the "binarized map" described in the DBNet paper and use it, instead of the prob_map, to produce out["preds"].
For example, if use_binary_map is True, the forward pass should execute the following:

        if target is None or return_preds:
            # Compute the threshold map, then the binarized map from prob_map and thresh_map
            thresh_map = torch.sigmoid(self.thresh_head(feat_concat))
            bin_map = 1 / (1 + torch.exp(-50.0 * (prob_map - thresh_map)))

            # Post-process boxes (keep only text predictions)
            out["preds"] = [
                dict(zip(self.class_names, preds))
                for preds in self.postprocessor(bin_map.detach().cpu().permute((0, 2, 3, 1)).numpy())
            ]
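
To make the formula in the snippet above concrete, here is a minimal, dependency-free sketch of the same computation on plain nested lists. It is an illustration only, not the proposed implementation: the helper names stable_sigmoid, binary_map, and select_map are hypothetical, and the factor 50.0 is taken from the snippet. Since 1 / (1 + exp(-x)) is the logistic sigmoid, the bin_map expression is mathematically the same as a sigmoid applied to 50.0 * (prob_map - thresh_map).

```python
import math

K = 50.0  # amplification factor, taken from the snippet above


def stable_sigmoid(x: float) -> float:
    """Logistic function written so that math.exp never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def binary_map(prob_map, thresh_map, k=K):
    """Element-wise sigmoid(k * (P - T)) over two equally shaped 2D lists."""
    return [
        [stable_sigmoid(k * (p - t)) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


def select_map(prob_map, thresh_map, use_binary_map=False):
    """Hypothetical switch mirroring the requested flag: default is unchanged."""
    return binary_map(prob_map, thresh_map) if use_binary_map else prob_map


# Toy 2x2 maps (made-up values, for illustration only)
prob = [[0.55, 0.30], [0.90, 0.48]]
thresh = [[0.40, 0.40], [0.40, 0.50]]
bm = binary_map(prob, thresh)
```

Because of the steep factor, pixels whose probability is clearly above the local threshold are pushed close to 1 and pixels clearly below it close to 0, while pixels near the threshold stay in between. With use_binary_map left at False, select_map returns the prob_map untouched, which matches the requirement that the default behavior does not change.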

Motivation, pitch

A qualitative analysis of some receipts showed me that the bin_map seems more robust than the currently used prob_map, and a quantitative analysis showed an improvement in both CER (-0.5%) and WER (-1.4%). This gives us better results at the cost of little additional computation.

Alternatives

No response

Additional context

An example from my qualitative analysis, showing spots where the prob_map is less confident while the bin_map marks the region better.

helpmefindaname added the type: enhancement Improvement label Dec 14, 2023
