Allow using the differential binary-map of DB-Net as output. #1409

Open
helpmefindaname opened this issue Dec 14, 2023 · 0 comments
@helpmefindaname
Contributor

🚀 The feature

Add a parameter use_binary_map: bool = False to the DBNet class. When use_binary_map is False, the model should behave exactly as before. When it is True, the model should compute the "binarized map" described in the DBNet paper and use it instead of prob_map to produce out["preds"].
For example, if use_binary_map is True, the forward pass should execute the following code:

        if target is None or return_preds:
            # Approximate binary map from the DBNet paper, with amplification factor k = 50
            thresh_map = torch.sigmoid(self.thresh_head(feat_concat))
            bin_map = 1 / (1 + torch.exp(-50.0 * (prob_map - thresh_map)))

            # Post-process boxes (keep only text predictions)
            out["preds"] = [
                dict(zip(self.class_names, preds))
                for preds in self.postprocessor(bin_map.detach().cpu().permute((0, 2, 3, 1)).numpy())
            ]
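To make the effect of the formula above concrete, here is a minimal sketch in plain Python (no torch, so it runs without dependencies). The helper names binarize and select_output_map and the sample values are hypothetical and only illustrate the proposed behaviour; they are not part of the DBNet class. The factor 50.0 is taken from the snippet above.

```python
import math

K = 50.0  # amplification factor; the snippet above hard-codes 50.0


def binarize(prob: float, thresh: float, k: float = K) -> float:
    """Approximate binarization: sigmoid(k * (prob - thresh))."""
    x = k * (prob - thresh)
    # Numerically stable sigmoid: never exponentiates a large positive number.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def select_output_map(prob_map, thresh_map, use_binary_map=False):
    """Return the map that would be handed to the postprocessor (hypothetical helper)."""
    if not use_binary_map:
        # Default: unchanged behaviour, the probability map is used as-is.
        return list(prob_map)
    return [binarize(p, t) for p, t in zip(prob_map, thresh_map)]


# Four illustrative pixels: clearly text, two borderline, clearly background.
prob_map = [0.90, 0.55, 0.45, 0.10]
thresh_map = [0.30, 0.50, 0.50, 0.30]

bin_map = select_output_map(prob_map, thresh_map, use_binary_map=True)
for p, t, b in zip(prob_map, thresh_map, bin_map):
    print(f"prob={p:.2f} thresh={t:.2f} -> bin={b:.4f}")
```

With these sample values, a borderline pixel at 0.55 against a local threshold of 0.50 is pushed to roughly 0.92, while one at 0.45 drops below 0.1, which is the sharpening the request relies on. On tensors, the same expression should be equivalent to torch.sigmoid(50.0 * (prob_map - thresh_map)), which avoids evaluating exp on large positive arguments.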

Motivation, pitch

A qualitative analysis of some receipts showed me that the bin_map seemed more robust than the currently used prob_map, and a quantitative analysis showed an improvement in CER (-0.5%) and WER (-1.4%). This gives us an improvement at the cost of little additional computation.

Alternatives

No response

Additional context

An example from my qualitative analysis:
image
It shows spots where the prob_map is less confident while the bin_map marks the region better.
