Data Corruptors a la GeCO #175

Open
aflaxman opened this issue Apr 20, 2022 · 0 comments

I've been developing some data corruption algorithms, inspired by the documentation at https://dmm.anu.edu.au/geco/flex-data-gen-manual.pdf (I have not looked at the source code, since it has an unusual license). I wonder whether your excellent project would be interested in some pull requests that add Python implementations to your recordlinkage.datasets submodule.

I'm imagining methods such as corrupt.ocr_noise(s: str) -> str. If this sounds of interest, I can put together a PR or use this ticket to discuss the design further. And if this is beyond the scope of what you want for your module, I understand!
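
To make the idea concrete, here is a minimal sketch of what a method like ocr_noise might look like. Everything beyond the `s: str -> str` shape is an assumption for illustration: the `p` and `rng` parameters, the `OCR_CONFUSIONS` table, and its entries are hypothetical, are not taken from GeCO, and are not part of recordlinkage today.

```python
import random
from typing import Optional

# Hypothetical table of visually similar glyphs (illustrative only, not GeCO's).
OCR_CONFUSIONS = {
    "0": ["O"], "O": ["0"],
    "1": ["l", "I"], "l": ["1", "I"], "I": ["1", "l"],
    "5": ["S"], "S": ["5"],
    "m": ["rn"], "rn": ["m"],
}
MAX_KEY_LEN = max(len(key) for key in OCR_CONFUSIONS)


def ocr_noise(s: str, p: float = 0.1, rng: Optional[random.Random] = None) -> str:
    """Return a copy of s with OCR-style substitutions applied.

    Each substring that matches a key in OCR_CONFUSIONS is replaced with
    probability p (an assumed parameter). Pass a seeded rng for
    reproducible output.
    """
    if rng is None:
        rng = random.Random()
    out = []
    i = 0
    while i < len(s):
        replaced = False
        # Try the longest candidate first so "rn" is considered before "r".
        for width in range(MAX_KEY_LEN, 0, -1):
            token = s[i:i + width]
            if len(token) == width and token in OCR_CONFUSIONS and rng.random() < p:
                out.append(rng.choice(OCR_CONFUSIONS[token]))
                i += width
                replaced = True
                break
        if not replaced:
            # No substitution at this position: copy the character unchanged.
            out.append(s[i])
            i += 1
    return "".join(out)
```

With a seeded generator, for example `ocr_noise("Flaxman 2015", p=0.5, rng=random.Random(0))`, the corruption is repeatable, which seems useful for building test datasets. Other corruptors (keyboard typos, phonetic variation, missing values) could follow the same string-in, string-out shape.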
