Expose errant_compare functionality via the API #25

Open
Mindful opened this issue Mar 1, 2021 · 3 comments

Comments


Mindful commented Mar 1, 2021

It would be great if the functionality in the errant_compare command were available as an API call, so it could be used for things like early stopping when training GEC models.

I've looked through the compare_m2 file, and it doesn't look like it would take much work to refactor things so that everything works the same way but it's also possible to import a function that returns a dict with the computed scores instead of printing them. If this is the kind of thing you'd be willing to accept a PR for, I'd be happy to give it a go myself sometime in the next couple of weeks. If not, it would be great if you could get to it at some point.
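For reference, a minimal sketch of what such an importable scoring function might return. The function name and dict keys here are hypothetical, not part of ERRANT's current API; the F-beta computation is just the standard definition, so it may differ from compare_m2's exact edge-case handling:

```python
# Hypothetical sketch: a scoring helper that returns a dict instead of
# printing a table. Name and key names are illustrative only; ERRANT
# does not currently expose this function.

def score_from_counts(tp: int, fp: int, fn: int, beta: float = 0.5) -> dict:
    """Compute precision, recall and F-beta from edit match counts."""
    p = tp / (tp + fp) if tp + fp else 1.0
    r = tp / (tp + fn) if tp + fn else 1.0
    f = (1 + beta**2) * p * r / (beta**2 * p + r) if p + r else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": p, "recall": r, f"f{beta}": f}
```

A caller doing early stopping could then just watch `score_from_counts(...)[ "f0.5"]` between epochs instead of scraping printed output.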

@chrisjbryant
Owner

That's a good idea and something I've vaguely thought about before, so I'll definitely add it to my list of enhancements.
You're also right that it hopefully shouldn't involve too much refactoring. I'll take a look in the next couple of days and see how it goes!

@chrisjbryant
Owner

Hmm, having looked into it and thought about it a bit more, I think it will require a bit more work than I anticipated.

The evaluation script is currently a standalone script that stores edits as lists, but I would probably want to convert them to Edit objects for full integration into ERRANT; otherwise it seems messy to compare hypothesis Edit objects against reference edit lists (or not use Edit objects at all). That's something I can still do, but it will require changing almost all the eval functions, so don't expect anything soon!
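To illustrate the list-based comparison the script currently does, span-level TP/FP/FN counts boil down to set intersection over hashable edit representations. The tuple shape `(start, end, correction)` below is illustrative only, not compare_m2's exact internal representation:

```python
# Illustrative only: span-based edit matching via set intersection.
# The real compare_m2 also handles error types, detection vs. correction
# modes, and edit filtering; this just shows the core counting idea.

def count_matches(hyp_edits, ref_edits):
    """Count TP/FP/FN between hypothesis and reference edit sets.

    Edits are hashable tuples, e.g. (start, end, correction_string).
    """
    hyp, ref = set(hyp_edits), set(ref_edits)
    tp = len(hyp & ref)  # edits proposed and present in the reference
    fp = len(hyp - ref)  # proposed but not in the reference
    fn = len(ref - hyp)  # in the reference but missed
    return tp, fp, fn

print(count_matches([(0, 1, "the"), (3, 4, "cats")],
                    [(0, 1, "the"), (5, 6, "ran")]))  # prints (1, 1, 1)
```

Converting both sides to Edit objects would mean making those objects hashable/comparable on the same fields, which is why it touches most of the eval functions.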

In the meantime, feel free to work on something yourself and submit a PR. I can't promise I'll accept it, but it should at least give me a better idea of how to implement this in the future. Alternatively, you could try running errant_compare as a subprocess and then processing the stdout. It's ugly, but it should work.
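As a stopgap, the subprocess route might look like the sketch below. The flag names (`-hyp`, `-ref`) and the score-table layout (a `TP FP FN Prec Rec F0.5` header row followed by a row of numbers) are assumptions from memory, so the parsing will likely need adjusting against real output from your errant_compare version:

```python
import subprocess

def parse_scores(stdout: str) -> dict:
    """Parse the score table out of errant_compare's printed output.

    Assumes a header row starting with "TP" followed directly by a row
    of numbers -- verify this against your errant_compare version.
    """
    rows = [line.split() for line in stdout.splitlines() if line.split()]
    for header, values in zip(rows, rows[1:]):
        if header[0] == "TP":
            return dict(zip(header, map(float, values)))
    raise ValueError("no score table found in errant_compare output")

def errant_compare(hyp_m2: str, ref_m2: str) -> dict:
    # The -hyp/-ref flag names are assumed; check `errant_compare -h`.
    result = subprocess.run(
        ["errant_compare", "-hyp", hyp_m2, "-ref", ref_m2],
        capture_output=True, text=True, check=True,
    )
    return parse_scores(result.stdout)
```

It is ugly, but it keeps the installed errant_compare as the single source of truth for the numbers, which is handy until a proper API exists.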


Mindful commented Mar 4, 2021

I hadn't thought of using subprocess - that's a good point, although you're right that it's pretty ugly.

In terms of a PR, to be honest I was going to try to change as little as possible: just refactor the existing code slightly so that there's an entry point function and a way to get the output as a dict of strings. I think what you're suggesting with the Edit objects probably makes more sense, though, so I'm not sure it's worth me producing a hacky PR. If I end up with extra time on my hands, I'll try a rewrite using Edit objects and send you that as a PR, but it's likely not anything I can do soon.

Fortunately I don't urgently need this, so it's enough for me if one of us gets to it at some point. If I find time and start working on a PR I'll post another comment so we don't end up doing the same work twice; otherwise, if/when you get to it I'll try to be your first user.
