Bypassing validation #341

Open
bmcfee opened this issue Jul 29, 2021 · 2 comments

Comments

bmcfee (Collaborator) commented Jul 29, 2021

Warning, possibly bad ideas to follow.

mir_eval is generally pretty defensive about its inputs, and (I think?) all of the evaluators include at least some minimal validation of inputs for proper formatting, semantics, data types, etc. This is a good thing in general, but it can be quite wasteful when validation checks are expensive and performed many times on the same inputs. In a current project of mine, I'm finding that more than half of the computation time is going into validation, and most of this is wasted, as the comparisons are of different model outputs against the same reference data.

This makes me think that we could provide a mechanism to bypass input validation in situations where the user is certain that the data is well-formed. (Note: for now, I've simply commented out the validation lines in question, but that's obviously a bad solution in general.) We could control this with a context manager and a global state variable (yuck, I know), which would be pretty slick from a user perspective:

scores = mir_eval.chord.evaluate(...)  # Does validation checks as usual

with mir_eval.unsafe():
    scores = mir_eval.chord.evaluate(...)  # Skips validation checks

Ugly as a global state variable is, I think I prefer it to threading flag parameters through all evaluators here. This is primarily because we already bury quite a few things in kwargs, and it would be easy to lose track. Forcing a context manager for this would at least make it explicit and visually distinct from the user's perspective when this is happening.
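
A minimal sketch of what that could look like (purely hypothetical; mir_eval has no such flag or context manager today, and the names below are made up):

import contextlib

_VALIDATE = True  # hypothetical module-level switch

@contextlib.contextmanager
def unsafe():
    """Temporarily disable input validation inside the with-block."""
    global _VALIDATE
    previous = _VALIDATE
    _VALIDATE = False
    try:
        yield
    finally:
        _VALIDATE = previous  # restore even if the block raises

def validate(reference_labels, estimated_labels):
    if not _VALIDATE:
        return  # caller has promised the inputs are well-formed
    # ... existing validation checks would run here ...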

Or it might just be a bad idea. 🤷 Thoughts?

craffel (Owner) commented Jul 29, 2021

I mean, there's a case to be made that we shouldn't be validating at the start of each metric function. This means, for example, that when we call a submodule's evaluate function, we validate once for each metric called, which is kind of silly. We could just leave it up to the end-user to call validate before calling a metric function, or we could do something weird like call validate when an exception is raised in the metric function (to get a more helpful error message). In general though I think a better solution is to think through why a given validation function is so expensive, and whether it needs to be (i.e., whether we can just be lazy and throw a useful error later, when the issue actually surfaces inside the metric function, instead of validating ahead of time).
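
For concreteness, the "validate only when an exception is raised" option could look something like this hypothetical decorator (not part of mir_eval):

import functools

def validate_on_error(validator):
    """Decorator: run the metric directly, and only validate if it raises."""
    def decorator(metric):
        @functools.wraps(metric)
        def wrapper(reference, estimate, **kwargs):
            try:
                return metric(reference, estimate, **kwargs)
            except Exception:
                # If validation can explain the failure, it raises a clearer
                # error here; otherwise re-raise the original exception.
                validator(reference, estimate)
                raise
        return wrapper
    return decorator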

Alternatively, we could consider adding caching to the validators that process either the estimates or the references alone: that way, if a validator gets called with the same reference multiple times, it will just use the cached result. Of course, this is only a good idea if the references are small in memory.
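
As a sketch, a validator that only looks at the reference could be memoized on the raw bytes of its input (hypothetical helper; assumes the reference is a NumPy array):

_seen_references = set()

def validate_reference_cached(reference_intervals):
    """Skip re-validating a reference array that has already been checked."""
    key = hash(reference_intervals.tobytes())
    if key in _seen_references:
        return
    # ... run the real (expensive) checks here ...
    _seen_references.add(key)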

bmcfee (Collaborator, Author) commented Jul 29, 2021

> or we could do something weird like call validate when an exception is raised in the metric function (to get a more helpful error message)

I don't think that would work -- most of the validation checks exist to prevent errors that would propagate silently.

OTOH, there's a good case to be made for validating once in evaluate and then bypassing subsequent validations in the individual metrics. This would need to be configurable though, if we're to continue allowing direct calls to individual metrics (which I think we should).
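
A toy sketch of that shape, with a made-up opt-out flag on each metric (mir_eval's metrics don't currently take such a parameter):

def validate(reference, estimate):
    """Stand-in for the expensive input checks."""
    pass

def metric_a(reference, estimate, do_validation=True):
    if do_validation:  # direct calls keep validating by default
        validate(reference, estimate)
    return 0.0  # placeholder score

def evaluate(reference, estimate):
    validate(reference, estimate)  # validate once, up front
    return {'Metric A': metric_a(reference, estimate, do_validation=False)}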

> In general though I think a better solution is to think through why a given validation function is so expensive, and whether it needs to be

I agree in principle, but for most of the cases I have in mind, the checks are already about as simple as they can feasibly be.

> Alternatively, we could consider adding caching to the validators that process either the estimates or the references alone: that way, if a validator gets called with the same reference multiple times, it will just use the cached result. Of course, this is only a good idea if the references are small in memory.

Yes, though caching in Python is kind of a pain. The best option IMO is memoizing via joblib, but that doesn't provide any policies for cache eviction or size management.
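
For reference, memoizing a validator with joblib looks roughly like this (the cache directory is arbitrary, and as noted there is no eviction or size policy):

from joblib import Memory

memory = Memory(location='/tmp/mir_eval_validation_cache', verbose=0)

@memory.cache
def validate_reference(reference_intervals, reference_labels):
    """Expensive checks on the reference annotation; results are cached on disk."""
    # ... real validation logic here ...
    return True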
