Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(evaluate): add signwriting evaluation metrics #1104

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

AmitMY
Copy link

@AmitMY AmitMY commented Feb 10, 2024

This pull request fixes #1103

Changes here are complete (this works), but we might want to figure out a different solution, in case this is too much hardcoding for you all (perhaps something like an external evaluation script, and a single custom metric that can call that script)

Note! that unlike BLEU or other metrics, this evaluation takes all factors into account, as they matter for SignWriting. Even if using a non-factored setup though, this evaluation works.

What do you think? how can we integrate this into sockeye?
This is very important to me, to push forward the adoption of SignWriting as a written form of signed languages, allowing both transcription and translation models to be trained with sockeye.

Tests fail with the following error, despite an Nvidia Tesla T4 GPU available:

ValueError: Expected a cuda device, but got: CPU

Pull Request Checklist

  • Changes are complete (if posting work-in-progress code, prefix your pull request title with '[WIP]'
    until you can check this box.
  • Unit tests pass (pytest)
  • System tests pass (pytest test/system)
  • Passed code style checking (./style-check.sh)
  • You have considered writing a test
  • Updated major/minor version in sockeye/__init__.py. Major version bump if this is a backwards incompatible change.
  • Updated CHANGELOG.md

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@mjdenkowski
Copy link
Contributor

Hi Amit,

It's great to see that you have an implementation working! We can think about two questions for adding new features to Sockeye:

  1. How can we make the feature general enough to be broadly useful for the community?
  2. How can we make the feature available in a way where it can be easily understood and safely accessed?

For now, it looks like you have an implementation that works for your use case and that you could share with other interested users via a forked repository. In that case, you would control everything and wouldn't need to add extra layers of engineering.

Running your metric with the main version of Sockeye would be more challenging. Adding support for custom metric scripts is a great idea from a science perspective. Since Sockeye is also built to run in production environments, we need to avoid security vulnerabilities like arbitrary code execution. Designing, implementing, and documenting a secure way to run user-defined metrics would significantly increase the scope of this project.

I recommend using your fork of Sockeye for now and revisiting this if using the fork starts to block your work or the work of others working on SignWriting.

Best,
Michael

@AmitMY
Copy link
Author

AmitMY commented Feb 14, 2024

Thanks.
I think that since my implementation is generic for SignWriting outputs, it might be valid to introduce inside of sockeye.
It does not change the behavior for spoken language users, but all of a sudden allows for more experimentation with signed languages.

In any case, for the meanwhile I am successfully using my fork.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Adding custom metrics
2 participants