This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

Upgrade sacremoses to 0.0.44 #1548

Open
barry-jin opened this issue Apr 12, 2021 · 0 comments
Labels
bug Something isn't working


barry-jin commented Apr 12, 2021

Creating this issue to track the updates in sacremoses, since its MosesPunctNormalizer behaves differently between releases 0.0.43 and 0.0.44.

For sacremoses==0.0.43

>>> from gluonnlp.data.filtering import MosesNormalizer
>>> normalizer = MosesNormalizer('en')
>>> normalizer('    hello  world!!".\t\t\r')
' hello world!!."  '

For sacremoses==0.0.44

>>> from gluonnlp.data.filtering import MosesNormalizer
>>> normalizer = MosesNormalizer('en')
>>> normalizer('    hello  world!!".\t\t\r')
'hello world!!."'

It looks like, as of 0.0.44, the output string is stripped of leading and trailing whitespace.
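Until the pinned version is settled, a test that has to pass under both releases could compare results while ignoring outer whitespace. Below is a minimal sketch of that idea. The helper name `same_up_to_outer_whitespace` and the two constants are hypothetical and not part of gluonnlp or sacremoses; the strings are copied from the two sessions above.

```python
# Outputs quoted from the two sessions above for the same input string.
OUTPUT_0_0_43 = ' hello world!!."  '
OUTPUT_0_0_44 = 'hello world!!."'


def same_up_to_outer_whitespace(a: str, b: str) -> bool:
    """Hypothetical helper: True if a and b differ only in leading/trailing whitespace."""
    return a.strip() == b.strip()


# The raw outputs differ between the two releases...
print(OUTPUT_0_0_43 == OUTPUT_0_0_44)
# ...but they agree once outer whitespace is ignored.
print(same_up_to_outer_whitespace(OUTPUT_0_0_43, OUTPUT_0_0_44))
```

This only papers over the difference in tests; it assumes that whitespace stripping is the only behavioral change, which the single example above suggests but does not prove.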
