[lat] Broken selector #514

Open
kylebgorman opened this issue Jan 31, 2024 · 1 comment
Labels
bug Something isn't working language support Language-specific issues

Comments

@kylebgorman
Collaborator

As of at least #509, the custom selector for Latin has been broken.

Latin has a custom selector because the headwords lack macrons. The Romans, of course, didn't use macrons (nor did they consistently indicate vowel length), but just about every modern-era pedagogical resource does, so this was frankly a bizarre decision by the editors. It requires us to find the macronized forms somewhere else on the page (namely in the etymology subsection) and then merge them with the pronunciations. To debug, the obvious thing to do is to mock up one stream or the other (the etymological macronized headwords or the pronunciations) and see which one isn't matching.
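For concreteness, here is a rough sketch of that debugging approach. It is not the actual selector code: `strip_macrons`, `merge_streams`, and `diagnose` are hypothetical helpers, the two lists stand in for whatever the real extraction yields, and the pronunciation strings are placeholders rather than scraped data. It assumes the match criterion is "macronized form minus macrons equals the headword".

```python
import unicodedata

COMBINING_MACRON = "\u0304"


def strip_macrons(word: str) -> str:
    """Removes macrons so a macronized form can be compared to a headword."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if ch != COMBINING_MACRON)
    return unicodedata.normalize("NFC", stripped)


def merge_streams(headword, macronized_forms, pronunciations):
    """Pairs every macronized form that matches the headword with every pronunciation."""
    matching = [f for f in macronized_forms if strip_macrons(f) == headword]
    return [(form, pron) for form in matching for pron in pronunciations]


def diagnose(headword, macronized_forms, pronunciations):
    """Reports which stream (if any) prevents a merge for this headword."""
    if not macronized_forms:
        return "no macronized forms extracted"
    if not pronunciations:
        return "no pronunciations extracted"
    if not any(strip_macrons(f) == headword for f in macronized_forms):
        return "macronized forms do not match headword"
    return "ok"


# Mock one stream at a time (hypothetical data) and feed the other from the
# real extraction to see which side is failing.
mock_macronized = ["mālum"]
mock_pronunciations = ["PRON_PLACEHOLDER"]

print(diagnose("malum", mock_macronized, mock_pronunciations))
print(merge_streams("malum", mock_macronized, mock_pronunciations))
```

If the merge works with a mocked macronized stream but not with the real one, the etymology extraction is the likely culprit; if it fails the other way around, look at the pronunciations.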

It should be possible to push through with a new version of #509 and just ignore Latin, which is probably a good idea given that it's been a while since a big scrape has been merged.

@kylebgorman kylebgorman added bug Something isn't working language support Language-specific issues labels Jan 31, 2024
kylebgorman added a commit to kylebgorman/wikipron that referenced this issue Feb 21, 2024
kylebgorman added a commit that referenced this issue Feb 21, 2024
* [mlt] Updates Maltese phonelist.

Due either to bugs or changes in the upstream data, I noticed there was
a very high rate of filtration on Maltese. It seems that [u] was not
included, nor was one of the affricates.

There are still some filtration for "archaic" pronunciations of
[ɣ] for <għ>, which is WAI.

* Changelog

* Adds Python 3.12 support

* project classifers
* tests on CircleCI

* Revert "Adds Python 3.12 support"

This reverts commit e72bc3d.

* Pauses Latin testing in lieu of #514.

* Fixes typo in test_split

* More explicit comments.

* changelog

* Reruns black

* black
kylebgorman added a commit that referenced this issue Feb 21, 2024
* [mlt] Updates Maltese phonelist.

Due either to bugs or changes in the upstream data, I noticed there was
a very high rate of filtration on Maltese. It seems that [u] was not
included, nor was one of the affricates.

There are still some filtration for "archaic" pronunciations of
[ɣ] for <għ>, which is WAI.

* Changelog

* Adds Python 3.12 support

* project classifers
* tests on CircleCI

* Revert "Adds Python 3.12 support"

This reverts commit e72bc3d.

* Pauses Latin testing in lieu of #514.

* Fixes typo in test_split

* More explicit comments.

* changelog

* Reruns black

* black

* Adds Python 3.12 support

* Changelog

* updates flake8
@kylebgorman
Collaborator Author

Testing of Latin is (hackily) paused in #520.
