investigate ambiguous parsing of the -burg suffix in NL/DE #152

Open
missinglink opened this issue Oct 11, 2021 · 1 comment
Labels
bug Something isn't working

Comments

missinglink commented Oct 11, 2021

Today we are merging pelias/api#1565 which brings a bunch of pelias/parser changes into pelias/api.

As part of this process we did some wider acceptance test checks and diff'd them against the current baseline.

One change that surfaced was the query below: at the partial completion "grolmanstrasse 51, charlottenburg", the parser classifies the Berlin borough Charlottenburg as a street.

 grolmanstrasse 51, charlottenburg, berlin
-FFFFFFFFFFFFFFFF0000000000000000000000000
+FFFFFFFFFFFFFFFF0000000000000000FFFF0FFF0
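
To see which partial completions changed, the two masks can be compared position by position. This is only a sketch: it assumes each mask character corresponds to one prefix of the query (the first character to the 1-character prefix, and so on), which fits the "partial completion" mentioned above but is not confirmed by the test tooling itself. `changed_prefixes` is a hypothetical helper, not part of pelias.

```python
query = "grolmanstrasse 51, charlottenburg, berlin"

# Masks from the diff above, written with repetition so the run lengths are visible.
baseline = "F" * 16 + "0" * 25
current = "F" * 16 + "0" * 16 + "FFFF0FFF0"


def changed_prefixes(query, before, after):
    """Return (prefix, before_char, after_char) wherever the two masks differ.

    Assumption: mask position i describes the prefix query[:i + 1].
    """
    if not (len(query) == len(before) == len(after)):
        raise ValueError("each mask needs exactly one character per query character")
    return [
        (query[: i + 1], b, a)
        for i, (b, a) in enumerate(zip(before, after))
        if b != a
    ]


regressions = changed_prefixes(query, baseline, current)
for prefix, b, a in regressions:
    print(f"{b} -> {a}  {prefix!r}")
```

Under that assumption, the first changed prefix is exactly "grolmanstrasse 51, charlottenburg".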

This was likely introduced in the recent NL work #126.

I would like to see if we can find a better way of handling the ambiguities between German and Dutch for the -burg suffix.

note: the correct solution is also generated, but both solutions receive the same score. Scoring is based on matched token length, so a robust fix needs to work equally well when len(street) < len(borough), when len(street) > len(borough), and when len(street) == len(borough).

================================================================
SOLUTIONS (2ms)
----------------------------------------------------------------
(0.53) ➜ [ { housenumber: '51' }, { street: 'Charlottenburg' } ]

(0.53) ➜ [ { street: 'Grolmanstrasse' }, { housenumber: '51' } ]
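
The tie can be reproduced with a simplified model of length-based scoring. This is not the pelias/parser scorer, only a sketch that assumes the score is the share of token characters covered by the classified spans; `tokens` and `coverage_score` are hypothetical names.

```python
import re


def tokens(text):
    """Split on anything that is not a word character (a simplification)."""
    return re.findall(r"\w+", text)


def coverage_score(text, solution):
    """Fraction of the token characters in `text` covered by the classified spans."""
    total = sum(len(t) for t in tokens(text))
    covered = sum(len(value) for value in solution.values())
    return covered / total


text = "grolmanstrasse 51, charlottenburg"
borough_as_street = {"housenumber": "51", "street": "Charlottenburg"}
actual_street = {"street": "Grolmanstrasse", "housenumber": "51"}

print(f"{coverage_score(text, borough_as_street):.2f}")
print(f"{coverage_score(text, actual_street):.2f}")
```

Both calls give 16 / 30, which rounds to 0.53. "Grolmanstrasse" and "Charlottenburg" happen to have 14 characters each, so this query is the len(street) == len(borough) case; under a purely length-based score, which solution wins in the other two cases depends only on which name is longer.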
missinglink added the bug label Oct 11, 2021
@missinglink
Member Author

related: #131 (comment)
