Add special phrase searching like nominatim #557

Open
JinIgarashi opened this issue Apr 2, 2021 · 9 comments
Comments

@JinIgarashi
Contributor

I just found this article on the Nominatim website:

If you want to be able to search for places by their type through special key phrases you also need to enable these key phrases like this:

./utils/specialphrases.php --wiki-import > specialphrases.sql
psql -d nominatim -f specialphrases.sql

Note that this command downloads the phrases from the wiki link above. You need internet access for the step.

I think it would be useful if Photon could search POIs by the keys from Nominatim's Special Phrases, for instance to search only for banks in a certain city.

I am not sure whether this is technically possible. What do you think?

@lonvia
Collaborator

lonvia commented Apr 6, 2021

It's technically possible but we are a long way off from a practical implementation.

@JinIgarashi
Contributor Author

> It's technically possible but we are a long way off from a practical implementation.

Thanks for the comments. This is just my humble suggestion. It would be great if Photon could take particular phrases into account to prioritize search results.

@kenseii

kenseii commented May 19, 2021

@lonvia we are thinking about implementing something similar to what is being discussed in this issue.

Basically, when searching or ordering the results, we would like some osm_values, e.g. ['aerodrome', 'station', 'stop'], to be boosted (scored higher) so that they show up at the top of the results ahead of e.g. ['hotel', 'sauna'], even if the OSM importance of the latter is higher.
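A minimal sketch of such a static boost, applied as a re-ranking step after the results come back. The boost factors and the `importance` field are assumptions made for illustration, and this is not how Photon scores results internally.

```python
# Hypothetical boost factors per osm_value; the numbers are illustrative only.
STATIC_BOOST = {"aerodrome": 3.0, "station": 3.0, "stop": 3.0}


def rerank(results):
    """Sort results by importance multiplied by a query-independent boost."""
    def score(result):
        return result["importance"] * STATIC_BOOST.get(result["osm_value"], 1.0)
    return sorted(results, key=score, reverse=True)


results = [
    {"name": "Airport Hotel", "osm_value": "hotel", "importance": 0.5},
    {"name": "Narita International Airport", "osm_value": "aerodrome", "importance": 0.3},
]
ranked = rerank(results)
# The aerodrome now outranks the hotel despite its lower importance.
```

Because the boost table never looks at the query text, the same types are favoured for every search.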

Any idea or recommendation on how to implement this or where to look?

Is it better to perform this when querying, scoring or returning the data?

Is it something worth submitting a PR to the photon repo?

Thank you

@lonvia
Collaborator

lonvia commented May 20, 2021

@kenseii It sounds like you are looking for a static boost by OSM type instead, i.e. the boost would be independent of the actual query. This issue is more about searching by keywords: 'tokyo station' would boost train stations, 'tokyo hotel' would boost hotels.

@kenseii

kenseii commented May 24, 2021

@lonvia To boost the search results dynamically, depending on the osm_value type implied by the query, I am thinking of using synonyms.

E.g. if I search for "Tokyo station", I would like to boost the results that have "station" as the osm_value.

We think synonyms are important because the search query might contain a useful keyword that is not itself an osm_value.

E.g. searching for "Narita airport":
Narita airport -> airport -> aerodrome, so we boost the results with osm_value aerodrome

E.g. searching for "Narita hotel":
Narita hotel -> hotel -> hotel, so we boost the results with osm_value hotel

We think that boosting based on osm_values derived from the query would bias the ranking dynamically.
Do you think this is a good approach?
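The mapping above (query word -> synonym -> osm_value) could be sketched roughly as follows. The synonym table, the boost factor and the `importance` field are hypothetical; a real setup would more likely rely on a synonym analyzer in the search index than on a Python dictionary.

```python
# Hypothetical synonym table: query keyword -> osm_value to boost.
SYNONYMS = {
    "airport": "aerodrome",
    "aerodrome": "aerodrome",
    "hotel": "hotel",
    "station": "station",
}


def boosted_values(query):
    """Return the set of osm_values implied by keywords in the query."""
    return {SYNONYMS[tok] for tok in query.lower().split() if tok in SYNONYMS}


def rerank(query, results, boost=2.0):
    """Boost results whose osm_value is implied by the query."""
    wanted = boosted_values(query)

    def score(result):
        factor = boost if result["osm_value"] in wanted else 1.0
        return result["importance"] * factor

    return sorted(results, key=score, reverse=True)


results = [
    {"name": "Narita Hotel", "osm_value": "hotel", "importance": 0.5},
    {"name": "Narita International Airport", "osm_value": "aerodrome", "importance": 0.3},
]
```

With these sample values, `rerank("Narita airport", results)` puts the aerodrome first, while `rerank("Narita hotel", results)` keeps the hotel on top.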

@lonvia
Collaborator

lonvia commented May 24, 2021

If you go down this road, you will probably have to remove the word you used as a keyword from the query before matching against the document. The keyword and its synonyms might not show up in the name at all, in which case you no longer get a full match. Alternatively, you have to add the OSM key/value as a keyword to the collector, but that has its own disadvantages.
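A rough sketch of the first option: split the query into a name part, which is matched against the document, and a set of osm_values, which is used only for boosting or filtering. The function name and the synonym table are hypothetical.

```python
def split_query(query, synonyms):
    """Separate type keywords from the rest of the query.

    Returns (name_query, osm_values): name_query keeps the original tokens
    that are not keywords; osm_values holds the types the keywords map to.
    """
    name_tokens = []
    osm_values = set()
    for token in query.split():
        value = synonyms.get(token.lower())
        if value is None:
            name_tokens.append(token)
        else:
            osm_values.add(value)
    return " ".join(name_tokens), osm_values


name_query, osm_values = split_query("Narita airport", {"airport": "aerodrome"})
# name_query is "Narita", osm_values is {"aerodrome"}
```

A query that consists only of a keyword, such as "airport", leaves an empty name part, so a caller would need to decide how to handle that case.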

I'm currently in the process of experimenting with this stuff for a project, including experimenting with synonyms. We will see what comes out of this.

@kenseii

kenseii commented May 25, 2021

Actually, we are planning to add the osm_key and osm_value to the collector, inside a text field that is analyzed by a synonym analyzer.

By doing that we can boost based on whether a query matches that field.
What are the disadvantages of adding the osm key/value to the collector?

> I'm currently in the process of experimenting with this stuff for a project, including experimenting with synonyms. We will see what comes out of this.

Glad to hear that, is this going to be open source?

@lonvia
Collaborator

lonvia commented May 25, 2021

> What are the disadvantages of adding the osm key/value to the collector?

They are English words that will interfere with searching. But it can work if you add a symbolic replacement instead.
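One way to read "symbolic replacement" is an opaque token that no user would type and that cannot collide with a word in a place name. The sketch below assumes a made-up token format (`xosm<key>x<value>`) and a toy all-tokens-must-match rule; neither is taken from Photon.

```python
import re


def symbolic_token(osm_key, osm_value):
    """Build an opaque token for a key/value pair (hypothetical format)."""
    def clean(text):
        return re.sub(r"[^a-z0-9]", "", text.lower())
    return "xosm" + clean(osm_key) + "x" + clean(osm_value)


# Hypothetical mapping from query keywords to symbolic tokens.
KEYWORD_TOKENS = {
    "airport": symbolic_token("aeroway", "aerodrome"),
    "hotel": symbolic_token("tourism", "hotel"),
}


def collector_text(name, osm_key, osm_value):
    """Text indexed for a document: its name plus the symbolic type token."""
    return (name + " " + symbolic_token(osm_key, osm_value)).lower()


def rewrite_query(query):
    """Replace known keywords in the query with their symbolic tokens."""
    return " ".join(KEYWORD_TOKENS.get(tok, tok) for tok in query.lower().split())


def matches(query, doc_text):
    """Toy matcher: every rewritten query token must occur in the document."""
    doc_tokens = set(doc_text.split())
    return all(tok in doc_tokens for tok in rewrite_query(query).split())


airport = collector_text("Narita International Airport", "aeroway", "aerodrome")
hotel = collector_text("Airport Hotel Narita", "tourism", "hotel")
```

Here `matches("Narita airport", airport)` is true while `matches("Narita airport", hotel)` is false: the keyword now selects by type and no longer matches the English word "airport" in the hotel's name, which is the trade-off of rewriting the query this way.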

> Glad to hear that, is this going to be open source?

Yes.

@kenseii

kenseii commented May 28, 2021

@lonvia Thank you very much for PR #581.
I saw that it doesn't support multi-word synonyms or spaces, and wanted to ask if there is a reason for that.

I was wondering if a graph token filter would help with synonyms that contain spaces.
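To illustrate why multi-word synonyms need special handling, here is a small sketch that recognises phrases spanning several tokens with a greedy longest-match lookup. This is not a graph token filter and not what PR #581 does; the phrase table is hypothetical.

```python
# Hypothetical phrase table: tuple of tokens -> osm_value.
PHRASES = {
    ("train", "station"): "station",
    ("bus", "stop"): "stop",
    ("airport",): "aerodrome",
}
MAX_LEN = max(len(phrase) for phrase in PHRASES)


def extract_phrases(query):
    """Return (remaining_tokens, osm_values), preferring the longest phrase."""
    tokens = query.lower().split()
    rest, found = [], []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase starting at position i first.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in PHRASES:
                found.append(PHRASES[key])
                i += n
                break
        else:
            # No phrase starts here; keep the token as part of the name.
            rest.append(tokens[i])
            i += 1
    return rest, found
```

For example, `extract_phrases("Tokyo train station")` yields `(["tokyo"], ["station"])`. A per-token synonym lookup would look up "train" and "station" separately and could not treat "train station" as a unit.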


3 participants