OSMOSE OpenData merge quests #5481

Open
XioNoX opened this issue Feb 10, 2024 · 13 comments
Labels
feedback required more info is needed, issue will be likely closed if it is not provided

Comments

@XioNoX

XioNoX commented Feb 10, 2024

Hi,

In addition to running consistency checks on the OSM database, OSMOSE handles 3rd party OpenData sources for integration.

For example, this was implemented in osm-fr/osmose-backend#2143.
Here is a map of all the bicycle parkings that are present in my city's OpenData but not in OSM, as well as the ones that are present but could be improved (e.g. missing tags, like capacity):
https://osmose.openstreetmap.fr/en/map/#source=448196&zoom=14&lat=48.39667&lon=-4.47183&item=xxxx&level=3
The same data as a table: https://osmose.openstreetmap.fr/en/issues/open?source=448196
Note that this URL filters on source=448196; filtering only on item=8150 would show the data for all sources, and thus all cities.

Additionally, APIs are available (and support bbox filtering). For example:
https://osmose.openstreetmap.fr/api/0.3/issues?item=8150&bbox=-4.493634,48.382699,-4.483377,48.390152 returns all the bicycle-parking-related "issues" in an area.
A given "issue": https://osmose.openstreetmap.fr/api/0.3/issue/86fe9024-0549-7f06-63e6-ecbd824896cb
which matches: https://osmose.openstreetmap.fr/en/issue/86fe9024-0549-7f06-63e6-ecbd824896cb
Note that some translations are already in place.
Full API doc: http://osmose.openstreetmap.fr/api/docs
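
As a rough illustration, the bbox-filtered query above could be assembled like this in Python. Only the base URL and the item and bbox parameters come from the example URL; the function name is hypothetical, and the bbox order (min lon, min lat, max lon, max lat) is an assumption read off that example, so check the API doc before relying on it.

```python
from urllib.parse import urlencode

# Base URL copied from the example above.
OSMOSE_ISSUES_URL = "https://osmose.openstreetmap.fr/api/0.3/issues"


def build_issues_url(item, bbox):
    """Build an issues query URL for one item, restricted to a bounding box.

    bbox is assumed to be (min_lon, min_lat, max_lon, max_lat), which is
    the order the example URL appears to use.
    """
    params = {
        "item": item,
        "bbox": ",".join(str(v) for v in bbox),
    }
    # Keep commas unescaped so the result matches the example URL.
    return OSMOSE_ISSUES_URL + "?" + urlencode(params, safe=",")


url = build_issues_url(8150, (-4.493634, 48.382699, -4.483377, 48.390152))
print(url)
```

The sketch only builds the URL; fetching and parsing the response is left out because the response format is not described here.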

Following up on this use case, it would be extremely convenient to offer StreetComplete users a yes/no quest such as "Is there a bicycle parking at this location?". A "yes" would add the node and mark the issue as "resolved" on OSMOSE; a "no" would mark it as "false positive".
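
A minimal sketch of that yes/no flow, in Python for brevity. The class and function names are hypothetical and are not StreetComplete or OSMOSE APIs; the sketch only encodes the two outcomes described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuestOutcome:
    """What should happen after the user answers (hypothetical structure)."""
    create_node: bool     # whether a new node is added to OSM
    osmose_status: str    # status to report back for the OSMOSE issue


def resolve_answer(answer_is_yes: bool) -> QuestOutcome:
    """Map a yes/no answer to the two outcomes proposed above."""
    if answer_is_yes:
        # "yes": add the node and mark the issue as resolved
        return QuestOutcome(create_node=True, osmose_status="resolved")
    # "no": nothing is added and the issue is flagged as a false positive
    return QuestOutcome(create_node=False, osmose_status="false positive")


print(resolve_answer(True))
print(resolve_answer(False))
```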

There are multiple scenarios for which extra tags to add to the node:

  • Either we don't want to trust the OpenData further, and StreetComplete only adds amenity=bicycle_parking, leaving it to the follow-up quests to add the capacity or whether it is covered
  • We fully trust the OpenData, and automatically add the tags suggested by OSMOSE
  • We display more info to the user (e.g. "Is there a 10-spot bicycle parking here?")
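
The three scenarios could be sketched roughly as follows. The enum, the function, and the example tag values are all hypothetical and only illustrate how the question text and the applied tags would differ per scenario.

```python
from enum import Enum


class TagTrust(Enum):
    """Hypothetical names for the three scenarios listed above."""
    MINIMAL = "minimal"      # only add the main tag
    FULL = "full"            # add every tag suggested by OSMOSE
    ASK_USER = "ask_user"    # show the suggested details in the question


GENERIC_QUESTION = "Is there a bicycle parking at this location?"


def plan_quest(trust, suggested_tags):
    """Return (question, tags_to_add_on_yes) for one OSMOSE suggestion."""
    if trust is TagTrust.MINIMAL:
        return GENERIC_QUESTION, {"amenity": "bicycle_parking"}
    if trust is TagTrust.FULL:
        return GENERIC_QUESTION, dict(suggested_tags)
    # ASK_USER: mention the capacity in the question when it is known.
    capacity = suggested_tags.get("capacity")
    if capacity is None:
        return GENERIC_QUESTION, dict(suggested_tags)
    question = f"Is there a {capacity}-spot bicycle parking here?"
    return question, dict(suggested_tags)


# Example suggestion; the tag values are made up for illustration.
suggested = {"amenity": "bicycle_parking", "capacity": "10", "covered": "no"}
for trust in TagTrust:
    print(trust.name, plan_quest(trust, suggested))
```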

Of course, the 3rd option makes it more complex to support OSMOSE mergers generically while keeping them user-friendly.
This also raises the question of how to handle the cases where the node is already in OSM but is missing some tags (or has invalid ones). We can filter those with class=4: https://osmose.openstreetmap.fr/fr/issues/open?source=448196&class=4

  • The easiest first step is to ignore them
  • A middle ground might be to pre-populate the existing matching quests, for example pre-filling "12" in the "how many spots does this bike parking have" quest
  • More complex would be to explicitly ask related questions (if there is no existing quest)
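
For the pre-population idea, a minimal sketch of the comparison step might look like this. The function name and the example tags are hypothetical; it simply separates suggested tags that are missing on the existing element (pre-fill candidates) from those that disagree with what is already mapped.

```python
def split_suggestions(existing_tags, suggested_tags):
    """Compare suggested tags with the tags already on the OSM element.

    Returns (missing, conflicting):
      missing      suggested tags whose key is absent in OSM, usable as pre-filled answers
      conflicting  key -> (existing value, suggested value) where the two disagree
    """
    missing = {}
    conflicting = {}
    for key, value in suggested_tags.items():
        if key not in existing_tags:
            missing[key] = value
        elif existing_tags[key] != value:
            conflicting[key] = (existing_tags[key], value)
    return missing, conflicting


# Made-up example: the node exists but lacks a capacity tag.
existing = {"amenity": "bicycle_parking", "covered": "yes"}
suggested = {"amenity": "bicycle_parking", "capacity": "12", "covered": "no"}
missing, conflicting = split_suggestions(existing, suggested)
print(missing)
print(conflicting)
```

Conflicting values are kept separate on purpose: pre-filling a value that contradicts an existing tag seems riskier than filling a gap.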

I'm using bicycle parking as an example, as it seems easy to work with and is not limited to France (thanks Madrid!):
https://osmose.openstreetmap.fr/en/issues/open?item=8150

If StreetComplete were to handle more of those OSMOSE OpenData "merges", each new category (item) would need to be reviewed independently, even though many parts of the automation/UI could be re-used.
For a more extensive view, you can filter on the tag "merge" (https://osmose.openstreetmap.fr/fr/issues/open?level=1,2,3&source=&class=3&tags=merge&username=&bbox=&limit=500) or with &item=8xxx. Unfortunately, quality varies between the "mergers/integrations".

In France, postboxes could be another good candidate: https://osmose.openstreetmap.fr/en/issue/c50fb129-d484-17ce-49c7-b49d73b4051e

What do you think?

@tordans

tordans commented Feb 10, 2024

FYI, I think MapRoulette is a great candidate to become our shared source and tool for checking lists of OpenData sources. maproulette/maproulette3#1737 tracks the integration in mobile editors.

@matkoniecz
Member

matkoniecz commented Feb 10, 2024

It is likely that I will integrate something like that, but based on ATP data (I am awaiting the final decision on whether such a project will be funded).

I am not a fan of the idea of working with Osmose, because some false positives have gone unfixed there for a long time and some low-quality tagging advice is being pushed there.

@matkoniecz
Member

matkoniecz commented Feb 10, 2024

> OSMOSE handles 3rd party OpenData sources for integration.

What is their process for ensuring that only data under an OSM-compatible license ends up there? Where is it documented?

Can you link to the relevant documentation required by https://wiki.openstreetmap.org/wiki/Import/Guidelines ? (I am not entirely sure whether it is needed in this case, but probably yes, and I plan to do it for SC if I end up implementing this feature.)

@matkoniecz
Member

> MapRoulette

Note that MapRoulette is unsuitable for cases requiring on-the-ground verification. There is a huge risk that one of the mass clickers will join and mark all entries as verified without doing any verification whatsoever.

Though for bicycle parkings, some areas may have some that are verifiable from aerial imagery (high-quality aerial, no trees or other cover).

@matkoniecz
Member

> If StreetComplete were to handle more of those OSMOSE OpenData "merges", each new category (item) would need to be reviewed independently.

Is this data already being reviewed for quality? If yes, where is it happening and where is this process described?

If not, then how are sources with garbage data quality avoided?

@matkoniecz added the "feedback required" label (more info is needed, issue will be likely closed if it is not provided) on Feb 10, 2024
@tordans

tordans commented Feb 10, 2024

> Note that MapRoulette is unsuitable for cases requiring on-the-ground verification.

I disagree. MapRoulette is a technical system for working through a list of tasks. It is the responsibility of the person who creates those tasks to build and word them in a way that works for the intended use case. It is absolutely possible to use it in a mobile editor and a mobile context to work on a hyper-local dataset with ground verification.

@XioNoX
Author

XioNoX commented Feb 10, 2024

Thanks for your quick replies!
I don't know enough about MapRoulette to comment; however, after exploring OSMOSE, it seemed like a great match for the reasons listed previously. It also makes sense to me to keep this ticket focused on OSMOSE.

> but based on ATP data

What is that?

> I am not a fan of the idea of working with Osmose, because some false positives have gone unfixed there for a long time and some low-quality tagging advice is being pushed there.

I think we need to decouple OSMOSE the platform from the various QA rules and merge data. I'm absolutely not advocating for displaying all the OSMOSE "issues" in StreetComplete, only a few, after a thorough review.

> What is their process for ensuring that only data under an OSM-compatible license ends up there? Where is it documented?

My understanding is that it's done during the code review when ingesting a new data source; see for example the doc at https://github.com/osm-fr/osmose-backend/blob/main/doc/4-Merge.md#opendata-set-source
On one side by using the attribution field, and on the other by manually reviewing the source's license.
I'd assume that it's a solved issue, as OSMOSE is exclusively used for OSM.

> Is this data already being reviewed for quality? If yes, where is it happening and where is this process described?
> If not, then how are sources with garbage data quality avoided?

From my current (limited) experience: during the code review, and before the code is merged, a GeoJSON of the "issues" is generated and manually reviewed to make sure the output is correct and no issues are present in the code. So they're really added on a case-by-case basis.
I'm of the opinion that it would be important to re-check the data before adding it to StreetComplete on an "item/category" basis, to make sure the garbage levels are at a minimum.

@frodrigo

> FYI, I think MapRoulette is a great candidate to become our shared source and tool for checking lists of OpenData sources. maproulette/maproulette3#1737 tracks the integration in mobile editors.

Osmose is not just about the list of objects to be checked, but also the process of downloading/updating the OpenData set, mapping the properties and values to OSM tags, running the conflation every day, and showing the results.
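
To make the "map properties to tags, then conflate" steps concrete, here is a deliberately simplified Python sketch. It is not Osmose's actual implementation: the source field name (nb_places), the record and node layout, and the 30 m matching threshold are all assumptions chosen for illustration, and the matching is a plain nearest-neighbour distance check.

```python
import math


def map_to_osm_tags(record):
    """Map one source record's properties to OSM tags (hypothetical schema)."""
    tags = {"amenity": "bicycle_parking"}
    if record.get("nb_places") is not None:
        tags["capacity"] = str(record["nb_places"])
    return tags


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    radius = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def conflate(records, osm_nodes, max_dist_m=30.0):
    """Split source records into (missing_in_osm, matched_pairs)."""
    missing, matched = [], []
    for rec in records:
        def dist_to(node):
            return distance_m(rec["lat"], rec["lon"], node["lat"], node["lon"])

        # Nearest existing OSM node, or None when there are no nodes at all.
        nearest = min(osm_nodes, key=dist_to, default=None)
        if nearest is not None and dist_to(nearest) <= max_dist_m:
            matched.append((rec, nearest))
        else:
            missing.append(rec)
    return missing, matched


# Made-up data: the first record sits on an existing node, the second does not.
records = [
    {"lat": 48.3900, "lon": -4.4860, "nb_places": 10},
    {"lat": 48.4000, "lon": -4.4860, "nb_places": 6},
]
osm_nodes = [{"lat": 48.3900, "lon": -4.4860, "tags": {"amenity": "bicycle_parking"}}]
missing, matched = conflate(records, osm_nodes)
print(len(missing), "missing,", len(matched), "matched")
```

In this picture, the "missing" records correspond to nodes absent from OSM, and the matched pairs are where tag differences such as a missing capacity could be reported.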

Osmose also provides an export to MapRoulette. Challenges can be created from Osmose. The two tools are complementary.

> I am not a fan of the idea of working with Osmose, because some false positives have gone unfixed there for a long time and some low-quality tagging advice is being pushed there.

As I already told you, please report issues. Other Osmose contributors and I pay close attention to the tags that are suggested to contributors. Even if contributors are responsible for what they push to OSM, Osmose may introduce errors or bias, and we try to avoid this as much as possible.

> What is their process for ensuring that only data under an OSM-compatible license ends up there? Where is it documented?

Sure, each dataset used is compliant with the OSM licence.

> Is this data already being reviewed for quality? If yes, where is it happening and where is this process described?

Yes, the data configured in Osmose is always checked for quality.

> Can you link to the relevant documentation required by https://wiki.openstreetmap.org/wiki/Import/Guidelines ? (I am not entirely sure whether it is needed in this case, but probably yes, and I plan to do it for SC if I end up implementing this feature.)

That is a real question: some local communities say it is OK, others say the opposite. It also depends on whether the data was already imported or not, with the Osmose OpenData merge used just for gardening.

@matkoniecz
Member

matkoniecz commented Feb 10, 2024

> ATP
>
> What is that?

https://github.com/alltheplaces/alltheplaces

> Sure, each dataset used is compliant with the OSM licence.

> Yes, the data configured in Osmose is always checked for quality.

Do you know where this review is happening? Is the outcome/record publicly accessible somewhere?

If not, do you know where the resources being imported are listed?

I would be really happy to use such datasets, but I would not use them blindly and would at least verify that they are safe to use. I feel responsible for how the code I write or deploy will be used (disclaimer: if a user is malicious or careless then damage is possible, and I would not feel responsible at all for it, and would refer such a case to the DWG for blocking).

So I would want to be sure that, for example, there is no CC-BY-SA data without a waiver there, and so on.

> As I already told you, please report issues. Other Osmose contributors and I pay close attention to the tags that are suggested to contributors. Even if contributors are responsible for what they push to OSM, Osmose may introduce errors or bias, and we try to avoid this as much as possible.

I reported some (and many were fixed, thanks!). However, osm-fr/osmose-backend#381, osm-fr/osmose-backend#1094 and osm-fr/osmose-backend#1152 have been waiting for quite a long time now, and osm-fr/osmose-backend#1159, for example, got wontfixed.

I want to note that it is still one of the better track records for software as far as my bug reporting goes, so maybe I have overly high expectations.

But as it is now, I would definitely not treat Osmose advice as a good idea by default. I worry that the same can apply to the datasets being suggested.

Osmose has some outright wrong advice, and applying some of the changes it suggests is a monumental waste of human time. And if some reports are not expected to be fixed manually, then they should be in a separate category. At least I do it with https://matkoniecz.github.io/OSM-wikipedia-tag-validator-reports/Deutschland.html / https://matkoniecz.github.io/OSM-wikipedia-tag-validator-reports/Deutschland%20-%20obvious.html where humans are shown only cases where human review is worth doing, while a large part of Osmose looks like a request to manually do a bot edit.

> That is a real question: some local communities say it is OK, others say the opposite. It also depends on whether the data was already imported or not, with the Osmose OpenData merge used just for gardening.

If deployed with StreetComplete on a global scale, it almost certainly would need to be done, probably per dataset, or at least per group.

I hope that with ATP it can be done in general thanks to the shared methodology, but maybe the community will also expect a separate import process for each dataset there.

@frodrigo

There are more than 100 OpenData datasets configured in Osmose. But except for a few (maybe fewer than 5), they are all in France. In France, there is no OpenData dataset license incompatible with OSM (by law).

In Spain there is CC-BY-SA data, but with an explicit agreement for OSM (and I do not have others in mind).

The only global dataset is Mapillary imagery detection.

All OpenData sources are listed in the configuration/Python files (analysers/analyser_merge_*.py).
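
For anyone wanting to enumerate those files from a listing of the analysers directory, a small sketch along these lines could work. Only the analyser_merge_*.py pattern comes from the comment above; the function name and the file names in the example are made up.

```python
from fnmatch import fnmatchcase

# File-name pattern taken from the comment above.
MERGE_PATTERN = "analyser_merge_*.py"


def merge_analysers(filenames):
    """Return the sorted file names that match the merge-analyser pattern."""
    return sorted(name for name in filenames if fnmatchcase(name, MERGE_PATTERN))


# Hypothetical directory listing, for illustration only.
listing = [
    "analyser_merge_example_b.py",
    "analyser_other_example.py",
    "analyser_merge_example_a.py",
    "README.md",
]
print(merge_analysers(listing))
```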

@mnalis
Member

mnalis commented Feb 11, 2024

For those interested, the SCEE ("Expert Edition") fork of StreetComplete does have a quest for OSMOSE, which might be useful:

[Screenshot: small_Screenshot_20240211_230413_SCEE]

IIRC, it only displays the OSMOSE quests and allows the user to mark them as false positives or edit raw OSM tags manually (or, of course, use other SCEE functionality, like Add POI to add missing nodes).

@u6aab

u6aab commented Apr 18, 2024

If OpenData merge requests are used, it should be done in a way where mappers can set the location themselves (like with the POI overlay). I've seen plenty of Osmose bugs for missing fire hydrants in Vespucci, and they were never in the correct place.

@frodrigo

> If OpenData merge requests are used, it should be done in a way where mappers can set the location themselves (like with the POI overlay). I've seen plenty of Osmose bugs for missing fire hydrants in Vespucci, and they were never in the correct place.

Yes. In Osmose we include OpenData from good sources only. But the idea in Osmose is to require contributors to review the location and tags (by contrast, an initial import is more effective).

6 participants