Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Layers - Technical Planning #2506

Closed
westnordost opened this issue Jan 15, 2021 · 29 comments
Closed

Layers - Technical Planning #2506

westnordost opened this issue Jan 15, 2021 · 29 comments

Comments

@westnordost
Copy link
Member

For the Layers #2461 concept, there is some deep code restructuring that has to be made. Here is an attempt to summarize it.

Summary

  • All OSM data downloaded plus generated geometry must be persisted. The app should take care of dumping old unneeded OSM data automatically.
  • The locally persisted OSM data and geometry must be accessible (loaded into memory) efficiently by bounding box because not all local data should be displayed / processed at the same time.
  • any change made on the OSM data (tag changes, deletions, split ways, reverts of those) must be added to a single persisted queue of actions (~diff) to be performed
  • when the OSM data is loaded into memory for display (in layers), for quest creation etc., the queue of actions must be performed on it first so to work with a "working set" of data rather than only the original data downloaded from OSM
  • on upload, this queue of actions is applied to the original OSM data one by one and the affected elements updated from the server until that queue is empty

Milestones

  1. Change the download, to keep all the OSM data and manage its cleanup.
  2. Read-only version of a layer (vertically): data management, layer definition, UI
  3. Change the way changes (actions) are stored, treated and processed

M1

Download

On download of OSM data, element geometry for all downloaded elements must be created and persisted. Currently, this is only done for any elements that get a quest.

In detail:

  • rename QuestDownloader to OsmDownloader or similar. Generalize it. Put the notes download into own class
  • the OsmApiQuestDownloader should not download OSM data by itself (so, rename) but having it handed to it. Maybe by OsmDownloader or maybe there is some listener that fires whenever data in the OSM elements table changed and the OsmApiQuestDownloader is informed of that, updates the quests based on that. Maybe there is an oppurtunity to merge the logic in OsmApiQuestDownloader with OsmQuestGiver

Data Cleanup

A last update timestamp for each element must be added to be able to periodically dump old data. Based on that, data older than X days not referenced by an action must be dumped.

In detail:

  • add last update timestamp to NodesTable, WaysTable, RelationsTable
  • add cleanup function to the associated DAOs, call it f.e. on each application startup, or on each download. The "fresh" time should be the same as the time it takes for unsolved quests to be deleted. Same mechanism
  • note: only data should be deleted that is not referenced by other tables, i.e. the OsmQuest table or later the Actions table or whatever one may call it
  • a user option to control how long "quest data" should be stored would be possible, but optional

Data access

The persisted OSM data must be accessible efficiently. Thanks to the ElementGeometryTable, this is basically already possible. The appropriate DAO(s) MergedElementDao(?) just needs a new method and/or database View with index to do that.

Element update

Whenever an element is updated (after upload), quests are created/removed for that element. Now that all downloaded data is available offline, isApplicableTo and getApplicableElements could maybe be merged somehow.

In detail:

  • not really sure yet. Should the quest type trigger fetching MapDataWithGeometry from database, or should it be handed to it? What data (in what radius) should it contain?

M2

TODO TBD

M3

TODO TBD

@westnordost
Copy link
Member Author

westnordost commented Jan 28, 2021

So this is how the data flow currently looks now in my development branch. The graph is not complete of course:

data_flow

Legend

  • *Downloader - classes that download something. Usually call a *Controller to do something with the data
  • *Controller - manages access to data. Usually controls one or more *Daos and constitutes the access to that data to the outside. Usually allows listening in to changes on that data
  • *Source - a read-only way to access certain data, usually an interface implemented by a *Controller
  • *Updater - a class that takes one or more data sources and generates other data out of that
  • *Dao - classes that persist data in a table ("Data Access Object"). Some classes allow listening in to changes but maybe I should move that part of the functionality always to *Controllers for consistency (later)
  • *Uploader - uploads data. Usually call a *Controller if the upload should lead to something in the local data to be updated

Explanation

The "visible quests" (i.e. those that are displayed on the map) are now just an end product and not (necessarily) persisted themselves in the DB). Rather, the OSM elements and OSM notes are "first class citizens", the quests are just generated from this.

For notes, there is not even a OsmNoteQuestsTable anymore, the note quests are generated on-demand from the notes, from the information which note quests are hidden, which note quests have been answered (CommentNote table) and other things.

Since it is quite expensive to generate quests from raw OSM data, there is still a OsmQuestsTable. It is updated each time anything changes in the OSM data.

@westnordost
Copy link
Member Author

westnordost commented Jan 28, 2021

Open Ends

  • Currently, upading OsmQuestSplitWayDao, DeleteOsmElementDao, UndoOsmQuestDao etc are updated through QuestController (not pictured) which makes sure that the quests are also updated (hidden/removed/...). It is okay like this now, but actually, changes made on the OSM elements should be managed by OsmElementController instead.

@westnordost
Copy link
Member Author

I seem unable to find a "compromise-point" at which to stop refactoring, also in respect to that I'll have to go the whole way anyway very soon (in M3). This is the intermediate state.

Colors don't really mean anything, they are just for myself: Blue means that I already added unit tests for that, yellow means that unit tests are missing, red means that an implementation is missing. Green means that it is an interface, thus no tests are necessary.

overview_upload-Page-2

So what is still open is that in the current setup, when a quest is solved by f.e. splitting a way, leaving a note or answering it normally, the quests (OsmQuestController) are not updated.
The optimal case (that should be reached for M3) is that the changes made through answering the quest should be directly applied to the local elements in some kind of working set and it is that working set the OsmQuestController is listening to for updates in quests that should be displayed.

@westnordost
Copy link
Member Author

westnordost commented Feb 20, 2021

Still work in progress...

overview_upload

Advancing only slowly. Reworked how the edits are stored and uploaded. Next step is to put a class in between MapDataSource and OsmQuestController that aggregates data from MapDataController (=data from OSM API) and ElementEditsController (=local changes to this data that haven't been uploaded yet).

And what is also missing is to do the same for notes.

@westnordost
Copy link
Member Author

overview_upload

@westnordost
Copy link
Member Author

So, refactored the notes-part now. This by the way makes it technically possible to create a note, then comment several times on it. All offline. The edited note (=note with added comment) could also be shown in the UI as if the comment has already been uploaded. This functionality is of course not used yet, but it becomes possible.

Now, I need to write a lot of tests, then test it in the app.

As an optional extension / next refactor, there could be another class through which the edit history could be shown and all the things undone in any order. All edits that have not been uploaded yet will then be undoable, edits that have already been uploaded require special implementation to be revertable (I.e. normal quest answers / tag changes are revertable, the rest not at the moment. There is a ticket in the issue tracker to enable reverting deletions.)

@westnordost
Copy link
Member Author

Status update:

  • added all them tests, found a few bugs along the way
  • started the app first time in 2 months (from that dev branch), currently improving the logging
  • next step is to do some performance testing - there are a few obvious points at which things could be speed up. The architecture now is slower then in the architecture currently used in StreetComplete mainly because now ALL OSM elements are persisted
  • the step after that would be to test normal usage, then all those edge cases

@westnordost
Copy link
Member Author

westnordost commented Mar 7, 2021

Currently a download looks like this. 11.3 seconds for 34152 elements and 7 notes.
Some of the tasks below run in parallel, denoted by the starting letter. (different leter = runs parallel, same letter = runs serialized)

A NotesDownload: Downloaded 7 notes in 0.1s
A NoteController: Persisted 6 and deleted 0 notes in 0.0s
B MapTilesDownload: Downloaded 36 tiles (509kB downloaded, 976kB already cached) in 1.0s
A OsmAvatarsDownload: Downloaded 5 avatar images in 1.0s
C MapDataDownload: Downloaded 29667 nodes, 4372 ways and 150 relations in 4.8s
C MapDataController: Persisted 34152 and deleted 0 elements and geometries in 3.9s
C OsmQuestController: Created 2246 quests for bbox in 2.0s
C OsmQuestController: Persisted 2115 new and removed 0 already resolved quests in 0.1s
Finished download in 11.3s

@westnordost
Copy link
Member Author

There are two things that may be improved in terms of download speed:

  • I think nothing speaks against already analyzing the OSM data for quests while these are persisted in the db. One is an I/O operation, the other ist CPU/RAM. That could potentially reduce the time to create quests to zero as it seems it always takes longer to persist.
  • I found out earlier that Java Date parsing takes up about 50% of all the parsing time for a map data response. Maybe this could be sped up by not parsing the date at all or parsing it in some faster fashion. That could cut the download time in (up to) half.

So in total, it looks like the download time could be reduced by about 50% (maximum).

@westnordost
Copy link
Member Author

westnordost commented Mar 8, 2021

Ok, implemented option 1.
For some random test location (my location), the download completes after 8.1 seconds with current master and 8.8 seconds in the dev branch. I think that's okay. Though, times on different times differed a lot: If the download hits some Java memory limits, it gets much slower (one time, it was 70s or so for the same location). Since for the version on the dev branch, a lot more objects are persisted, I think the potential is higher to hit some limits there.

Option 2 might cut this time further into half but would require to

  1. change the osmapi-library to do no Date parsing but instead let the date parsing be done by the MapDataFactory
  2. find a way to do the date parsing more efficiently. We also not only need to parse dates, but we need to be able to compare them, because of the "check again"-quests. The date classes from Java 8 should be more performant, however, they are not available on Andorid till API level 26 unless API desugaring is turned on. A better candidate, however, in view of iOS version #1892, might be the kotlinx-datetime library which works cross-platform. That library is in the pre-release phase though, so, not clear how stable it is and how much of the API may still change.
  3. replace all the occurances where Dates are used. This includes the OSM elements (nodes, ways, relations), so it makes sense to combine that with exchanging data structures from the osmapi library with pure kotlin data structures, see iOS version #1892 (comment)

All in all, to do it properly, it's quite a stack of things to refactor. Certainly, some things can be done before and some later, but the full potential is only unravelled when all these are done. So, I think I'd postpone this to some unknown time in the future, unless someone would like to help me here.

@westnordost
Copy link
Member Author

westnordost commented Mar 8, 2021

Just to summarize the advantages of doing this:

  1. replace Date-parsing with kotlinx-datetime: possibly almost 100% faster download times, one step toward making an iOS port feasible
  2. replace Kryo with kotlinx.serialization JSON: better forward compatibility of persisted data, one step toward making an iOS port feasible
  3. replace osmapi data classes (OsmNode, OsmWay, OsmRelation, OsmLatLon, Note, BoundingBox etc) with data classes defined in project: Prerequisite for 1 and 2, slicker usage, one step toward making an iOS port feasible

I am just going to ping the usual suspects here, also those that are interested in an iOS version. Maybe if we did this together, we could refactor this quite quickly: @FloEdelmann @arrival-spring @wtimme @matkoniecz @ENT8R . Let me know, any of you, also if I didn't ping you here, if you are interested in working on this together.

The time to do this is convenient because the whole database scheme changed for this dev version already, so users' data will need to be dropped anyway when upgrading to this version. If the above is done, this would probably happen again, so it would be nice to limit that possible data loss to only one time.

@FloEdelmann
Copy link
Member

@westnordost I'm willing to help but I don't have much time this week. Next week should be better though :)

@westnordost
Copy link
Member Author

That's fantastic! Sure, next week is fine because I stumbled upon some serious performance problem in the dev branch plus an architectural issue I have to take care of (first) anyway.

@westnordost
Copy link
Member Author

I got lost trying to make the whole of StreetComplete based on suspending coroutines. I went back from that a bit, it was too much change in one go.

What I did was to replace the download and the upload with coroutine/suspending code, to get rid of the AtomicBoolean as a cancel flag, cleaned up how coroutines are currently used in the codebase etc.

I went back on already creating the quests while persisting because this can theoretically cause inconsistencies. Need to think about if I should do this and what measures I take to prevent this.

But anyway, I implemented #2521 and made all bulk insertions into the database about 2x faster (nodes, ways, way members, relations, relation members, geometries, osm quests etc etc). Inspired by this medium post
https://medium.com/@JasonWyatt/squeezing-performance-from-sqlite-insertions-971aff98eef2
Bundling inserts in transactions, it's already about 10x faster than without. But I did use this already. But using prepared insert-statements in a transaction, it is yet almost 2x faster.

A NotesDownload: Downloaded 8 notes in 0.2s
A NoteController: Persisted 7 and deleted 0 notes in 0.0s
B MapTilesDownload: Downloaded 36 tiles (510kB downloaded, 978kB already cached) in 0.8s
A OsmAvatarsDownload: Downloaded 5 avatar images in 1.2s
C MapDataDownload: Downloaded 29710 nodes, 4378 ways and 150 relations in 2.9s
C MapDataController: Persisted 34201 and deleted 0 elements and geometries in 2.4s
C OsmQuestController: Created 2208 quests for bbox in 1.8s
C OsmQuestController: Persisted 2079 new and removed 0 already resolved quests in 0.1s
Finished download in 7.3s

I am in no hurry to get this refactor-version out, because it is a big architectural break which effects the users in a negative way (all non-uploaded quests will be deleted once), so I am still collecting more functionality/refactor work until this gets released.

In particular, the refactor mentioned in the last post(s) would be nice, but it is yet another thing that is quite big, so I won't start this for now without help.

What I will start next would be a rework of the undo functionality becuase that is a loose end anyway (it crashes when one tries to undo something).

@smichel17
Copy link
Member

which effects the users in a negative way (all non-uploaded quests will be deleted once),

Can we release a version before it which informs everyone of this? With a scheduled release date for the refactoring, which the app would warn before.

Or, maybe the first X relases with the new architecture can include the old uploader and do a one-time task of uploading quests (require this in order to proceed to use the app)

@westnordost
Copy link
Member Author

Can we release a version before it which informs everyone of this

Yes, I can put a warning in the changelog for v31.1

@westnordost
Copy link
Member Author

Or, maybe the first X relases with the new architecture can include the old uploader and do a one-time task of uploading quests (require this in order to proceed to use the app)

Won't work, because the issue is not the uploader, but the database. The database pretty much changed completely. I tried to write migrations, but it was so much code that I dropped the effort.

@smichel17
Copy link
Member

smichel17 commented Mar 22, 2021

I was thinking something like this:

  • Add a new activity, v32MigrationActivity that expects the old DB
  • When the app starts, check which version of the db exists.
    • If it's the old DB, and there are pending uploads, launch the migration activity.
    • If it's the old DB but there are no pending uploads, drop the old tables and create the new ones.
    • If it's the new db, do nothing.
  • The migration activity has some text explaining "There was a big upgrade, you'll need to upload any pending quests now or they will be lost", and two buttons:
    1. Upload pending answers.
    2. Delete pending answers
  • After clicking one, there will be no pending answers. Return to "When the app starts" and upgrade the tables.

Unlike writing migrations, it's not so much new code to write, so hopefully lower effort. It might be a lot of existing code that needs to be kept around in order to make this approach work; at some point it would make sense to drop it. But it would be nice to keep in for a few versions, I think.

@westnordost
Copy link
Member Author

Still too much effort, in my opinion.

I don't expect that the edits are kept around for weeks, usual use case would be that it's kept for a day and in the evening or something, it is uploaded.

@peternewman
Copy link
Collaborator

Still too much effort, in my opinion.

Can't you just give the new tables new names, and therefore have both in parallel, along with the old upload code in an old class?

I don't expect that the edits are kept around for weeks, usual use case would be that it's kept for a day and in the evening or something, it is uploaded.

Yeah that's probably true.

Yes, I can put a warning in the changelog for v31.1

I saw the warning in the release, personally I think it needs to be a bit stronger and should suggest people turn off automatic upgrades until this release has happened, otherwise there is a risk of data loss if they do some surveying then close the app before they get back online.

Is it also worth adding a nag screen if you're closing the app with data which hasn't been uploaded explaining this is at risk with the new database change?

Presumably people with older versions are at a greater risk as they won't see the changelog beforehand, but perhaps that's quite a small proportion of people? Maybe add some notes on the app stores too?

@FloEdelmann
Copy link
Member

Let me know, any of you, also if I didn't ping you here, if you are interested in working on this together.

@westnordost Now I have some free time and would be able to work on this. Is there something I could work on?

@westnordost
Copy link
Member Author

Yes! There is a branch no-osmapi-data. It contains the new "pure kotlin" data classes.

  1. The current "elements-first-class" needs to be merged into that branch
  2. We also need a pure kotlin data class for Note
  3. The classes MapDataDao, NotesDao etc that usually return classes from osmapi should be wrapped by classes named MapDataApi, NotesApi and these wrapper classes can have the same (or a more convenient) interface as the original ones, only that they return the pure kotlin data classes
  4. Thus, everywhere where the data classes from Osmapi are used, they should be replaced with those data classes.
  5. So to avoid doing everything at once (in one PR), we should leave out kotlinx-serialization and kotlinx-date for now

@westnordost
Copy link
Member Author

It looks like the implementation is complete now. Minus the stuff that @FloEdelmann started, mentioned in #2506 (comment)

What is also missing is the new "history" feature, but the basic refactoring is done.

However, before I release the first alpha for those who would like to help testing, I think I will wait until the further restructuring that @FloEdelmann and me are doing is done.

overview_data

@westnordost
Copy link
Member Author

westnordost commented Mar 30, 2021

Changelog for v32.0 so far:


The grant from the Prototype Fund ended in February, however, there is one last big feature I can delight you with before development on this app must necessarily wind down:
It may not look like a lot, but I basically replaced the whole architecture of the app (#2506, ...) to do this.

🔙 Superpowered Undo

⛰️ Improved usability and offline mode

@westnordost
Copy link
Member Author

westnordost commented Apr 2, 2021

anyway, anything concering the technical refactoring that isn't in other tickets yet is now done. So I am closing this ticket.

The "Layers" feature will probably not be implemented soon. I mentioned it: The fund is over since February and I'll not be able to invest so much time in this project anymore. There'd still be a big chunk left to do to implement that feature - basically the whole UI.
So, what will happen is still this update mentioned above, plus some minor things not mentioned there yet (see the milestone "elements-first-class"): https://github.com/streetcomplete/StreetComplete/milestone/14

@smichel17
Copy link
Member

I appreciate the work you've done here. Adding a feature -- even a very involved feature like the layers UI -- is much more feasible than a total overhaul of the data model. So, I think this was a good use of time, to get the foundational work complete, even if the feature itself is too much to implement right now.

@mnalis
Copy link
Member

mnalis commented Apr 3, 2021

@westnordost I'm somewhat confused; will the the tickets mentioned above then be released soon(-ish), eg. in v32.0, or are they also being postponed indefinitely ?

In any case; much thanks for all your hard work!

@matkoniecz
Copy link
Member

matkoniecz commented Apr 3, 2021

@mnalis this things are already done (unless I am really confused) so - it is not indefinitely postponed

@peternewman
Copy link
Collaborator

Thanks very much for this @westnordost . The offline stuff will certainly be great, but I'm really looking forward to this:

* The above also works when splitting a way. The quests for the split segments will be shown
    immediately

From my experience even when I've had a connection it seems to often take a little while to upload and be processed within OSM before I can retry a query and then actually see and answer the new quest.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants