Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specific user note leading to hang on downloading cache data #15687

Open
moving-bits opened this issue May 3, 2024 · 9 comments
Open

Specific user note leading to hang on downloading cache data #15687

moving-bits opened this issue May 3, 2024 · 9 comments
Labels
Bug Issues classified as a bug Support Issue related to one or more Support Tickets

Comments

@moving-bits
Copy link
Member

User report on support (ticket 539239):

A cache with a specific user note text leads to c:geo hanging on downloading this cache (both initial download or update).
Reproducible for me with current release 2024.04.25

User note consists of app. 50+ tuplets of numbers (eg: 1,2 7,10 20,15 ...) - probably our "coordinates from user note" algo hitting a infinite loop here?

@moving-bits moving-bits added Bug Issues classified as a bug Support Issue related to one or more Support Tickets labels May 3, 2024
@moving-bits
Copy link
Member Author

Additional info:

  • "downloading" itself seems to work, but once you have downloaded it, you won't be able to ever open this cache again...
  • Even disabling settings => cache details => waypoint extraction does not help here.

@eddiemuc
Copy link
Contributor

eddiemuc commented May 4, 2024

Question to @moving-bits , @Lineflyer : given a ticket number (e.g. 539239) and access to support system: what is the quickest way to open the ticket? (I didn't find a "quick search by ticket id" in the system...)

@moving-bits
Copy link
Member Author

There's a general purpose sesrchbox above the ticket list, just drop the ticket number there.

@eddiemuc
Copy link
Contributor

eddiemuc commented May 4, 2024

I analyzed this. This is not an endless loop but processing time is really really really long....
Something very bad is happening here....

  • When first scanned, c:geo starts creating waypoints for each pair of "decimal numbers" (e.g. 1,2 7,10 gets interpreted as N1.2 E7.10). For each waypoints the remainder of the note is interpreted as user note and stored at the waypoint.
  • When the cache is reopened then all user notes of all waypoints are combined and scanned for waypoints again. This results in a huge list (about 50KB) of decimal pairs being scanned. This takes an awful lot of time

So we have two problems here:

  • huge amount of meaningless waypoints being created
  • the whole creating/scanning process is far too slow and happens on the main thread

@MagpieFourtyTwo
Copy link

Why does c:geo even try to interpret strings as coordinates that do not contain [N|S] and [E|W]?
Sure, there are notations without, but their use is most probably so rare that this will hardly outweigh the problems that arise.

@eddiemuc
Copy link
Contributor

eddiemuc commented May 6, 2024

Why does c:geo even try to interpret strings as coordinates that do not contain [N|S] and [E|W]?
Sure, there are notations without, but their use is most probably so rare that this will hardly outweigh the problems that arise.

That would surely solve the problem for all practical purposes I can think of. The problem would still appear if someone posts a same-length user note where all numbers are predicted with N and E alternatively. But that's unlikely I guess.

Why we parse pairs of decimals as coordinates? I don't know. I have the feeling it "was always like this". I have no opinion whether removing this would be ok or not.

@wolverine007
Copy link

Was facing a few days ago a similar issue, where cgeo took my note and added some strange waypoints basing on the note content which contained some decimal numbers as well.

So I suggest to remove this behavior or add another identifier like # or / before a group of decimals which should be interpreted as coords

@eddiemuc
Copy link
Contributor

eddiemuc commented May 26, 2024

PR #15744 provides a quickfix for release for this by simply limiting the maximum number of parsed coordinates in any user note to 20.

A deeper solution for this issue should be done in master because it might produce unwanted side effects.

The culprit is the "summing up" of created waypoint's user notes (which contain again all coordinates), the long parsing time in method GeopointParser.parseAll and the storing of the created 5840 (!) waypoints. We have currently 8 parsers, so approx 8*5840 = 43304 parsing attempts are made! After this, 5840 waypoints are stored to the database in a single transaction each! --> hence the big performace issues.

I suggest the following measurements:

  • parse only one coordinate per line (separated by newline character)
  • parse waypoint user notes only if they are not contained (text) within the user note (this happens if waypoint is created from user note). Reduces a great amount of double-parsing
  • reduce the number of parsers greatly (we have 8 parsers in total, 3 of those are dedicated to decimal coordinates only!). Cutting down e.g. 2 of 8 parsers reduces parsing time by 25%
  • reduce current preparations of text for coordinate parsing to replacing commas surrounded by decimals with a dot (e.g. N40 12,3 great restaurant nearby would then be parsed as N40 12.300, but N40 12, 3 great restaurants nearby would be parsed as N40 12.000). This reduces parsing time ny 50%
  • make AbstractParser.parse() take an index (not just a text) to reduce creation of strings during parsing
  • Refactor: Unify methods GeopointParser.parseAll and GeopointParser.parse
  • Refactor: remove ability of AbstractParser to also parse either Lat of Lon solely (public abstract ResultWrapper parse(@NonNull String text, @NonNull Geopoint.LatLon latlon)). This is not used meaningfully any more (just in some Connectors which are too lazy to parse a string decimal themselves :-)) and clutters the code greatly

@MagpieFourtyTwo
Copy link

MagpieFourtyTwo commented May 28, 2024

I second @eddiemuc's proposal - although I would even go a step further and limit coordinate parsing to strings which must contain [N|S] and [E|W] (although this may be covered by refactoring AbstractParser.parse()). Moreover, to keep things simple, I would not just parse one coord per line, but add. only one line per coord, i. e. would not allow CR/LFs within coordinates. Which may on first sight sound a bit harsh, but could in fact easily be understood by users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Issues classified as a bug Support Issue related to one or more Support Tickets
Projects
None yet
Development

No branches or pull requests

4 participants