Error on simplifying GeoJSON file #631

Open
sebastiaan6907 opened this issue Apr 2, 2024 · 11 comments

Comments

@sebastiaan6907

I tried to simplify a large GeoJSON file with polygons (cadastral parcels) with the following command:
npx mapshaper kadastralekaart_perceelgrenzen.geojson -simplify 80% keep-shapes -o kadastralekaart_perceelgrenzen_simpel.geojson

I received this error message:

<--- Last few GCs --->

[907:0x7face0008000] 103160 ms: Scavenge 3984.7 (4057.3) -> 3983.9 (4079.6) MB, 10.77 / 0.00 ms (average mu = 0.201, current mu = 0.121) allocation failure;
[907:0x7face0008000] 104928 ms: Mark-Compact 4043.7 (4129.9) -> 4009.2 (4106.8) MB, 1724.56 / 0.00 ms (average mu = 0.155, current mu = 0.063) allocation failure; scavenge might not succeed

<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----

1: 0x10087f8d4 node::OOMErrorHandler(char const*, v8::OOMDetails const&) [/usr/local/Cellar/node/21.6.2_1/bin/node]
2: 0x100a0dae7 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [/usr/local/Cellar/node/21.6.2_1/bin/node]
3: 0x100a0da7d v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [/usr/local/Cellar/node/21.6.2_1/bin/node]
4: 0x100ba20ed v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/usr/local/Cellar/node/21.6.2_1/bin/node]
5: 0x100ba4cd0 v8::internal::Heap::ComputeMutatorUtilization(char const*, double, double) [/usr/local/Cellar/node/21.6.2_1/bin/node]
6: 0x100ba4a09 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [/usr/local/Cellar/node/21.6.2_1/bin/node]
7: 0x100ba3c95 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::internal::GarbageCollectionReason, char const*) [/usr/local/Cellar/node/21.6.2_1/bin/node]
8: 0x100ba269f v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags)::$_6::operator()() const [/usr/local/Cellar/node/21.6.2_1/bin/node]
9: 0x100ba2461 void heap::base::Stack::SetMarkerAndCallbackImpl<v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags)::$_6>(heap::base::Stack*, void*, void const*) [/usr/local/Cellar/node/21.6.2_1/bin/node]
10: 0x10078ea1b PushAllRegistersAndIterateStack [/usr/local/Cellar/node/21.6.2_1/bin/node]
11: 0x100ba0f74 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/local/Cellar/node/21.6.2_1/bin/node]
12: 0x100b98b35 v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/Cellar/node/21.6.2_1/bin/node]
13: 0x100b992fc v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/Cellar/node/21.6.2_1/bin/node]
14: 0x100b8186d v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/usr/local/Cellar/node/21.6.2_1/bin/node]
15: 0x100e8d47b v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/usr/local/Cellar/node/21.6.2_1/bin/node]
16: 0x1006701b6 Builtins_CEntry_Return1_ArgvOnStack_NoBuiltinExit [/usr/local/Cellar/node/21.6.2_1/bin/node]
zsh: abort npx mapshaper kadastralekaart_perceelgrenzen.geojson -simplify 80% keep-shape

@geodata4all

Hi,
this is an out-of-memory issue. Use mapshaper-xl: it works the same as mapshaper, but runs with more RAM to support larger files.

@sebastiaan6907
Author

Hi,
I tried mapshaper-xl and got a similar error message. The GeoJSON file is 4 GB.

Last login: Tue Apr 2 23:33:51 on ttys000
Sebastiaan@mbp16vabastiaan PDOK Kadastrale kaart NL % npx mapshaper-xl kadastralekaart_perceelgrenzen_vanuitqgis.geojson -simplify 80% keep-shapes -o kadastralekaart_perceelgrenzen_simpel.geojson
Allocating 8 GB of heap memory

<--- Last few GCs --->

[11115:0x7fa9c8008000] 137502 ms: Scavenge 7893.8 (8023.9) -> 7893.5 (8034.9) MB, 10.63 / 0.00 ms (average mu = 0.170, current mu = 0.081) allocation failure;
[11115:0x7fa9c8008000] 137535 ms: Scavenge 7900.4 (8034.9) -> 7900.6 (8035.6) MB, 12.25 / 0.00 ms (average mu = 0.170, current mu = 0.081) allocation failure;
[11115:0x7fa9c8008000] 138012 ms: Scavenge 7901.3 (8035.6) -> 7900.5 (8057.9) MB, 474.26 / 0.00 ms (average mu = 0.170, current mu = 0.081) allocation failure;

<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----

1: 0x10e80a8d4 node::OOMErrorHandler(char const*, v8::OOMDetails const&) [/usr/local/Cellar/node/21.6.2_1/bin/node]
2: 0x10e998ae7 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [/usr/local/Cellar/node/21.6.2_1/bin/node]
3: 0x10e998a7d v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [/usr/local/Cellar/node/21.6.2_1/bin/node]
4: 0x10eb2d0ed v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/usr/local/Cellar/node/21.6.2_1/bin/node]
5: 0x10eb2c06a v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/local/Cellar/node/21.6.2_1/bin/node]
6: 0x10eb23b35 v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/Cellar/node/21.6.2_1/bin/node]
7: 0x10eb242fc v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/Cellar/node/21.6.2_1/bin/node]
8: 0x10eb0c86d v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/usr/local/Cellar/node/21.6.2_1/bin/node]
9: 0x10ee1847b v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/usr/local/Cellar/node/21.6.2_1/bin/node]
10: 0x10e5fb1b6 Builtins_CEntry_Return1_ArgvOnStack_NoBuiltinExit [/usr/local/Cellar/node/21.6.2_1/bin/node]

@mbloch
Owner

mbloch commented Apr 3, 2024

4gb is reaching the limit of what mapshaper can handle. You'll need more heap memory than the 8gb that mapshaper-xl gives you by default. If your computer has a lot of RAM, you can bump that up even more: mapshaper-xl 32gb ... or mapshaper-xl 64gb ... etc.

Some more tips on reducing file size: if your dataset has many features and many data properties for each feature, you might also try using the -filter-fields command to keep only the properties that you need (the file has to be imported before that command can run). You can also reduce the precision of your output coordinates, for example -o precision=0.00001. Good luck!
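As a rough sketch, the tips above (a larger heap, -filter-fields, and reduced output precision) could be combined into one command. The file names and the field name `id` are placeholders that do not come from this thread, and 16gb is just an example heap size. The function only prints the command so it can be reviewed first.

```shell
# Placeholders (assumptions, not from the thread): file names and the field "id".
IN=parcels.geojson
OUT=parcels_small.geojson

# Print the combined command instead of running it, so it can be checked first.
build_cmd() {
  echo "npx mapshaper-xl 16gb $IN -filter-fields id -simplify 80% keep-shapes -o $OUT precision=0.00001"
}

build_cmd
```

Once the placeholders are replaced with real names, the printed line can be pasted into the terminal.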

@sebastiaan6907
Author

sebastiaan6907 commented Apr 3, 2024

Thanks for the swift and detailed response.
My MacBook Pro has 32 GB of RAM, so I'll try to bump it up a bit.

The original GML file is 25 GB and holds 8.2 million polygons (all cadastral parcel boundaries of the Netherlands).
Using ogr2ogr or QGIS, I managed to reproject from EPSG:28992 to EPSG:4326, extract only the polygon coordinates, and reduce the precision to 7 decimals. This resulted in a GeoJSON file of 4.37 GB.
In my workflow I also have to split this GeoJSON file into 15 pieces of at most 300 MB each, to upload them as tilesets in Mapbox. I intend to use geojsplit for this task with the command: geojsplit --geometry-count 552930.
My first attempt is to simplify the large GeoJSON as a whole, but if the memory issue persists, I could also simplify the 15 smaller pieces. Unfortunately, I have no experience in scripting this task and no knowledge of Makefiles.

Do you have any reflections or suggestions on this workflow in general? Or can you point to a resource where I can easily find out how to script the repetition of the following command for 15 files:
npx mapshaper part_1.geojson -simplify 80% keep-shapes -o part_simplified_1.geojson
npx mapshaper part_2.geojson -simplify 80% keep-shapes -o part_simplified_2.geojson
npx mapshaper part_3.geojson -simplify 80% keep-shapes -o part_simplified_3.geojson
Etc.

I expect to run this workflow every month as the parcel boundaries keep changing regularly.
Regards, Sebastiaan
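One possible way to script the repetition is a plain shell loop; no Makefile is needed. This is a minimal sketch that assumes the pieces are named part_1.geojson, part_2.geojson, and so on, as in the commands above (the names that geojsplit actually produces may differ). The `DRY_RUN` switch is a helper invented for this example.

```shell
# Simplify every part_N.geojson in the current directory.
# Set DRY_RUN=1 to print the commands instead of running them.
simplify_parts() {
  for f in part_[0-9]*.geojson; do
    [ -e "$f" ] || continue            # nothing matched: skip the literal pattern
    n=${f#part_}                       # strip the "part_" prefix
    n=${n%.geojson}                    # strip the extension, leaving the number
    out="part_simplified_${n}.geojson"
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "npx mapshaper $f -simplify 80% keep-shapes -o $out"
    else
      npx mapshaper "$f" -simplify 80% keep-shapes -o "$out" || return 1
    fi
  done
}
```

Running `DRY_RUN=1 simplify_parts` first shows what would be executed; plain `simplify_parts` then does the work and stops at the first file that fails. For a monthly run, the function can be saved in a script file and called from the folder that holds the pieces.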

Meanwhile, I tried a 16 GB memory allocation, but an error still occurs.
I am no expert in these error messages, but perhaps the task at hand is too large?

Sebastiaan@mbp16vabastiaan Percelen op de kaart % npx mapshaper-xl 16gb kadastralekaart_perceelgrenzen.geojson -simplify 80% keep-shapes -o kadastralekaart_perceelgrenzen_simpel.geojson
Allocating 16 GB of heap memory
[simplify] Repaired 79 intersections; 3 intersections could not be repaired
RangeError [ERR_OUT_OF_RANGE]: The value of "length" is out of range. It must be >= 0 && <= 2147483647. Received 3210279045
    at Object.writeSync (node:fs:922:5)
    at Object.writeFileSync (node:fs:2362:26)
    at module.exports [as writeFileSync] (/Users/Sebastiaan/node_modules/rw/lib/rw/write-file-sync.js:14:8)
    at cli.writeFile (/Users/Sebastiaan/node_modules/mapshaper/mapshaper.js:11598:10)
    at /Users/Sebastiaan/node_modules/mapshaper/mapshaper.js:11733:13
    at Array.forEach (<anonymous>)
    at _writeFiles (/Users/Sebastiaan/node_modules/mapshaper/mapshaper.js:11720:15)
    at writeFiles (/Users/Sebastiaan/node_modules/mapshaper/mapshaper.js:11694:12)
    at /Users/Sebastiaan/node_modules/mapshaper/mapshaper.js:334:17
    at new Promise (<anonymous>) {
  code: 'ERR_OUT_OF_RANGE'
}

@sebastiaan6907
Author

sebastiaan6907 commented Apr 8, 2024

Matthew, if you have time to check the above comment, that would be great!

@sebastiaan6907
Author

In reference to issue #523, I analysed the number of coordinates in the polygons. The largest polygon holds a little over 7000 coordinates.
[Image: screenshot, 2024-04-14 20:52:56]

Removing the largest two polygons did not give any improvement and resulted in the same error message.

@mbloch
Owner

mbloch commented Apr 16, 2024

Hi, the key information in the error message is this:

RangeError [ERR_OUT_OF_RANGE]: The value of "length" is out of range. It must be >= 0 && <= 2147483647. Received 3210279045
at Object.writeSync (node:fs:922:5)

This indicates that Mapshaper was able to process the file, and failed on the very last step, writing an output file to disk. Mapshaper is able to read very large GeoJSON files, but can only write files up to 2GB. I never took the time to implement writing larger files than this, mostly because my colleagues and I never work with GeoJSON files this big. Your output file is about 3GB in size, so, 50% too large for Mapshaper to write. I'm not sure what the best approach to solve your problem would be... I'll take a look at the GeoJSON export code and see what can be done.
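The "50% too large" figure can be checked with a little shell arithmetic. This sketch takes the limit 2147483647 and the received length 3210279045 straight from the error message; the function name is made up for the example.

```shell
# Largest "length" accepted by the write call, per the error message.
LIMIT=2147483647

# Print by what whole percentage a byte count exceeds LIMIT (0 if it fits).
percent_over_limit() {
  bytes=$1
  if [ "$bytes" -le "$LIMIT" ]; then
    echo 0
  else
    echo $(( ($bytes - $LIMIT) * 100 / $LIMIT ))
  fi
}

percent_over_limit 3210279045   # prints 49
```

The integer division rounds down, so 49 here corresponds to the "about 50%" mentioned above.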

@sebastiaan6907
Author

Thx, I'll keep waiting for your further response

@mbloch
Owner

mbloch commented Apr 16, 2024

I just released v0.6.90, which supports writing files up to 4gb (the previous limit was 2gb). The last error you reported was triggered by trying to write a file that was ~3gb, so that command should now work. Going up to 4gb was a simple fix. Exporting files >4gb would take a lot more work, I'm going to postpone that to a future date.

@sebastiaan6907
Author

Thanx!! No more fatal errors noticed!
Some warnings did come up, though. When running with the -verbose option, the output looks like this:

npx mapshaper-xl 24gb kadastralekaart_perceelgrenzen_kopie.geojson -simplify 80% keep-shapes -o kadastralekaart_perceelgrenzen_simpel.geojson -verbose
Allocating 24 GB of heap memory
[i] Importing: kadastralekaart_perceelgrenzen_kopie.geojson
[simplify] [protectMultiRing()] Failed on ring: -17633,-17540,-17637
[simplify] [protectMultiRing()] Failed on ring: 41264,-41263
[simplify] [protectMultiRing()] Failed on ring: -49585,49588,49589
[simplify] [protectMultiRing()] Failed on ring: 49603,-49602,-49595
[simplify] [protectMultiRing()] Failed on ring: -50486,-50934
....
....
[simplify] [protectMultiRing()] Failed on ring: 23328881,-23328879
[simplify] [protectMultiRing()] Failed on ring: 23329579,-23310199,23329580
[simplify] [protectMultiRing()] Failed on ring: -23329809,23329816
[simplify] [protectMultiRing()] Failed on ring: -23331175,23331179
[simplify] Repaired 79 intersections; 3 intersections could not be repaired
[o] Wrote kadastralekaart_perceelgrenzen_simpel.geojson

Is there a way to find out which 3 intersections could not be repaired, so that I can fix those polygons in the source, or is this not worth the effort?
And what about the 9,910 lines that say "Failed on ring"? How serious is this warning, and what does it actually mean? Is it worthwhile to try to improve the source with regard to these warnings?
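For reference, a count like the one above can be reproduced from a saved log. This is a small sketch that assumes the verbose messages were redirected to a file (the name simplify.log is a placeholder) and that mapshaper prints them to stderr, which is an assumption rather than something confirmed in this thread.

```shell
# Capture the messages first, for example (assuming they go to stderr):
#   npx mapshaper-xl 24gb input.geojson -simplify 80% keep-shapes \
#     -o output.geojson -verbose 2> simplify.log

# Count the "Failed on ring" warnings in a saved log file.
count_ring_failures() {
  grep -c 'Failed on ring' "$1"
}

# Show the summary line about repaired and unrepaired intersections.
repair_summary() {
  grep 'intersections could not be repaired' "$1"
}
```

Calling `count_ring_failures simplify.log` prints the number of warning lines, and `repair_summary simplify.log` prints the intersection summary. Neither identifies which polygons are affected; they only summarise the log.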

@sebastiaan6907
Author

Matthew, can you reflect on the warnings mentioned above?
