Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential memory leak in polygonToCells #168

Open
jonjardine opened this issue Jan 12, 2023 · 10 comments
Open

Potential memory leak in polygonToCells #168

jonjardine opened this issue Jan 12, 2023 · 10 comments

Comments

@jonjardine
Copy link

jonjardine commented Jan 12, 2023

I am processing image files into h3 cells which convert a bounding box for each pixel in the Geo TIFF file into h3 cells at the specified resolution (in this case res 10). The code works perfectly for a while, but after a certain amount of processing memory errors occur.

For example, with these bounds:

E -69.36043436899354, W -69.36251770199354, N 81.41166741399798, S 81.40958408099797

this error is eventually thrown AFTER a large number of bounding boxes, at the same size / resolution, have been successfully processed:

Error: Memory allocation failed (code: 13)
    at createError (.../node_modules/h3-js/dist/h3-js.js:13664:13)
    at H3LibraryError (.../node_modules/h3-js/dist/h3-js.js:13683:10)
    at throwIfError (.../node_modules/h3-js/dist/h3-js.js:13709:11)
    at Object.polygonToCells (.../node_modules/h3-js/dist/h3-js.js:14757:7)
    at ...
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  code: 13
code 13

(more errors)

OOM
OOM
abort(OOM). Build with -s ASSERTIONS=1 for more info.

Similarly:

E -69.35835103599354, W -69.36043436899354, N 81.41166741399798, S 81.40958408099797

Error: Array length out of bounds (code: 1001, value: 16544021374)
    at createError (.../node_modules/h3-js/dist/h3-js.js:13664:13)
    at JSBindingError (.../node_modules/h3-js/dist/h3-js.js:13698:10)
    at validateArrayLength (.../node_modules/h3-js/dist/h3-js.js:13866:11)
    at Object.polygonToCells (.../node_modules/h3-js/dist/h3-js.js:14752:17)
    at ...
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  code: 1001
}

Here's an extract from the code that fails:

let bounds = [[
    [ west, south ],
    [ east, south ],
    [ east, north ],
    [ west, north ],
    [ west, south ]
]];
hextilles = h3.polygonToCells(bounds, 10, true);

The code starts to fail after around c. 10-20 millions calls to polygonToCells. Note the same issue happens with both versions 3 and 4 of the library.

Encountering the issue with Node v 17.8.0, on both Intel and Apple Silicon Macs.

@nrabinowitz
Copy link
Collaborator

Thanks for a really detailed bug report. This does sound like a library issue, though unclear whether the issue is in the wrapper code or the transpiled library. I'll try to reproduce on my end.

@jonjardine
Copy link
Author

I appreciate your prompt response. I can provide you with some code if it helps debug the issue further.

@jonjardine
Copy link
Author

@nrabinowitz did you manage to have a look at this issue? I am double-checking my code, but still encountering the same problem with version 4.1.0 of h3-js. Appreciate any update. Thanks.

@nrabinowitz
Copy link
Collaborator

Sorry, haven't had a chance to look yet. I'll try to find time this week.

@jonjardine
Copy link
Author

jonjardine commented Sep 14, 2023

@nrabinowitz I have written separate code to reproduce the issue (running under Node.js v20.6.0). This simple script logs the memory usage as the number of calls made to polygonToCells increases.

import h3 from 'h3-js';

// Simple function to log memory usage
function memoryState() {
    const formatMemoryUsage = (data) => `${Math.round(data / 1024 / 1024 * 100) / 100} MB`;
    const memoryData = process.memoryUsage();
    const memoryUsage = {
        rss: `${formatMemoryUsage(memoryData.rss)}`, // -> Resident Set Size - total memory allocated for the process execution
        heapTotal: `${formatMemoryUsage(memoryData.heapTotal)}`, // -> total size of the allocated heap
        heapUsed: `${formatMemoryUsage(memoryData.heapUsed)}`, // -> actual memory used during the execution
        external: `${formatMemoryUsage(memoryData.external)}`, // -> V8 external memory
    };
    console.log(`RSS: ${memoryUsage.rss} | Heap Total: ${memoryUsage.heapTotal} | Heap Used: ${memoryUsage.heapUsed} | External: ${memoryUsage.external}`);
}

const divisions = 1024;
const dist = 1 / divisions;
const minimumResolution = 4;
const maximumResolution = 11;

// Loop through a grid of latitudes and longitudes, and process all the tiles within a small
// region of that grid (grid split into 1,024 x 1,024 regions)
// Get all the h3 indices within each region
// 
for (let lng = -180.0; lng < 180.0; lng = lng + 1) {
    
    var totalTiles = 0;
    
    for (let lat = -90; lat < 90.0; lat = lat + 1) {
        console.log(`Longitude: ${lng} | Latitude: ${lat}`);
        var south = lat;
        var runs = 0;
        let start = new Date();
        for (let y = 0; y < divisions; y++) {
            var west = lng;
            for (let x = 0; x < divisions; x++) {
                const east = west + dist;
                const north = south + dist;
                const bounds = [
                    [ west, south ],
                    [ east, south ],
                    [ east, north ],
                    [ west, north ],
                    [ west, south ]
                ];
                const latLngBounds = [
                    [ south, west ],
                    [ south, east ],
                    [ north, east ],
                    [ north, west ],
                    [ south, west ]
                ];
                for (let resolution = minimumResolution; resolution <= maximumResolution; resolution++) {
                    let tiles = h3.polygonToCells(latLngBounds, resolution);
                    totalTiles += tiles.length;
                    runs++;
                } 
                west += dist;
            }
            if (y % 64 == 0) {
                console.log(`+ ${y} of ${divisions} -> ${totalTiles}`);
            }
            south += dist;
        }
        let end = new Date();
        let diff = end.getTime() - start.getTime();
        console.log(`>>> Total H3 calls: ${runs} - ${totalTiles} tiles returned => (${diff} ms taken)`);
        memoryState();
    }
}


First grid square passes with memory usage:
RSS: 1904.41 MB | Heap Total: 27.42 MB | Heap Used: 16.25 MB | External: 1540.01 MB

Second grid square passes with memory usage:
RSS: 4352.47 MB | Heap Total: 27.42 MB | Heap Used: 12.95 MB | External: 3348.01 MB

Third grid square fails:

/node_modules/h3-js/dist/h3-js.js:13664
  var err = new Error(((messages[errCode] || UNKNOWN_ERROR_MSG) + " (code: " + errCode + (hasValue ? (", value: " + (meta.value)) : '') + ")")); // @ts-expect-error - TS doesn't like extending Error
            ^

Error: Memory allocation failed (code: 13)
    at createError (/Volumes/----/node_modules/h3-js/dist/h3-js.js:13664:13)
    at H3LibraryError (/Volumes/----/node_modules/h3-js/dist/h3-js.js:13683:10)
    at throwIfError (/Volumes/----a/node_modules/h3-js/dist/h3-js.js:13709:11)
    at Object.polygonToCells (/Volumes/----/node_modules/h3-js/dist/h3-js.js:14757:7)

(Note the bounds and latLngBounds were to compare if using the GeoJSON formatting made a difference - it does not)

@jonjardine
Copy link
Author

After some further testing, calling polygonToCells multiple times with lower resolutions (especially using these very small regions), seems to cause the memory usage to balloon more quickly than at higher resolutions, e.g. running from res 4 to 9 crashes quicker than from res 10-11. This result possibly suggests that there's memory not being dealloc'd if zero cells are detected and the methods ends early?

@nrabinowitz
Copy link
Collaborator

Thanks for the repro script - looking now

@nrabinowitz
Copy link
Collaborator

nrabinowitz commented Sep 14, 2023

I can repro the issue with your script. Just to document my understanding so far:

  • While the numbers don't always seem to add up, it seems like RSS includes external memory.
  • The growth in external memory is almost entirely array buffers, which in this case I believe to be the automatic memory growth of the virtual heap managed by Emscripten.
  • To be clear, the failed allocation message is not a "true" OOM - Emscripten manages its own virtual memory in typed arrays, and grows them as needed.
  • The error seems to come up when the array buffers size is approximately 4GB. I would assume this is either set internally by Emscripten or (less likely?) due to the memory allocated to the Node process. Either way, it seems to occur when Emscripten is asked to allocate enough memory to require resizing the virtual heap, and it can't do so.
  • Memory allocation happens in a few places in polygonToCells:
    • The binding allocates memory for the polygon (small in this case, just the bounding box)
    • The binding allocates memory for the function output, based on the estimated hex count
    • The library allocates memory for the two buffers used in the algorithm, each as large as the output, as well as a smaller chunk of memory for the polygon bounding box
  • Interestingly, we never seem to get an error from the binding allocations - the error is always the H3 library error, not an Emscripten error, though it's possible that we aren't checking properly in the binding for whether _malloc succeeded. I'm not certain whether this means that the leak is necessarily from the library not the binding.
  • Running with a fixed bounding box and a fixed resolution shows the same behavior. The 4GB is filled up after about 2M 3.6M 20M calls, even if no cells are returned from the function, which suggests that the two buffers used in the algorithm, and the output buffer, probably are not involved, as these would scale with the number of cells in the estimated output.

So... possibly it's the memory allocated for the polygon, which isn't getting freed properly? I'll continue to investigate.

@nrabinowitz
Copy link
Collaborator

nrabinowitz commented Sep 14, 2023

After some further testing, calling polygonToCells multiple times with lower resolutions (especially using these very small regions), seems to cause the memory usage to balloon more quickly than at higher resolutions, e.g. running from res 4 to 9 crashes quicker than from res 10-11. This result possibly suggests that there's memory not being dealloc'd if zero cells are detected and the methods ends early?

I think this may simply be because the coarse resolutions run faster, and so you get to 3.6M 20M calls more quickly

@nrabinowitz
Copy link
Collaborator

New observation: The number of calls required to get to OOM varies inversely with the size of the polygon - a more complex polygon with 65 vertexes failed in only 1.9M calls (vs the updated, I think correct number of 20M for the 4-vertex bbox). That suggests strongly that it's the polygon that isn't being deallocated correctly, and specifically the vertex loop. The polygon is also much more complicated to deallocate, so that tracks, but I'm going to need to investigate further to see if I can find what's going wrong.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants