supabase functions serve runs out of memory and crashes with basic usage #212

meyer9 · 2023-10-16T23:58:12Z

Describe the bug
The edge-runtime does not terminate the worker for a few minutes after starting. This causes a pretty severe memory leak since a new worker is created for each request and never terminated: https://github.com/supabase/cli/blob/a0c8644deeef5a72f99687cc897eadc3dce256f1/internal/functions/serve/templates/main.ts#L135

This makes it effectively impossible to use supabase functions serve for local development.

To Reproduce

1. Start the worker runtime.
2. Send ~1000 requests within a few minutes.
3. Notice extremely high memory usage on the edge-runtime, followed by a crash.
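The reproduction steps above can be scripted. This is a minimal sketch, not something from the original report: `sendBurst` and `FetchLike` are hypothetical names, and the URL in the usage comment is an assumption, so substitute whatever address your local `supabase functions serve` prints.

```typescript
// Hypothetical load generator for reproducing the memory growth.
// `fetchImpl` is injectable so the helper can be exercised without a server.
type FetchLike = (url: string) => Promise<{ status: number }>;

async function sendBurst(
  url: string,
  total: number,
  concurrency: number,
  fetchImpl: FetchLike,
): Promise<Map<number, number>> {
  const statusCounts = new Map<number, number>();
  let issued = 0;

  // Each lane keeps issuing requests until `total` have been started.
  async function lane(): Promise<void> {
    while (issued < total) {
      issued++;
      let status = 0; // 0 records a network-level failure
      try {
        status = (await fetchImpl(url)).status;
      } catch {
        status = 0;
      }
      statusCounts.set(status, (statusCounts.get(status) ?? 0) + 1);
    }
  }

  await Promise.all(Array.from({ length: concurrency }, () => lane()));
  return statusCounts;
}

// Usage against a local server (the URL is an assumption):
// const counts = await sendBurst(
//   "http://localhost:54321/functions/v1/hello", 1000, 10, (u) => fetch(u));
// console.log(counts);
```

Watching the container's memory (for example with `docker stats`) while this runs should show whether usage keeps climbing between requests.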

Expected behavior
I expect the worker runtimes to be cleaned up after the request is complete if a new worker runtime is started for each request. Even nicer would be to reuse a single worker runtime and refresh it when files change, similar to deno --watch.
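To illustrate the "reuse one worker and refresh it when files change" idea, here is a rough sketch. It is not how the CLI or edge-runtime is implemented: `ReloadingWorkerCache`, `createWorker`, `getVersion`, and `disposeWorker` are all hypothetical names, and the version stamp is assumed to be something like the newest file modification time under the function directory.

```typescript
// Reuse one worker per service path; rebuild it only when the caller-supplied
// version stamp changes. None of these hooks are edge-runtime APIs.
class ReloadingWorkerCache<W> {
  private entries = new Map<string, { version: string; worker: W }>();
  private createWorker: (servicePath: string) => Promise<W>;
  private getVersion: (servicePath: string) => Promise<string>;
  private disposeWorker: (worker: W) => void;

  constructor(
    createWorker: (servicePath: string) => Promise<W>,
    getVersion: (servicePath: string) => Promise<string>,
    disposeWorker: (worker: W) => void = () => {},
  ) {
    this.createWorker = createWorker;
    this.getVersion = getVersion;
    this.disposeWorker = disposeWorker;
  }

  async get(servicePath: string): Promise<W> {
    const version = await this.getVersion(servicePath);
    const cached = this.entries.get(servicePath);
    if (cached && cached.version === version) {
      return cached.worker; // sources unchanged: reuse the existing worker
    }
    if (cached) {
      this.disposeWorker(cached.worker); // sources changed: drop the stale worker
    }
    const worker = await this.createWorker(servicePath);
    this.entries.set(servicePath, { version, worker });
    return worker;
  }
}
```

With this shape, memory is bounded by the number of distinct functions rather than the number of requests. The sketch does not guard against two concurrent `get` calls both creating a worker; a real implementation would need to handle that.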

Desktop (please complete the following information):

OS: macOS 14.0 (13-inch, M2, 2022)
Version of supabase-js: beta (1.102.2)
Version of Node.js: v20.5.0

Additional context
Disabling forceCreate solves the crashing issue, but breaks auto-reload.

I filed an issue on the edge-runtime here since I'm not sure which should be fixed to resolve the crashing problem. #192

bombillazo · 2023-10-17T16:00:38Z

We've noticed this too; we have to restart the serve functionality periodically when the edge functions die.

sweatybridge · 2023-11-09T04:39:55Z

Transferring to edge-runtime repo since it likely requires changes to the container.

nyannyacha · 2023-12-13T16:09:16Z

Hey! @meyer9

I think the root cause of the memory leak is in the Deno code base.
I've already raised this in their repository, and they said they would rework the problematic parts over the next few weeks 😋

denoland/deno_core#386 (comment)

This policy forces the supervisor to terminate the isolate as soon as the request is complete. It makes sense for development, because developers no longer have to restart the runtime. This commit solves cases such as supabase#192 and supabase#212

I had to use a cargo patch to fix the memory leak because its root cause is in `deno_core`. Eventually these changes should be tracked in `deno_core`, so until the problem is fixed upstream we have to use the patch. It could be a substantial solution for supabase#212 and supabase#192, on the assumption that I found every place where `JsRuntime` leaks memory 😋 (For reference, Valgrind no longer reported definite memory leaks after this patch.)

jeremyisatrecharm · 2024-01-03T04:04:19Z

Until then, are there any suggestions on how to do some sort of hacky reboot-serve every n requests without the server responses failing?

nyannyacha · 2024-01-03T04:42:21Z

Hi! @jeremyisatrecharm 😋

Yeah, the Deno team seems to be taking longer than I expected to fix the memory leak. It may not be a priority for them.

So, I've already written some commits in my fork that fix the memory leak. However, since these changes modify the upstream code directly, it may be necessary to talk with the Supabase team about whether to accept them.

Just in time, @laktek is back from holiday, so I'd like to take the time to discuss this 😁

JTInfinite · 2024-03-22T17:06:41Z

I'm running into this issue as well (at least I think it is the same issue): I can't run any function that attempts to work with any sort of embeddings. Each invocation fails with:

CPU time hard limit reached. isolate: bd382b3a-43a9-41de-87dc-2aa3ec7b8524
ReferenceError: Status is not defined
    at Server.<anonymous> (file:///home/deno/main/index.ts:164:13)
    at eventLoopTick (ext:core/01_core.js:64:7)
    at async #respond (https://deno.land/std@0.182.0/http/server.ts:220:18)
failed to send request to user worker: request has been cancelled by supervisor
user worker failed to respond: request has been cancelled by supervisor
WorkerRequestCancelled: request has been cancelled by supervisor
    at async Promise.allSettled (index 1)
    at async UserWorker.fetch (ext:sb_user_workers/user_workers.js:70:21)
    at async Server.<anonymous> (file:///home/deno/main/index.ts:146:12)
    at async #respond (https://deno.land/std@0.182.0/http/server.ts:220:18) {
  name: "WorkerRequestCancelled"
}
ReferenceError: Status is not defined
    at Server.<anonymous> (file:///home/deno/main/index.ts:164:13)
    at eventLoopTick (ext:core/01_core.js:64:7)
    at async #respond (https://deno.land/std@0.182.0/http/server.ts:220:18)

prvind-panday · 2024-04-03T13:04:00Z

I'm facing the same issue when I run my Supabase edge function. It ran perfectly fine for a few seconds, and then I got the response below in Postman:

{ "message": "The upstream server is timing out" }

See the attached image for the Postman reference.

In the console I see the following:

CPU time hard limit reached. isolate: 3cbe8cb8-d7de-4bc8-8eca-7273473cf2dc
failed to send request to user worker: request has been cancelled by supervisor
user worker failed to respond: request has been cancelled by supervisor
WorkerRequestCancelled: request has been cancelled by supervisor
    at async Promise.allSettled (index 1)
    at async UserWorker.fetch (ext:sb_user_workers/user_workers.js:70:21)
    at async Server.<anonymous> (file:///home/deno/main/index.ts:146:12)
    at async #respond (https://deno.land/std@0.182.0/http/server.ts:220:18) {
  name: "WorkerRequestCancelled"
}
ReferenceError: Status is not defined
    at Server.<anonymous> (file:///home/deno/main/index.ts:164:13)
    at eventLoopTick (ext:core/01_core.js:64:7)
    at async #respond (https://deno.land/std@0.182.0/http/server.ts:220:18)
serving the request with /home/deno/functions/parse_foca_geojson

Has anyone found a solution to this? Is this issue related to Supabase or to Docker itself?

AntonOfTheWoods · 2024-05-21T09:10:51Z

There seem to be a few issues mentioned here but the project documentation is completely absent (except for the examples), so this ticket appears to be accumulating a lot of cruft...

If you are getting errors like:

CPU time hard limit reached...

Then make sure you are passing sufficiently large limits to your worker. See https://github.com/supabase/edge-runtime/blob/main/examples/main/index.ts#L98 for an example. The defaults appear to be set extremely low (1000ms or something) so they are easy to hit if you are doing anything serious. Have a look at all the options. Increasing these made all my issues go away.
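As a rough illustration of "passing sufficiently large limits", the sketch below builds an options object with raised limits. The field names mirror the ones in the linked `examples/main/index.ts` as best I recall them, so treat them as assumptions and check them against the edge-runtime version you run. `buildWorkerOptions` and `RAISED_LIMITS` are hypothetical helpers and the numbers are only illustrative.

```typescript
// Option names are assumed from the linked example; verify before relying on them.
interface UserWorkerOptions {
  servicePath: string;
  memoryLimitMb: number;
  workerTimeoutMs: number;
  cpuTimeSoftLimitMs: number;
  cpuTimeHardLimitMs: number;
  forceCreate: boolean;
}

// Illustrative values only; tune them to your workload.
const RAISED_LIMITS = {
  memoryLimitMb: 256,
  workerTimeoutMs: 5 * 60 * 1000,
  cpuTimeSoftLimitMs: 10_000,
  cpuTimeHardLimitMs: 20_000,
  forceCreate: false,
};

function buildWorkerOptions(
  servicePath: string,
  overrides: Partial<Omit<UserWorkerOptions, "servicePath">> = {},
): UserWorkerOptions {
  const options = { servicePath, ...RAISED_LIMITS, ...overrides };
  // A hard limit below the soft limit would make the soft limit meaningless.
  if (options.cpuTimeHardLimitMs < options.cpuTimeSoftLimitMs) {
    throw new Error("cpuTimeHardLimitMs must be >= cpuTimeSoftLimitMs");
  }
  return options;
}

// Inside the main service handler (assumed shape, following the linked example):
// const worker = await EdgeRuntime.userWorkers.create(buildWorkerOptions(servicePath));
// return await worker.fetch(req);
```

Keeping the limits in one place makes it easy to raise only the CPU limits for heavy functions, for example `buildWorkerOptions(path, { cpuTimeHardLimitMs: 60_000 })`.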

sweatybridge transferred this issue from supabase/cli Nov 9, 2023

nyannyacha mentioned this issue Dec 12, 2023

feat: supervisor needs to track lifetime of requests to perform early termination of worker #232

Closed

nyannyacha mentioned this issue Jan 8, 2024

fix: Identified and fixed a memory leak from JsRuntime #240

Closed
