memory leak related to uv.send_async #505
Open · FelipeLema opened this issue Jul 7, 2020 · 6 comments

FelipeLema commented Jul 7, 2020

I made an ad-hoc line-by-line process output reader and it seems I've found a memory leak.

I wrapped it into a minimal working example below. It is supposed to uv.async_send each line of the output of a given process. I set up a process that prints data endlessly. The output is accumulated in current_chunk, but only temporarily: each time data arrives, I look at the chunk and extract the complete lines, holding back the last one (in case it did not arrive completely).

If you keep the uv.async_send line below, the memory consumption of the Lua process grows endlessly. If you remove it, memory consumption stays flat. In both cases, each line is discarded.

local uv = require("luv") -- "luv" when stand-alone, "uv" in luvi apps (related to luvit)

local stdout = uv.new_pipe()
local handle, pid =
   uv.spawn(
      "yes",
      {stdio = {nil, stdout, nil},
       args = {}
      },
      function(code,signal)
         -- ignore
      end)

local current_chunk  = "" -- accumulated data until a \n
local no_more_output = false
-- process each line
cb = uv.new_async(function(line)
      -- discard line
      if no_more_output then
         cb:close()
      end
end)

uv.read_start(stdout, function(err, data)
   assert(not err, err)
   if data then
      -- append received data into current_chunk
      current_chunk = string.format("%s%s", current_chunk, data)
   else
      no_more_output = true
   end

   -- split chunk into lines to send to cb
   local lines = {}
   for line in string.gmatch(current_chunk, "[^\n]+") do
      table.insert(lines, line)
   end
   if not no_more_output then
      -- pop the last (possibly incomplete) line back into current_chunk for the next iteration
      local last_item_position = #lines
      current_chunk = table.remove(lines, last_item_position)
   end

   -- send the lines to the callback
   for _, l in ipairs(lines) do
      -- this line below creates a memory leak
      uv.async_send(cb, l)
   end

   -- close uv stuff if everything's done
   if no_more_output then
      uv.async_send(cb, nil)
   end
end)
uv.run()

This happens with these Lua versions:

% lua -v
Lua 5.3.3  Copyright (C) 1994-2016 Lua.org, PUC-Rio
% luajit -v
LuaJIT 2.1.0-beta3 -- Copyright (C) 2005-2017 Mike Pall. http://luajit.org/
SinisterRectus mentioned this issue Oct 22, 2020

squeek502 (Member) commented

As far as I can tell, what's happening here is this, from the Libuv docs:

Warning: libuv will coalesce calls to uv_async_send(), that is, not every call to it will yield an execution of the callback. For example: if uv_async_send() is called 5 times in a row before the callback is called, the callback will only be called once. If uv_async_send() is called again after the callback was called, it will be called again.

Luv currently expects the callback to run once for every async_send call in order to free the memory for the data that was sent. Since not every send actually results in a callback, that memory is never freed.
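The coalescing itself can be seen with a standalone snippet (not from your example; the number of callback runs is only guaranteed to be at least one, but the five sends below made before the loop runs coalesce as the docs describe):

local uv = require("luv")

local calls = 0
local async
async = uv.new_async(function()
   calls = calls + 1
   uv.close(async) -- close so the loop can exit after the first wake-up
end)

-- five sends before the loop ever runs...
for _ = 1, 5 do
   uv.async_send(async)
end

uv.run()
print("callback ran " .. calls .. " time(s)") -- ...typically prints 1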

With your example code, a script that prints the numbers 1-100, and a couple prints added:

read_start cb	"0\n"
lines to be sent	{  }
read_start cb	"1\n2\n3\n4\n5\n6\n"
lines to be sent	{ "01", "2", "3", "4", "5" }
read_start cb	"7\n8\n9\n10\n11\n12\n13\n"
lines to be sent	{ "67", "8", "9", "10", "11", "12" }
async cb	{ "12" }
read_start cb	"14\n15\n16\n17\n18\n19\n20\n"
lines to be sent	{ "1314", "15", "16", "17", "18", "19" }
read_start cb	"21\n22\n23\n24\n25\n"
lines to be sent	{ "2021", "22", "23", "24" }
async cb	{ "24" }
...

That's 2 async callbacks called for around 24 async_send calls.

I'm not totally sure what the solution here would be, though.

squeek502 added the bug label Oct 22, 2020

squeek502 commented Oct 22, 2020

This is actually both a memory leak and a bug, as we are dropping data here that we shouldn't be. Minimal test case:

  test("multiple sends coalesce", function(print, p, expect, uv)
    local async
    async = uv.new_async(expect(function(data)
      p(data)
      uv.close(async)
    end))
    uv.async_send(async, 'this string will be lost')
    uv.async_send(async, 'this string will be sent')
  end)

This will only print

"this string will be sent"

meaning the "this string will be lost" string is both leaked and never given to the send callback.

I think we need to treat async data as a FIFO queue that we push onto in luv_async_send and then pop until it's empty in luv_async_cb, calling into Lua with the popped data each time. That is, we need to coalesce the data in the same way that Libuv coalesces calls to uv_async_send.
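For illustration, the same queue-and-drain idea can be sketched as a Lua-side wrapper (a hypothetical helper, not luv's actual C internals, and it assumes the sender shares the Lua state/loop with the receiver):

local uv = require("luv")

-- queued_async: wrap uv.new_async with a FIFO so coalesced wake-ups cannot drop data
local function queued_async(callback)
   local queue = {}
   local async
   async = uv.new_async(function()
      -- pop until empty, calling back once per queued item
      while #queue > 0 do
         callback(table.remove(queue, 1))
      end
   end)
   return {
      send = function(_, data)
         table.insert(queue, data) -- push in FIFO order
         uv.async_send(async)      -- wake-up only; no payload is handed to libuv
      end,
      close = function()
         uv.close(async)
      end,
   }
end

A wrapper like this only helps when send is called from the same Lua state as the loop, which is why the queue really belongs in luv_async_send/luv_async_cb.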

SinisterRectus (Member) commented

Seems like a good fix, but I wonder if any users are somehow relying on the libuv behavior.

squeek502 commented Oct 24, 2020

Libuv doesn't handle any passing of data, so the coalesced callbacks are reasonable there (the only intention of Libuv's async is just to wake up the loop from any thread).

I guess it's possible that some Luv user is calling async_send with the same data multiple times and then expecting only 1 callback call with that data, but it seems fairly unlikely, and, with our current implementation, that will leak the data for every coalesced callback.

It's a little unfortunate that our binding does not match the Libuv code; it'd probably be better if we had split the actual data sending part into a luv-specific function (async_send_data or something).

SinisterRectus commented

Ah okay. A queue makes sense then.

FelipeLema (Author) commented

I don't mind changing my code at all, although I believe anyone using uv.async_send should be covered, either by fixing this coalesced-call memory leak within the luv code or by documenting the leak thoroughly (perhaps with a warning?).
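For reference, here's a sketch of how the example above could be adapted in the meantime (assuming everything stays on the one loop/thread, as it does here): queue the lines in a plain Lua table and use uv.async_send purely as a wake-up, so no payload is handed to async_send and the coalesced-call leak described above shouldn't apply.

local uv = require("luv") -- "luv" when stand-alone, "uv" in luvi apps

local stdout = uv.new_pipe()
local handle, pid =
   uv.spawn(
      "yes",
      {stdio = {nil, stdout, nil},
       args = {}
      },
      function(code, signal)
         -- ignore
      end)

local current_chunk = ""   -- accumulated data until a \n
local no_more_output = false
local pending = {}         -- FIFO of complete lines waiting for the async callback

-- drain the queue; async_send is used purely as a wake-up signal
local cb
cb = uv.new_async(function()
   while #pending > 0 do
      local line = table.remove(pending, 1)
      -- discard line (process it here in real code)
   end
   if no_more_output then
      cb:close()
   end
end)

uv.read_start(stdout, function(err, data)
   assert(not err, err)
   if data then
      current_chunk = current_chunk .. data
   else
      no_more_output = true
   end

   -- split the chunk into lines, holding back the last (possibly incomplete) one
   local lines = {}
   for line in string.gmatch(current_chunk, "[^\n]+") do
      table.insert(lines, line)
   end
   if not no_more_output then
      current_chunk = table.remove(lines) or ""
   end

   -- queue the complete lines and wake the async handle
   for _, l in ipairs(lines) do
      table.insert(pending, l)
   end
   uv.async_send(cb)
end)
uv.run()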
