Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DB connection pool helper #23

Open
dselivanov opened this issue Oct 26, 2018 · 8 comments
Open

Add DB connection pool helper #23

dselivanov opened this issue Oct 26, 2018 · 8 comments
Assignees
Labels
docs low low priority

Comments

@dselivanov
Copy link
Collaborator

dselivanov commented Oct 26, 2018

Needs IPC communication between Rserve child processes. Related to s-u/Rserve#105

@artemklevtsov
Copy link
Collaborator

Simply add note to the docs and close it :)

@dselivanov
Copy link
Collaborator Author

dselivanov commented Aug 7, 2019

There are 2 options:

  1. open and close connection for each request
  2. have a pool of connections initialized at start and re-use them

This issue is about 2

@dselivanov dselivanov added the low low priority label Sep 12, 2019
@long-do
Copy link

long-do commented Aug 20, 2020

Hi @dselivanov,
I share my R script in the issue #25 comment in which I manage a pool connection before I create a new app.
It is important to create a pool connection before creating a new app, if not it could create multiple pools (pb encountered in my case, perhaps my code is not good).
Could you share us how one must manage the pool connection in RestRserve API method? Thank you.

@dselivanov
Copy link
Collaborator Author

dselivanov commented Aug 23, 2020

@long-do I'm think pool pkg won't work correctly as pool object is created in the parent R session. Child R sessions can use connections which are initialized in the parent, however they don't share pool object as it is a copy of the parent pool object on each child process. Which means they can't collaboratively use connections and it is highly probable that several processes might use then same connection. Which will lead to the corruption of the connection.

The easiest way to tackle this is to use RestRserve behind proxy (such as HAproxy) and let proxy manage persistent connections to the RestRserve. This way child session will hold the state till the time there is connection between proxy and RestRserve. In order to re-use DB connections you might need to structure your code in a way that it tries to perform DB query and if it fails you request handler re-initialise it.

library(RestRserve)

con = NULL

reconnect = function() {
  message("reconnecting")
  con <<- create_connection()
  invisible(TRUE)
}

app = Application$new()
app$add_get("/hello", function(req, res) {
  tryCatch(do_something_with_db(req), 
           error = function() {
             # re-init connections
             reconnect()
             # re-do your staff
             do_something_with_db(req)
           })
})

But keep in mind that if your app will create new connection for each request (this is by default for all new clients) then db connection will be open for each requests and it might cause significant overhead.

@long-do
Copy link

long-do commented Aug 26, 2020

Hi @dselivanov,
Thank you for your reply.
However, it is working very well in my case because in my API invoked function I declared
response$body <- foo::bar(input = request$body, pool = pool_conn)
and in my package R foo I use
fetched_conn <- pool::poolCheckout(pool)
so the connection pool is well managed and steady (time to fetch a new connection from the pool is about 0.001 - 0.002s). Note that I authorize 10 connections as min size of the pool.
Of course in case of connection pooling lost I handle a new connection by tryCatch.

@vikram-rawat
Copy link

Just a friendly request. Please add a vignette for working with Databases. Almost all of the R user work with databases.

I used RestRserve for a project and was not able to use pool because it leaked every few seconds. I had to create a new connection every request which had a huge overhead.

with pool -- 15 - 30 ms
with conn -- 50 - 70 ms

This overhead and the fact that I had to use a HaProxy to keep all the connection open throughout allowed my IT team to reject the entire architecture because plumber was performing better and was more scalable without proxy server.

There are many people like me who will not be able to make it work in a production environment. Please write a page for best practises working with RestRserve as well.

I know it is time consuming to write documentation but it's very essential for new users and advanced alike. Even if you don't have a solution for it. It's good to document it. In future when you have one you can update the document.

Just wanted to share my experience with the project. Hope you will understand my point of view.

It's a nice project. Keep up the good work. 👍

@dereckmezquita
Copy link

dereckmezquita commented Oct 28, 2023

I'm doing my research. Has this issue of database connections across threads been solved in RestRserve?

I'm looking at using RestRserve but would need it to allow for a way to do db connections across threads/worker processes without too much overhead.

I like the fact that RestRserve handles each request on a separate fork automatically.

I was looking at using plumber, but I would have to use future or promises to allow for a non-blocking REST API - AND this doesn't allow for sharing database connections across processes.

Also, I don't like the comment style syntax plumber uses, and appreciate RestRserve's more classic look; looks like node express to me :)

I came across valve and faucet (seem to do the same thing; they both build on plumber) - which claim to run plumber multithreaded and solve the database connection issue across threads.

However, if RestRserve allows for pooled connections or a way to do it I would be interested in using RestRserve.

@vikram-rawat and @long-do I think you both seem to know what you're talking about any input is appreciated :)

@dselivanov
Copy link
Collaborator Author

@dereckmezquita It's fairly complex to make database connection pool work with Rserve's fork mechanism. It's recommended to open and close connection each time you process request. Yes, there is an overhead, but again I the effort to even develop "workaround" solution (not covering all edge cases) is not trivial.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs low low priority
Projects
None yet
Development

No branches or pull requests

5 participants