
Best practice #461

Closed
pjebs opened this issue Jun 3, 2016 · 41 comments

@pjebs

pjebs commented Jun 3, 2016

With this driver, is it best practice to open the database with Open("mysql", "...") on every HTTP request and close it with Close() at the end of every request cycle

OR

just open it once and never close it irrespective of the request cycle?

@benguild

benguild commented Jun 3, 2016

It says in the code comment(s):

It is rare to Close a DB, as the DB handle is meant to be long-lived and shared between many goroutines.

... So I would guess no. Worst case, they time out or get closed automatically based on the configuration for the maximum allowed idle connections?

@methane
Member

methane commented Jun 3, 2016

See SetConnMaxLifetime
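
For illustration, a minimal sketch of the long-lived handle this thread recommends, assuming a placeholder DSN and illustrative limits (the specific values are not recommendations from this thread):

package main

import (
    "database/sql"
    "log"
    "time"

    _ "github.com/go-sql-driver/mysql"
)

// One long-lived *sql.DB shared by all goroutines; it manages its own connection pool.
var db *sql.DB

func main() {
    var err error
    // Open only validates the DSN; it does not establish a connection yet.
    db, err = sql.Open("mysql", "user:password@tcp(127.0.0.1:3306)/dbname")
    if err != nil {
        log.Fatal(err)
    }
    // Recycle connections before the server or a proxy silently drops them.
    db.SetConnMaxLifetime(5 * time.Minute)
    db.SetMaxOpenConns(10)
    db.SetMaxIdleConns(10)
}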

@arnehormann
Member

Also, see http://go-database-sql.org/accessing.html (the part after the numbered list)

@pjebs
Author

pjebs commented Jun 19, 2016

@benguild how did you configure SetConnMaxLifetime, SetMaxIdleConns and SetMaxOpenConns?

Any idea where we can get best practice principles on setting them for different load conditions?

@pjebs
Author

pjebs commented Jun 19, 2016

For more than a year, I've used this driver by calling Open() at the start of each request and Close() at the end of the request life cycle. It has worked perfectly.

Upon hearing this advice, I now open one connection, leave the reference in a global variable and keep using it. I never close it.

I haven't set SetConnMaxLifetime, SetMaxIdleConns and SetMaxOpenConns. They are left at their defaults.

This technique simply doesn't work for me. The number of active connections to CloudSQL keeps increasing until it hits the Google App Engine connection limit. After that, I can't connect to the database.

The connections are not getting closed.

@dgryski

dgryski commented Jun 19, 2016

The Go sample for CloudSQL calls Open() only once at the start: https://github.com/GoogleCloudPlatform/golang-samples/blob/master/getting-started/bookshelf/db_mysql.go

I wonder if this issue is specific to CloudSQL and not tied to the driver itself?

Maybe @broady has some insight?

@pjebs
Author

pjebs commented Jun 19, 2016

Well, I'm not closing now. That's when the issues started.

@broady

broady commented Jun 19, 2016

I wrote that sample code with the understanding that the connection would be long lived. I have actually never used SQL from Go in a production app.

Is it disconnecting because of a timeout (in which case we need a periodic keep-alive ping) or because the socket is broken?

Who should handle this? The driver or the user?

@broady

broady commented Jun 19, 2016

Actually, what I said doesn't make any sense. The sql package should handle bad connections in its pool. Have we narrowed this issue down to this driver, or is it a bug in database/sql?

@benguild

benguild commented Jun 19, 2016

@pjebs I'm calling those right after it opens the connection.
To be honest, it seems like if you don't set those parameters (SetMaxOpenConns) it would just make as many connections as it wanted. Have you tried setting SetConnMaxLifetime to like 5 or 10 minutes to see if it closes the connections then?

Try calling these as soon as your connection is open:

db.DB().SetConnMaxLifetime(time.Minute * 5)
db.DB().SetMaxIdleConns(0)
db.DB().SetMaxOpenConns(5)

See if it ever exceeds 5 connections, and whether any of the connections stay idle past 5 minutes. Let me know. I would think it would hang if there weren't any idle connections available in the pool, but I haven't read the code, so it might just fail in that case.

@dgryski

dgryski commented Jun 19, 2016

I wonder if it's a problem with the application code. If the handlers aren't closing the result sets, then the driver will have to open more and more connections because the existing ones are still in use. When the connection was opened and closed for each request, the connections held open by in-progress queries were terminated, so they never piled up.

That would explain why the problem appeared with the switch to the single global persistent connection.
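
For illustration, a minimal sketch of the leak being described, using a hypothetical query: if rows.Close() is never reached, the connection serving that query is never returned to the pool.

// Hypothetical handler snippet: without the deferred Close, every call keeps
// one pool connection checked out, so the pool keeps dialing new ones.
rows, err := db.Query("SELECT id FROM items WHERE owner = ?", owner)
if err != nil {
    return err
}
defer rows.Close() // returns the connection to the pool when the handler exits

for rows.Next() {
    var id int64
    if err := rows.Scan(&id); err != nil {
        return err
    }
}
return rows.Err()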

@benguild

Also, @pjebs, check to make sure (obviously) that you're not somehow creating an additional database pool entirely by accident somewhere in your code. If a new pool is being allocated somewhere, that could slowly cause this issue. If you were calling defer db.Close() before, something like this could go undetected, but it sounds to me like there may be a spot where an entirely new pool is created/opened, left over from your previous single-use style of code.

The open/close-per-request method would work too, as long as your database connection latency isn't too bad and the pool is successfully deallocated afterward. In most settings, though, I think keeping the pool is better, as long as two separate applications aren't (for example) each holding lots of connections open for long periods and blocking the other from allocating some, even when those connections are idle.

@benguild

@dgryski Yeah, I was just thinking that as well (see above). During the switch from single use to a retained pool, there could be a spot where another pool is being created somewhere in the code and never deallocated, or something like that.

@pjebs
Author

pjebs commented Jun 23, 2016

Another issue that is related:

With my previous method, for debugging purposes I used to do this (to avoid a foreign key constraint error):

db.Exec("SET foreign_key_checks = 0;")
_, err := db.Exec("INSERT IGNORE INTO " + c.Q_VALENTINES_PIVOT_CUSTOMER_EVENT + " (customer_id, event_id, gender, allocated_group, session_1, session_2, public_name, profile_photo, registered_time) VALUES " + values + ";")
    if err != nil {
        db.Exec("SET foreign_key_checks = 1;")
        return err
    }
    db.Exec("SET foreign_key_checks = 1;")

and it used to ALWAYS work.

Now that I am using 'connection pooling', this never works.
#208 explains why: there is no guarantee that I'll be using the same connection, and since the SET only applies to the session, it doesn't necessarily apply to the main statement.

I understand that. But how come it worked with my previous method? Although I always created a new connection and closed it at the end of the request cycle, wouldn't the 'connection' I created still have used connection pooling?

@dgryski

dgryski commented Jun 23, 2016

@pjebs the connection pool may open multiple connections, but if you only need a single connection at a time, you'll stay with one. In your old setup you never had two concurrent database requests on the same pool, so only one connection was ever made. Thus, all Execs happened on the same connection. Once you moved to a proper pool, this is no longer the case.

This would be easier to debug if we could see more of your code.

@benguild

benguild commented Jun 23, 2016

@pjebs you need to do that as a database transaction to ensure it hits the same connection: http://jinzhu.me/gorm/advanced.html#transactions

Although I'm unclear on why it'd be doing what you speak of, it's not completely out of line for it to use different connections for each statement's request. That's sort of the idea behind the "pool" ... given that it doesn't necessarily know what is making the request and there could be many requests operating simultaneously. It'd be kind of dumb for a single connection to be reserved for a single request for its entire lifetime, but yeah. Use a transaction.

It would, however, be totally out of line for it to swap connections within a transaction. What you're doing should really be in a transaction anyway, since changing configuration parameters could affect other queries from other parts of your code or other requests. The way it's currently written, if your code panics, that setting would never get changed back on that particular connection! On that note, also be sure to defer rolling back your transaction in case it never finishes (if there's a crash/panic, for example); otherwise it might hang that database connection, or another caller could pick up the connection and make changes inside a transaction that will never be committed. See https://github.com/jinzhu/gorm/issues/1056
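
For reference, a minimal sketch of that transaction approach (the table, columns and variables here are hypothetical): a *sql.Tx is pinned to a single connection, so the SET statements and the INSERT are guaranteed to run in the same session.

tx, err := db.Begin()
if err != nil {
    return err
}
defer tx.Rollback() // safe: effectively a no-op once Commit has succeeded

if _, err := tx.Exec("SET foreign_key_checks = 0"); err != nil {
    return err
}
if _, err := tx.Exec("INSERT IGNORE INTO customer_event (customer_id, event_id) VALUES (?, ?)", customerID, eventID); err != nil {
    return err
}
if _, err := tx.Exec("SET foreign_key_checks = 1"); err != nil {
    return err
}
return tx.Commit()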

@pjebs
Author

pjebs commented Jun 23, 2016

I noticed in @broady 's code: https://github.com/GoogleCloudPlatform/golang-samples/blob/master/getting-started/bookshelf/db_mysql.go#L82
after he opens a connection, he immediately calls Ping() to test if the connection is still responsive.
If it isn't, he just quits. Is there anything else we can do? Should we try to open again?

In the https://godoc.org/github.com/garyburd/redigo/redis#Pool library, they also implement a nice pool.
It has a nice hook called TestOnBorrow; in the example slightly further down the page, they also do a Ping().

Is there something similar in this driver?

@benguild

@pjebs what is the actual current issue?

@pjebs
Author

pjebs commented Jun 23, 2016

Just asking about best practice on when to call Ping() and what to do if Ping fails.

@benguild

Oh, I see. As far as I can tell you shouldn't need to...? But, if there are dead connections still in the pool... it seems necessary.

@pjebs
Author

pjebs commented Jun 23, 2016

You definitely need to Ping. They do it in redigo (above link) and in @broady 's code.

@methane
Member

methane commented Jun 23, 2016

db.Exec("SELECT 1+1") can be used as PING.


@benguild

We "definitely" need to Ping? That makes no sense... what happens if the connection isn't valid? Abort, or retry?

@benguild

Why wouldn't running a query call the equivalent of .Ping automatically?

@benguild

db.Exec("SELECT 1+1") can be used as PING.

OK, so then why do we definitely need to .Ping?

@broady

broady commented Jun 23, 2016

Ping validates that the connection and credentials are valid. I think it's better to fail when establishing the pool rather than on the first request.

I suppose you could retry, but what is the failure condition you're working around? A bad network connection?
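
As a sketch of that fail-fast pattern (the DSN here is a placeholder): Open only validates its arguments, so Ping is what actually dials the server and checks the credentials.

db, err := sql.Open("mysql", "user:password@tcp(127.0.0.1:3306)/dbname")
if err != nil {
    log.Fatal(err) // malformed DSN
}
if err := db.Ping(); err != nil {
    log.Fatalf("database unreachable or credentials rejected: %v", err)
}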

@methane
Member

methane commented Jun 23, 2016

@benguild db.Ping() is handled by database/sql, not by this library.
How about asking on go-nuts or the Go Forum?

@pjebs
Author

pjebs commented Jun 23, 2016

Yes, occasionally one of the returned pooled connections is bad. Currently I call Ping and, if it fails, I explicitly close, then open again and retry the query (not sure if that's what you're meant to do with this library's pooling facilities, given the documentation essentially says NEVER CLOSE).

@benguild

benguild commented Jun 24, 2016

@broady, OK, that makes more sense... I didn't think it would keep dead connections in the pool or hand them out to the caller. @pjebs FYI, I know that we're both using GORM, and it actually does this at the end of gorm.Open():

if err == nil {
    err = db.DB().Ping() // Send a ping to make sure the database connection is alive.
}

@benguild

benguild commented Jun 24, 2016

OK, so considering this particular library attempts .Ping but otherwise does not retry on its own in the event of a failure (it simply returns the error)... might something like this be necessary for a reasonable level of consistency with this driver?

for attempt := 0; attempt < 4; attempt++ {
    err = nil // NOTE: If there is an error, the loop continues and replaces the value of "err" on each attempt. This continues until the maximum number of attempts has been made, at which point the loop simply exits and the error is returned at the end of the function.

    if db, err = gorm.Open( /* database connection string... NOTE: This function attempts `.Ping` on its own. */ ); err == nil && db != nil {
        /* configure connection */
    }

    if err == nil {
        break // Here, if there is no error, it simply breaks out and does not retry again.
    }
}

Not currently sure what is expected if bad connections can intentionally be returned.

@pjebs
Author

pjebs commented Jun 29, 2016

Currently, if Ping fails (usually due to a bad connection), I just loop indefinitely trying to connect again, with a slight time.Sleep() between each attempt. I know that Google App Engine will kill the request after its 60-second request watchdog timer expires, so there is no danger of a real infinite loop.

@benguild

Hmm, I don't know if I'd do that.
The time.Sleep() sounds good, but I would try a maximum of 5 times or so. If the box is legitimately down or overloaded, you could crash your web server or end up with a huge bill for usage if there are a lot of requests hanging for 60 seconds.

@pjebs
Author

pjebs commented Jun 29, 2016

That's what I'm currently doing for testing. It won't be done in production. I'll put a limit on how many attempts it does.
Thankfully, I've never seen Cloud SQL down for more than a minute.

@dgryski

dgryski commented Jun 29, 2016

Google's SRE book suggests a maximum of 3 attempts with randomized exponential backoff, and a global per-process retry budget of 10%. If you start failing more than 10% of your calls, stop retrying.
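
For illustration, a rough sketch of that policy (the package and helper name, the starting backoff, and the omission of the 10% budget bookkeeping are all assumptions, not details from the book):

package dbutil

import (
    "database/sql"
    "math/rand"
    "time"
)

// pingWithRetry attempts at most 3 pings with randomized exponential backoff.
func pingWithRetry(db *sql.DB) error {
    var err error
    backoff := 100 * time.Millisecond
    for attempt := 0; attempt < 3; attempt++ {
        if err = db.Ping(); err == nil {
            return nil
        }
        // Sleep a random duration in [0, backoff), then double the cap.
        time.Sleep(time.Duration(rand.Int63n(int64(backoff))))
        backoff *= 2
    }
    return err // give up once the retry budget is exhausted
}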

@pjebs
Author

pjebs commented Jun 29, 2016

Oh wow. Thanks @dgryski. Where can I find the SRE book?

@pjebs
Author

pjebs commented Jun 29, 2016

https://books.google.com.au/books?id=81UrjwEACAAJ&redir_esc=y
Looks like I have to buy it.

@pjebs
Author

pjebs commented Sep 20, 2016

@dgryski What do you mean by "global per-process retry budget"?

@maiiz

maiiz commented May 16, 2018

@benguild

db.DB().SetConnMaxLifetime(time.Minute * 5)
db.DB().SetMaxIdleConns(0)
db.DB().SetMaxOpenConns(5)

Using this snippet with multiple goroutines, I got the error below:

(dial tcp 127.0.0.1:3306: connect: can't assign requested address)

See https://github.com/jinzhu/gorm/issues/1898

Calling these as soon as the connection is open:

db.DB().SetConnMaxLifetime(time.Minute * 5)
db.DB().SetMaxIdleConns(5)
db.DB().SetMaxOpenConns(5)

solved it.

@amirhj

amirhj commented Dec 5, 2019

For more than a year, I've used this driver by calling Open() at the start of each request and Close() at the end of the request life cycle. It has worked perfectly.

Upon hearing this advice, I now open one connection, leave the reference in a global variable and keep using it. I never close it.

I haven't set SetConnMaxLifetime, SetMaxIdleConns and SetMaxOpenConns. They are left at their defaults.

This technique simply doesn't work for me. The number of active connections to CloudSQL keeps increasing until it hits the Google App Engine connection limit. After that, I can't connect to the database.

The connections are not getting closed.

I have the same problem connecting to ProxySQL. I have set the following configuration for a production node with high load:

max_idle: 1
max_open: 100
max_life_time: 5s
write_timeout: 5s
read_timeout: 3s
connection_timeout: 10s

But when I check the number of connections on the DB port, it shows more than 21k connections in the TIME_WAIT state.

@yedamao

yedamao commented Jan 16, 2024


I have the same problem connecting to ProxySQL. I have set the following configuration for a production node with high load:

max_idle: 1
max_open: 100
max_life_time: 5s
write_timeout: 5s
read_timeout: 3s
connection_timeout: 10s

But when I check the number of connections on the DB port, it shows more than 21k connections in the TIME_WAIT state.

I encountered the same problem: more than 20k connections in the TIME_WAIT state. The root cause is that MaxIdleConns is less than MaxOpenConns.
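
For illustration, a minimal sketch of that fix (the limits are placeholders): keeping MaxIdleConns in line with MaxOpenConns lets connections be reused instead of being closed and redialed, which is what piles up TIME_WAIT sockets.

db.SetMaxOpenConns(100)
db.SetMaxIdleConns(100) // match MaxOpenConns so returned connections stay in the idle pool
db.SetConnMaxLifetime(5 * time.Minute)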
