feat: clear session pool on connection close #543

IlyaFaer · 2020-10-23T09:01:26Z

To avoid keeping Cloud Spanner sessions alive after connection close we should clear the related pool on the connection close.

google/cloud/spanner_dbapi/connection.py

c24t

I agree with @olavloite in https://github.com/googleapis/python-spanner-django/pull/543/files#r510984476 that this shouldn't clear the session pool if it doesn't own it.

FWIW in python-spanner Database doesn't expose the session pool or give users an option to clear it (unless they created the pool themselves). Like #535 (comment) this might be an argument for changes to python-spanner instead of this library.

IlyaFaer · 2020-10-27T08:25:21Z

I still don't see how one can use single pool for several databases. Pool is binded to a concrete database, if we'll use the same pool for connections to different databases, it's not gonna be correct. Thus, for me it looks like users should not use global pools at all, even for several connections to the same database, as we're not very able to control whether the pool is used only for single database.

olavloite · 2020-10-27T09:25:25Z

I still don't see how one can use single pool for several databases. Pool is binded to a concrete database, if we'll use the same pool for connections to different databases, it's not gonna be correct. Thus, for me it looks like users should not use global pools at all, even for several connections to the same database, as we're not very able to control whether the pool is used only for single database.

Yes, that is correct, but I'm not sure that is what @c24t meant. So to summarize:

A session pool is as @IlyaFaer noted always linked with one database. It is not possible to have a single session pool for multiple different databases.
The current implementation of the DB API requires the user to supply a database instance when a connection is created. That means that the connection is not the owner of the database instance, and the connection should therefore not close or clear the session pool when the connection is closed. That is an action that the client application should take. We should add this to the documentation so it's clear for users what we expect from them.
Users should not create a new database instance for each connection they want to open. Instead, connections for the same database should use the same database instance. This will effectively allow multiple connections to share the same session pool. This should also be documented.
Assuming that we will be supporting SQLAlchemy in the future: SQLAlchemy expects users to create connections by specifying a connection string in a very similar way as connections are created in for example JDBC or ADO.NET. Once we support that, we will need to implement some kind of 'database pool' to ensure that when a user creates multiple connections to the same database, we re-use the database instance that we have already created for a previous connection for the same database.

My comment on using something like a global session pool was aimed at that last point from the list above, where 'global' means 'global per database'. So if a user creates multiple connections to different databases, we also need multiple session pools (one for each unique database).

The current solution of requiring the user to supply a database instance when creating a connection is as far as I can tell fine for the current use case (Django support).

IlyaFaer · 2020-10-30T11:55:24Z

@olavloite, our main way to create a connection is actually connect function, and it works with string ids of instance and database. However, one can send a session pool into this function. If user not setting pool, it'll be created by the original Spanner Database object here.

Thus, in fact user can use global pools, but looking at how Database works with the given pool, I'm starting to think that the original Spanner client doesn't consider using global pools - it'll bind the given pool to the database without checking if this pool is already binded to another database. I mean, in the original client user can propagate one pool to several different databases, it'll be incorrect, but we don't think about it (it looks like we don't). So, for me, it seems like Database intended to use its own pool. Still, one can pass the concrete pool to set the right type of a pool - not to implement one global pool for several Database objects. PTAL at the docs - they also describe only different types of pools, not practices about using global ones. That's why I consider closing pool by the connection as a right way of things. WDYT?

As the original client doesn't care about closing a binded pool of the database, I don't think this PR is very major. If we don't have a good and simple way to close the pool, I'm okay to just close the PR.

olavloite · 2020-10-30T12:53:59Z

@IlyaFaer I would say that it would normally be a good practice to let the owner of the pool decide when a pool is closed. So that would mean:

If the client application passes in a pool to a database or a connection, then the client application is the owner of the pool and should close it when appropriate. If we were to let the connection close such a session pool, we might cause unexpected errors if the same session pool is used for multiple connections.
If the client application does not pass in a pool to a database or a connection, then the database/connection creates its own session pool. That session pool is owned by the database/connection, and should be closed by the database/connection whenever the database/connection is closed. The drawback of this solution for connections is that a connection will never really need more than one session, so it's quite an overkill to create a separate session pool for each connection.

A global session pool that can be used for different backend databases is as you say not possible, as a session pool is always bound to a specific database on the backend. What I meant with a global session pool is a global session pool per database. That would basically mean a pool of session pools with the database-id as the key for getting a session pool from this 'session pool pool'. That is something that would be preferable in the end for the SQLAlchemy implementation.

I had a closer look at the current database and sessionpool implementations in the Spanner Python client library, and they do a lot less than I had expected... For now I would say that we should do the following:

When a user creates a connection and thereby specifies a pool that should be used (i.e. the client application passes in an actual session pool): When closing the connection we should do nothing.
When a user creates a connection without a specific pool: Create a new session pool and use that to create the database instance. When the connection is closed, we should also close the session pool that was created when the connection was created.

IlyaFaer · 2020-11-02T08:10:31Z

@olavloite, the thing which I don't like in our case is that we're initiating Database() object by connect() function, and then we're passing it into Connection() object. So, Connection() object getting Database() object with initiated pool, and it doesn't know how this database got the pool - created its own or used the one passed into constructor. We could expose Connection() object constructor argument or a class property to let Connection() object know about pool, but it looks not very beautiful for me.

What if we'll expose a clear_pool arg in Connection.close() method, which will be True by default? If user wants to use global pool and control its close by himself, he'll have to call close like this: connection.close(clear_pool=False).

olavloite · 2020-11-02T11:15:54Z

@olavloite, the thing which I don't like in our case is that we're initiating Database() object by connect() function, and then we're passing it into Connection() object. So, Connection() object getting Database() object with initiated pool, and it doesn't know how this database got the pool - created its own or used the one passed into constructor. We could expose Connection() object constructor argument or a class property to let Connection() object know about pool, but it looks not very beautiful for me.

What if we'll expose a clear_pool arg in Connection.close() method, which will be True by default? If user wants to use global pool and control its close by himself, he'll have to call close like this: connection.close(clear_pool=False).

I don't think we should add a parameter to the Connection.close() function for that. I don't think it would be clear for most users what that would mean. I would prefer a solution where the decision is automatically made by the Connection based on what was passed in to the connect() method. Can't we add a private field to the Connection class that indicates whether it owns the pool or not, and use that to determine whether to automatically close the pool when the connection is closed? That field can be set by the connect() method when the connection is created.

IlyaFaer · 2020-11-03T08:36:56Z

Can't we add a private field to the Connection class that indicates whether it owns the pool or not, and use that to determine whether to automatically close the pool when the connection is closed? That field can be set by the connect() method when the connection is created.

Of course we can. Can't say I like it, but seems like there is no ideal solution in here, so I'm pushing a protected property.

IlyaFaer · 2020-11-03T08:38:36Z

tests/unit/spanner_dbapi/test_connection.py

-        from google.cloud.spanner_dbapi import Connection
-
-        connection = Connection(self.INSTANCE, self.DATABASE)
+        connection = self._make_connection()


This test has an error: self.DATABASE is actually a string, not a Database object, so passing it into Connection object constructor will cause errors.

olavloite · 2020-11-04T10:44:22Z

Can't we add a private field to the Connection class that indicates whether it owns the pool or not, and use that to determine whether to automatically close the pool when the connection is closed? That field can be set by the connect() method when the connection is created.

Of course we can. Can't say I like it, but seems like there is no ideal solution in here, so I'm pushing a protected property.

Thank you. I think we should look some more into this later if we start adding support for SqlAlchemy. From what I understand from the documentation on that, a connection is created by SqlAlchemy itself based on a connection string, which means that we probably need to do some more management of databases and session pools.

olavloite

The change itself looks good to me.

Can we do anything with the build failures, or are those related to the general Kokoro / test problems?

IlyaFaer · 2020-11-04T11:58:38Z

Can we do anything with the build failures, or are those related to the general Kokoro / test problems?

Yes, system tests on Kokoro are broken because of parallelization. I'm checking all other sessions, but system tests for now only can be run succesfully on emulator (separate GitHub Workflow check).

IlyaFaer · 2020-11-04T11:59:37Z

@c24t, we've processed this PR. Do you have any final comments?

feat: clear session pool on connection close

8351df9

IlyaFaer added type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. api: spanner Issues related to the googleapis/python-spanner-django API. labels Oct 23, 2020

google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Oct 23, 2020

IlyaFaer requested a review from c24t October 23, 2020 09:02

IlyaFaer marked this pull request as ready for review October 23, 2020 09:03

IlyaFaer requested a review from a team as a code owner October 23, 2020 09:03

olavloite reviewed Oct 23, 2020

View reviewed changes

google/cloud/spanner_dbapi/connection.py Outdated Show resolved Hide resolved

olavloite requested changes Oct 23, 2020

View reviewed changes

google/cloud/spanner_dbapi/connection.py Outdated Show resolved Hide resolved

c24t reviewed Oct 26, 2020

View reviewed changes

Merge branch 'master' into clear_pool

ae5522d

use protected property

9b1ca23

IlyaFaer commented Nov 3, 2020

View reviewed changes

IlyaFaer added 2 commits November 3, 2020 11:45

fix tests for Python 3.5

05d0fd5

fix tests for Python 3.5

8e9180e

olavloite approved these changes Nov 4, 2020

View reviewed changes

c24t approved these changes Nov 4, 2020

View reviewed changes

c24t merged commit 14e4cac into googleapis:master Nov 4, 2020

IlyaFaer deleted the clear_pool branch November 5, 2020 07:38

release-please bot mentioned this pull request Apr 17, 2024

chore(main): release 3.0.0 #869

Closed

This was referenced Apr 18, 2024

chore(main): release 3.0.0 #875

Closed

chore(main): release 3.0.0 #877

Closed

chore(main): release 3.1.0 #878

Closed

chore(main): release 3.1.0 #881

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: clear session pool on connection close #543

feat: clear session pool on connection close #543

IlyaFaer commented Oct 23, 2020 •

edited

c24t left a comment

IlyaFaer commented Oct 27, 2020

olavloite commented Oct 27, 2020 •

edited

IlyaFaer commented Oct 30, 2020 •

edited

olavloite commented Oct 30, 2020

IlyaFaer commented Nov 2, 2020

olavloite commented Nov 2, 2020

IlyaFaer commented Nov 3, 2020

IlyaFaer Nov 3, 2020

olavloite commented Nov 4, 2020

olavloite left a comment

IlyaFaer commented Nov 4, 2020

IlyaFaer commented Nov 4, 2020

feat: clear session pool on connection close #543

feat: clear session pool on connection close #543

Conversation

IlyaFaer commented Oct 23, 2020 • edited

c24t left a comment

Choose a reason for hiding this comment

IlyaFaer commented Oct 27, 2020

olavloite commented Oct 27, 2020 • edited

IlyaFaer commented Oct 30, 2020 • edited

olavloite commented Oct 30, 2020

IlyaFaer commented Nov 2, 2020

olavloite commented Nov 2, 2020

IlyaFaer commented Nov 3, 2020

IlyaFaer Nov 3, 2020

Choose a reason for hiding this comment

olavloite commented Nov 4, 2020

olavloite left a comment

Choose a reason for hiding this comment

IlyaFaer commented Nov 4, 2020

IlyaFaer commented Nov 4, 2020

IlyaFaer commented Oct 23, 2020 •

edited

olavloite commented Oct 27, 2020 •

edited

IlyaFaer commented Oct 30, 2020 •

edited