spanner: Session Pool Configuration #1683

Closed
LiHaoTan opened this issue Nov 26, 2019 · 9 comments
Labels
api: spanner Issues related to the Spanner API. type: question Request for information or clarification. Not an issue.

Comments

@LiHaoTan

LiHaoTan commented Nov 26, 2019

I am trying to understand when it is required to manually configure the session pool.

I read the session docs and it appears to say that only creators of client libraries need to manage sessions manually. However, one way of interpreting that means that I should still set the session pool config appropriately, and the client library will manage the sessions according to what I want to achieve.

In the recent changelog, I see that spanner.NewClient is created with a minimum of 100 opened sessions by default now.

Am I right to say that in many cases the default config will work well, or do I need to use my own config for better performance?

I am also not sure but it seems I used to see Spanner documentation on how to configure the session pool configuration, for example MaxOpened. I can't seem to find that anymore, and the godoc doesn't appear to explain that.

@olavloite olavloite added api: spanner Issues related to the Spanner API. type: question Request for information or clarification. Not an issue. labels Nov 26, 2019
@olavloite
Contributor

Hi @LiHaoTan

You normally do not have to manually configure the session pool. In most cases the default settings will work well, and you can safely use the spanner.NewClient method.

If you do need to supply a custom session pool configuration, or other Spanner client configuration, you should create your client using the spanner.NewClientWithConfig method (see https://godoc.org/cloud.google.com/go/spanner#NewClientWithConfig). Supplying a custom value for MaxOpened, for example, is done like this:

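// Types are qualified with the spanner package (cloud.google.com/go/spanner).
config := spanner.ClientConfig{
	SessionPoolConfig: spanner.SessionPoolConfig{
		// Upper limit on the number of sessions the pool may open.
		MaxOpened: 1000,
	},
}
// Fully qualified database name; replace the bracketed placeholders.
formattedDatabase := fmt.Sprintf("projects/%s/instances/%s/databases/%s", "[PROJECT]", "[INSTANCE]", "[DATABASE]")
client, err := spanner.NewClientWithConfig(ctx, formattedDatabase, config)
if err != nil {
	// Handle the error; client is not usable here.
}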

@LiHaoTan
Author

Thank you for your reply.

Also sorry for my unclear phrasing but with regards to configuring the session pool I was actually asking how we should tune the session pool.

For instance (a completely artificial example), assuming I am going to do 2,000 concurrent reads a second and I am running 10 Kubernetes pods, should I set MinOpened to 200? How many HealthCheckWorkers do I need? And how fast is BatchCreateSessions, and given that, what should I set MaxBurst to?

I understand that I should profile my application to figure things out but just wondering if there are some general guidelines.

@olavloite
Contributor

Thanks for the clarification. As always with these things, the answer depends on the circumstances.

The most important general rule of thumb is: Set MinOpened to at least the number of concurrent transactions that your client will be executing (and you should round up and not down). If you are able to estimate a good value for that, the default values for the other session pool settings are good.
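As a rough illustration of this rule of thumb, the per-instance value can be derived by dividing the expected number of concurrent transactions by the number of application instances and rounding up. The helper name minOpenedPerInstance is hypothetical and not part of the Spanner client library; the figures come from the example in the question.

```go
package main

import "fmt"

// minOpenedPerInstance is a hypothetical helper, not part of the Spanner
// client library. It spreads the expected number of concurrent transactions
// evenly over the application instances and rounds up, never down.
func minOpenedPerInstance(concurrentTx, instances uint64) uint64 {
	if instances == 0 {
		// Treat "no instance count given" as a single instance.
		return concurrentTx
	}
	// Integer ceiling division.
	return (concurrentTx + instances - 1) / instances
}

func main() {
	fmt.Println(minOpenedPerInstance(2000, 10)) // prints 200
	fmt.Println(minOpenedPerInstance(2001, 10)) // prints 201 (rounded up)
}
```

This assumes the load is spread evenly across instances; if it is not, the busiest instance is the one to size for.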

Furthermore, the following applies:

  1. The session pool should contain at least as many sessions as the number of transactions your application will execute concurrently, as one session can have at most one active transaction at a time. If, as in your example, you do 2,000 concurrent reads a second, each of these reads has its own transaction, and you are running your application on 10 Kubernetes pods, then setting MinOpened=200 is a good choice.
  2. MaxBurst controls the maximum number of sessions that should be created concurrently on demand. If your application normally only needs 100 sessions, but there might be sudden bursts that increase that requirement to 300 sessions, you could set MinOpened=100 and MaxBurst=200. This will ensure that the Spanner client will create up to 200 sessions concurrently if the application is requesting more sessions than are in the pool at that moment. You should only do this if these bursts are uncommon. If it is normal that your application sometimes needs 300 sessions, it is better to set MinOpened=300.
  3. MaxIdle controls the number of sessions that the session pool will keep in the pool, even though they are considered 'idle' by the pool. It's best explained using an example:
    • The configuration is MinOpened=100, MaxOpened=400, MaxIdle=10.
    • A burst of user requests causes your application to execute a higher number of transactions than normal, requiring the session pool to create additional sessions. After this burst has passed, the session pool contains 150 sessions.
    • The application usage falls back to its normal level and the application does not need more than 100 sessions for 10 minutes or longer.
    • The session pool maintainer will see that the session pool contains more than MinOpened sessions, but that these sessions are not needed. It will start to delete sessions until the session pool contains MinOpened+MaxIdle sessions. In this example the session pool maintainer will reduce the number of sessions in the pool to 110.
  4. The default values for HealthCheckWorkers and HealthCheckInterval are good for virtually all circumstances and there's no general rule of thumb when these should be changed.
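The MaxIdle example in point 3 can be restated as a small calculation. This is only a simplified model of the described behaviour, not the client library's implementation, and targetAfterIdleShrink is a hypothetical name.

```go
package main

import "fmt"

// targetAfterIdleShrink is a simplified, hypothetical model of the behaviour
// described in point 3; it is not taken from the client library. It returns
// the pool size the maintainer would shrink to once the extra sessions have
// not been needed for a while.
func targetAfterIdleShrink(current, minOpened, maxIdle uint64) uint64 {
	limit := minOpened + maxIdle
	if current > limit {
		return limit
	}
	// The pool is already at or below MinOpened+MaxIdle: nothing to delete.
	return current
}

func main() {
	// MinOpened=100, MaxIdle=10, pool grew to 150 during a burst.
	fmt.Println(targetAfterIdleShrink(150, 100, 10)) // prints 110
	// A pool that is already below the limit is left alone.
	fmt.Println(targetAfterIdleShrink(105, 100, 10)) // prints 105
}
```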

BatchCreateSessions will only be used to initialize the MinOpened sessions in the pool. This RPC executes in roughly the same time as a single CreateSession RPC, meaning that the total initialization time of a session pool with 100+ sessions can normally be measured in some hundreds of milliseconds (also depending on the network latency between your client and the server).

@LiHaoTan
Author

Thank you so much for your explanation!

@olavloite
Contributor

Closing this issue as hopefully the above note has provided the information you needed. Please feel free to reopen if something is not clear.

@apstndb
Contributor

apstndb commented May 19, 2021

What is discussed in this issue is the meaning of the settings as of v1.1.0.
I would like to confirm the current state.

@olavloite
Contributor

See inline replies.

What is discussed in this issue is the meaning of the settings as of v1.1.0.
I would like to confirm the current state.

  • The meaning of MinOpened is not changed and it is the most important setting.

    • In my experience, clients such as CLIs or batch processes, which are short-lived and use few concurrent sessions, are not well served by the default value (100).

The meaning of MinOpened has not changed and is certainly one of the most important settings. For most cases, this value can be kept at the default value, or should be increased if your application is expected to execute a large number of queries / transactions in parallel. A smaller value can also be a good choice if your application is short lived, as creating 100 sessions, executing one or only a few queries, and then deleting 100 sessions is inefficient. If your application is long-lived, but never does any parallel queries and thereby never really needs more than 1 session, the default value is also OK, as the overhead of creating and deleting sessions in comparison with the total application lifetime is relatively low. But also in the last case it can make sense to decrease it to a lower value.
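For a short-lived tool, a configuration along these lines might be a starting point. It is a sketch rather than a recommendation: the value MinOpened=1 assumes a tool that runs one query at a time, and the database path uses placeholders.

```go
package main

import (
	"context"
	"log"

	"cloud.google.com/go/spanner"
)

func main() {
	ctx := context.Background()

	// Start from the library defaults and lower only MinOpened.
	poolCfg := spanner.DefaultSessionPoolConfig
	poolCfg.MinOpened = 1 // assumption: the tool never runs queries in parallel

	// Placeholder database path; replace the bracketed parts.
	db := "projects/[PROJECT]/instances/[INSTANCE]/databases/[DATABASE]"

	client, err := spanner.NewClientWithConfig(ctx, db, spanner.ClientConfig{
		SessionPoolConfig: poolCfg,
	})
	if err != nil {
		log.Fatalf("creating Spanner client: %v", err)
	}
	defer client.Close()

	// Run the one or few queries the tool needs here.
}
```

Copying spanner.DefaultSessionPoolConfig and changing a single field keeps the other settings at their defaults.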

  • The meaning of MaxOpened is not changed, and some clients need to change it because the default value (400) is too big or too small for them.

The meaning of MaxOpened has not changed. There are not many cases where the default value is too big, as this value has no impact on an application that uses fewer than MaxOpened sessions. The only scenario where it could make sense to lower it is if you suspect that your application has a session leak and you want to track it down more quickly. Lowering MaxOpened in combination with setting TrackSessionHandles to true ensures that the session pool is exhausted more quickly, at which point it returns a stack dump of the sessions that are checked out at that moment.
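A leak-hunting configuration based on that description might look like the sketch below. The specific numbers are arbitrary assumptions chosen to exhaust the pool quickly, and the database path uses placeholders.

```go
package main

import (
	"context"
	"log"

	"cloud.google.com/go/spanner"
)

func main() {
	ctx := context.Background()

	poolCfg := spanner.DefaultSessionPoolConfig
	poolCfg.MaxOpened = 20             // arbitrary low limit, for debugging only
	poolCfg.MinOpened = 5              // kept at or below MaxOpened
	poolCfg.TrackSessionHandles = true // record where sessions are checked out

	// Placeholder database path; replace the bracketed parts.
	db := "projects/[PROJECT]/instances/[INSTANCE]/databases/[DATABASE]"

	client, err := spanner.NewClientWithConfig(ctx, db, spanner.ClientConfig{
		SessionPoolConfig: poolCfg,
	})
	if err != nil {
		log.Fatalf("creating Spanner client: %v", err)
	}
	defer client.Close()

	// Exercise the code path suspected of leaking sessions here.
}
```

MinOpened is lowered together with MaxOpened because the default minimum of 100 would be larger than the reduced maximum.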

Correct. This setting does not have any function anymore, and is only kept around to prevent compilation failures in existing applications.

  • The health check has changed and there is no need to change HealthCheckInterval and HealthCheckWorkers.

Correct.

Correct, and good spot on the missing change to the godoc. I'll update that.

Correct. The change from GetSession to SELECT 1 should be considered an internal implementation detail, and was mainly done to be consistent with the client libraries for other languages.

Correct, the meaning and behavior of this option has not changed. It is a setting that is normally not needed, and in most cases it's better to just use a higher MinOpened value.

  • The meaning of WriteSessions is not changed and it is still meaningful in some situations.

    • The client library maintains a write pool and currently does not use inline begin transaction as java-spanner does (googleapis/java-spanner#325).

    • If the ratio of ReadOnly to ReadWrite transactions is heavily skewed, it makes sense to set it.

      • In particular, if there are only ReadOnly transactions, it is better to set it to 0.0.

        • In this case, the principal may not have spanner.databases.beginOrRollbackReadWriteTransaction permission.

All the above is correct. This setting is still used in the Go client library to maintain a number of write-prepared sessions in the pool, and it can be useful to tweak this value if your application has a significantly different read/write ratio than reflected in this setting.

@apstndb
Contributor

apstndb commented May 20, 2021

Thanks for answering!

@apstndb
Contributor

apstndb commented Jan 23, 2023

FYI for readers: v1.43.0(#7149) has introduced the inline begin transaction for ReadWriteTransactions and WriteSessions is now deprecated.
