Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SqlReader Retry Logic #233

Open
sharpjs opened this issue Jun 3, 2018 · 4 comments
Open

SqlReader Retry Logic #233

sharpjs opened this issue Jun 3, 2018 · 4 comments

Comments

@sharpjs
Copy link

sharpjs commented Jun 3, 2018

When using a PaaS database product like Azure SQL Database, occasional transient errors occur more frequently than with a traditional on-premises server. The usual solution is to add retry logic.

Good sources on what errors are retryable are in here and here. If you would like a ready-made SqlConnection wrapper that implements retries, I can probably arrange to get you the one my client uses — I wrote it. DM me on twitter @sharpjs to discuss further.

Here is an example of one such error that came through my client's monitoring today:

System.Data.SqlClient.SqlException: Database '*****' on server '*****.database.windows.net' is not currently available. Please retry the connection later. If the problem persists, contact customer support, and provide them the session tracing ID of '*****'.
   at System.Data.SqlClient.SqlInternalConnectionTds..ctor (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.SqlConnectionFactory.CreateConnection (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.ProviderBase.DbConnectionFactory.CreatePooledConnection (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.ProviderBase.DbConnectionPool.CreateObject (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.ProviderBase.DbConnectionPool.UserCreateRequest (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.ProviderBase.DbConnectionPool.TryGetConnection (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.ProviderBase.DbConnectionPool.TryGetConnection (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.SqlConnection.TryOpenInner (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.SqlConnection.TryOpen (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Data.SqlClient.SqlConnection.Open (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at ImageResizer.Plugins.SqlReader.SqlReaderPlugin.RowExists (ImageResizer.Plugins.SqlReader, Version=3.4.3.103, Culture=neutral, PublicKeyToken=null)
   at ImageResizer.Configuration.PipelineConfig.GetFile (ImageResizer, Version=3.4.3.103, Culture=neutral, PublicKeyToken=null)
   at ImageResizer.InterceptModule.CheckRequest_PostAuthorizeRequest (ImageResizer, Version=3.4.3.103, Culture=neutral, PublicKeyToken=null)
   at System.Web.HttpApplication+SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute (System.Web, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a)
   at System.Web.HttpApplication.ExecuteStepImpl (System.Web, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a)
   at System.Web.HttpApplication.ExecuteStep (System.Web, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a)
@lilith
Copy link
Member

lilith commented Jun 5, 2018

Are you sure that is desirable?

Retries can take down a server, and lengthen transient failures.

For example, imagine a server temporarily stops responding. Without retries, a few images fail to appear, but no additional load is placed on the server except end-users hitting refresh.

With retries, a backlog of retries is added to the request queue, which makes it more likely that the server will continue failing requests rather than make progress on shortening the queue and reducing the response time.

With subscription quotas, this problem gets even worse.

There's not that much fetch code: https://github.com/imazen/resizer/blob/develop/Plugins/SqlReader/SqlReader.cs#L145-L218

It seems like it would be trivial to implement. But I think I would suggest such a feature be off by default.

@sharpjs
Copy link
Author

sharpjs commented Jul 1, 2018

Yes, retry of transient errors is desirable and recommended for Azure SQL Database.

As this request specifically targets documented[1] transient errors, which typically resolve quickly, impact upon services need not be severe. Impact can be lessened further by exponential backoff, and even further by a cooperative retry policy — stop retrying when a heuristic detects that it isn't helping.

Off-by-default is a safe choice. Some configurability of the retry policy would be nice as well:

  • Initial wait before retry
  • Maximum wait before retry
  • Maximum number of retries before responding with failure
  • Maximum number of failure responses before abandoning retries

[1] See code links in original post, and also this MS article.

@lilith
Copy link
Member

lilith commented Jul 2, 2018

Okay. Keep in mind browsers have their own request timeouts.

I'd be open to a pull request for this if the logic is clear and has tests.

@sharpjs
Copy link
Author

sharpjs commented Jul 9, 2018

I might be able to put together a PR, though not quickly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants