
connection retry refactor, add defaults with exponential backoff #207

Merged: 5 commits into main, Aug 30, 2022

Conversation

@nathaniel-may (Contributor) commented Jul 20, 2022

resolves #205

  • Refactors retry logic to use retry_connection from core.
  • Preserves existing behavior.
  • Adds defaults consistent with other adapters:
    • if connect_retries and connect_timeout are not specified: 1 retry after 1s
    • if connect_retries is specified but connect_timeout is not, it will use exponential backoff

This is a non-trivial change so I'll wait for sign off from @jtcohen6 before bothering to add the tests.

TODO

  • add tests similar to the postgres PR <- those test the library function in core, so we're covered here too.
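The defaulting rules in the summary above can be sketched roughly as follows. This is a hypothetical helper for illustration only; the real logic lives in dbt-core's `retry_connection` and may differ in names and shape:

```python
def pick_retry_strategy(connect_retries=None, connect_timeout=None):
    """Hypothetical sketch of the defaulting rules described in this PR.

    - neither option set: 1 retry after a fixed 1s wait
    - connect_retries set but connect_timeout not set: exponential backoff
    - connect_timeout set: fixed waits (backwards-compatible behavior)
    """
    if connect_retries is None and connect_timeout is None:
        return {"retries": 1, "backoff": "fixed", "wait": 1}
    if connect_timeout is None:
        return {"retries": connect_retries, "backoff": "exponential"}
    return {"retries": connect_retries or 1, "backoff": "fixed", "wait": connect_timeout}
```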

@cla-bot cla-bot bot added the cla:yes label Jul 20, 2022
@nathaniel-may nathaniel-may self-assigned this Jul 20, 2022
@nathaniel-may nathaniel-may marked this pull request as draft July 20, 2022 18:42
@McKnight-42 McKnight-42 self-requested a review July 20, 2022 21:37
@@ -59,8 +59,8 @@ class SnowflakeCredentials(Credentials):

```diff
 proxy_host: Optional[str] = None
 proxy_port: Optional[int] = None
 protocol: Optional[str] = None
-connect_retries: int = 0
-connect_timeout: int = 10
+connect_retries: int = 1
```
Contributor

👍

Also discussed: using connect_retries as the standard name everywhere (including Postgres/Redshift)

Review thread on dbt/adapters/snowflake/connections.py (outdated; resolved)
Comment on lines 293 to 294

```python
elif creds.retry_on_database_errors:
    retryable_exceptions = [snowflake.connector.errors.DatabaseError]
```
Contributor

IMO let's use retry_on_database_errors: True as our default. Users can optionally turn it off, or they can opt into all errors via retry_all.

(@nathaniel-may may also be inspired to dig further into Snowflake docs)

Contributor Author

I found more sensible defaults by looking at the connector source.

@nathaniel-may nathaniel-may marked this pull request as ready for review August 4, 2022 19:08
github-actions bot commented Aug 4, 2022

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the dbt-core contributing guide and the dbt-snowflake contributing guide.

@jeremyyeo (Contributor) commented Aug 5, 2022

Hey @nathaniel-may, whilst you folks are working on this one... does this sync up dbt-snowflake's connect_timeout config with the likes of the redshift/postgres adapters' connect_timeout?

We recently put in a default of connect_timeout: 30 in dbt Cloud's snowflake profile (https://dbtlabs.atlassian.net/browse/SHP-588) since this config is unavailable to end users. When looking at some of the logs once this was deployed, we realized that it doesn't behave the same way as the other adapters' config (and then I double-checked the docs¹, which do clearly state the behaviour here). As someone familiar with the other adapters' connect_timeout config, we automatically assumed it would behave exactly the same way.

This config for Snowflake's behaviour is:

If a connection terminates, wait connect_timeout seconds before retrying the connection.

As opposed to Redshift/PG's:

While connecting, wait connect_timeout seconds before dropping the connection.

Footnotes

  1. https://docs.getdbt.com/reference/warehouse-profiles/snowflake-profile#retry_on_database_errors
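The two semantics described above can be contrasted with a rough sketch. Both functions and the `connect` callables are hypothetical stand-ins for illustration, not the adapters' actual implementations:

```python
import time

def snowflake_style_connect(connect, retries, connect_timeout):
    # Pre-refactor Snowflake semantics: connect_timeout is the wait
    # BETWEEN attempts after a connection fails.
    for attempt in range(retries + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == retries:
                raise
            time.sleep(connect_timeout)

def postgres_style_connect(connect, connect_timeout):
    # Postgres/Redshift semantics: connect_timeout caps how long a
    # single connection attempt may take; no retry is implied.
    return connect(timeout=connect_timeout)
```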

@nathaniel-may (Contributor, Author) commented Aug 5, 2022

Hey @jeremyyeo! This is part of a larger effort to keep timeout config behavior to roughly the same standard:

The goal:

  • default to one retry after 1s for retryable errors
  • if the user configures a number of retries, use exponential backoff for retryable errors

Exceptions:

  • Snowflake already had additional retry logic, so this implementation is backwards compatible for users who have it configured. To be specific: if the user has connect_timeout configured, retries will use that instead of exponential backoff.
  • Postgres and Redshift will attempt to retry some non-retryable errors as well, due to a psycopg2 limitation.
  • Spark retries will remain unchanged and continue to retry ALL errors without exponential backoff. This is due to a Spark driver limitation that doesn't let us easily reuse the retry library function in core.
  • BQ will remain unconfigurable. It will use its own exponential backoff strategy for only retryable errors, as implemented in the GCP @Retry decorator. The behavior is roughly equivalent to configuring connect_retries = 6 on our other adapters, because the default for BQ is to stop retrying after a total of 120s, though it will actually retry up to 10 times.

@jtcohen6 (Contributor) left a comment:

LGTM! Excited to get this merged. Sorry for the delay with the final review.

Comment on lines +312 to +316

```python
# these two options are for backwards compatibility
if creds.retry_all:
    retryable_exceptions = [Error]
elif creds.retry_on_database_errors:
    retryable_exceptions.insert(0, DatabaseError)
```
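The snippet above can be read as building up the list of exceptions the retry loop treats as retryable. A minimal self-contained sketch, where `Error` and `DatabaseError` are stand-in classes for the `snowflake.connector.errors` types and `defaults` is a hypothetical starting list:

```python
class Error(Exception):
    """Stand-in for snowflake.connector.errors.Error (base class)."""

class DatabaseError(Error):
    """Stand-in for snowflake.connector.errors.DatabaseError."""

def build_retryable_exceptions(retry_all, retry_on_database_errors, defaults):
    # Mirrors the snippet above: retry_all widens retries to every
    # connector error; retry_on_database_errors prepends DatabaseError
    # to the default list so it is matched first.
    retryable_exceptions = list(defaults)
    if retry_all:
        retryable_exceptions = [Error]
    elif retry_on_database_errors:
        retryable_exceptions.insert(0, DatabaseError)
    return retryable_exceptions
```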
Contributor

👍

@leahwicz (Contributor) commented:

@nathaniel-may can we merge this in for the end of the sprint and call it done?

@nathaniel-may nathaniel-may merged commit fdb5c40 into main Aug 30, 2022
@nathaniel-may nathaniel-may deleted the nate/connection-retries branch August 30, 2022 11:11
Successfully merging this pull request may close these issues.

[CT-874] snowflake: standardize connection retries.
4 participants