Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HOLY GRAIL OF PRs UNBLOCK PARALLEL-TEST-SAFE SETTINGS!!!!!!!!!!!1 MASSIVE TEST PERFORMANCE IMPROVEMENT!!!!! #42194

Open
wants to merge 16 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 14 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
21 changes: 21 additions & 0 deletions docs/developers-guide/driver-changelog.md
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,27 @@ title: Driver interface changelog
clause (equivalent of SQL `lead` and `lag` functions). This is enabled by default for `:sql` and `:sql-jdbc`-based
drivers. Other drivers should add an implementation for this clause and enable the feature flag.

- If you don't have a custom implementation of `metabase.driver/notify-database-updated` you can skip this section.

A new method, `metabase.driver/notify-database-will-be-deleted!`, has been added so that you can specify behaviors
that happen only when a Database is deleted, such as destroying connection pools. Previously, whenever a Database
was updated *or* deleted, `metabase.driver/notify-database-updated` was called; now you may differentiate the two
situations by implementing both methods separately. The default implementation of `notify-database-will-be-deleted!`
calls `notify-database-updated` for backward compatibility, so by default it will still be called on Database
deletes. The default implementation of `notify-database-updated` is a no-op, so unless you have an implementation
for it the default implementation of `notify-database-will-be-deleted! is effectively a no-op as well.

`:sql-jdbc` has an implementation, `notify-database-updated` destroys cached connection pools if connection details
change in a meaningful way, and `notify-database-will-be-deleted!` always destroys pools. If your driver is
JDBC-based, you probably didn't have a custom implementation of `notify-database-updated`, so you shouldn't need to
change anything.

We previously recommended that `notify-database-updated` always destroy connection pools or cached resources. Now
that you can differentiate updates from deletes, we now instead recommend that you implement
`notify-database-will-be-deleted!` separately and have it always delete connection pools or cached resources, and
update your implementation of `notify-database-updated` to check whether connection details have changed in a
meaningful way and only destroy resources if actually needed.

## Metabase 0.49.1

- Another driver feature has been added: `describe-fields`. If a driver opts-in to supporting this feature, The
Expand Down
84 changes: 44 additions & 40 deletions modules/drivers/presto-jdbc/src/metabase/driver/presto_jdbc.clj
Original file line number Diff line number Diff line change
Expand Up @@ -269,8 +269,12 @@
(defmethod sql.qp/datetime-diff [:presto-jdbc :second] [_driver _unit x y] (date-diff :second x y))

(defmethod driver/db-default-timezone :presto-jdbc
[driver database]
(sql-jdbc.execute/do-with-connection-with-options
[_driver _database]
;; harcoded to UTC for now. I don't know how to get the SYSTEM timezone for Presto (or if it even has one for that
;; matter) and neither does Windows Copilot. The commented-out code below returns the session timezone. Maybe if we
;; can figure out how to get the actual system timezone we can update it and use it.
"UTC"
#_(sql-jdbc.execute/do-with-connection-with-options
driver database nil
(fn [^java.sql.Connection conn]
;; TODO -- this is the session timezone, right? As opposed to the default timezone? Ick. Not sure how to get the
Expand Down Expand Up @@ -317,8 +321,8 @@
(defn- in-report-zone
"Returns a HoneySQL form to interpret the `expr` (a temporal value) in the current report time zone, via Presto's
`AT TIME ZONE` operator. See https://prestodb.io/docs/current/functions/datetime.html"
[expr]
(let [report-zone (qp.timezone/report-timezone-id-if-supported :presto-jdbc (lib.metadata/database (qp.store/metadata-provider)))
[driver expr]
(let [report-zone (qp.timezone/report-timezone-id-if-supported driver (lib.metadata/database (qp.store/metadata-provider)))
Comment on lines -321 to +326
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated but this is a big no-no because if anyone ever creates a new driver based on :presto-jdbc it will be instantly broken because their own methods for qp.timezone/report-timezone-id-if-supported would never get called. Imagine if Postgres did this, how would Redshift even work

;; if the expression itself has type info, use that, or else use a parent expression's type info if defined
type-info (h2x/type-info expr)
db-type (h2x/type-info->db-type type-info)]
Expand All @@ -339,76 +343,76 @@
expr)

(defmethod sql.qp/date [:presto-jdbc :minute]
[_driver _unit expr]
[:date_trunc (h2x/literal :minute) (in-report-zone expr)])
[driver _unit expr]
[:date_trunc (h2x/literal :minute) (in-report-zone driver expr)])

(defmethod sql.qp/date [:presto-jdbc :minute-of-hour]
[_driver _unit expr]
[:minute (in-report-zone expr)])
[driver _unit expr]
[:minute (in-report-zone driver expr)])

(defmethod sql.qp/date [:presto-jdbc :hour]
[_driver _unit expr]
[:date_trunc (h2x/literal :hour) (in-report-zone expr)])
[driver _unit expr]
[:date_trunc (h2x/literal :hour) (in-report-zone driver expr)])

(defmethod sql.qp/date [:presto-jdbc :hour-of-day]
[_driver _unit expr]
[:hour (in-report-zone expr)])
[driver _unit expr]
[:hour (in-report-zone driver expr)])

(defmethod sql.qp/date [:presto-jdbc :day]
[_driver _unit expr]
[:date (in-report-zone expr)])
[driver _unit expr]
[:date (in-report-zone driver expr)])

(defmethod sql.qp/date [:presto-jdbc :day-of-week]
[_driver _unit expr]
(sql.qp/adjust-day-of-week :presto-jdbc [:day_of_week (in-report-zone expr)]))
[driver _unit expr]
(sql.qp/adjust-day-of-week :presto-jdbc [:day_of_week (in-report-zone driver expr)]))

(defmethod sql.qp/date [:presto-jdbc :day-of-month]
[_driver _unit expr]
[:day (in-report-zone expr)])
[driver _unit expr]
[:day (in-report-zone driver expr)])

(defmethod sql.qp/date [:presto-jdbc :day-of-year]
[_driver _unit expr]
[:day_of_year (in-report-zone expr)])
[driver _unit expr]
[:day_of_year (in-report-zone driver expr)])

(defmethod sql.qp/date [:presto-jdbc :week]
[_driver _unit expr]
[driver _unit expr]
(letfn [(truncate [x]
[:date_trunc (h2x/literal :week) x])]
(sql.qp/adjust-start-of-week :presto-jdbc truncate (in-report-zone expr))))
(sql.qp/adjust-start-of-week :presto-jdbc truncate (in-report-zone driver expr))))

(defmethod sql.qp/date [:presto-jdbc :month]
[_driver _unit expr]
[:date_trunc (h2x/literal :month) (in-report-zone expr)])
[driver _unit expr]
[:date_trunc (h2x/literal :month) (in-report-zone driver expr)])

(defmethod sql.qp/date [:presto-jdbc :month-of-year]
[_driver _unit expr]
[:month (in-report-zone expr)])
[driver _unit expr]
[:month (in-report-zone driver expr)])

(defmethod sql.qp/date [:presto-jdbc :quarter]
[_driver _unit expr]
[:date_trunc (h2x/literal :quarter) (in-report-zone expr)])
[driver _unit expr]
[:date_trunc (h2x/literal :quarter) (in-report-zone driver expr)])

(defmethod sql.qp/date [:presto-jdbc :quarter-of-year]
[_driver _unit expr]
[:quarter (in-report-zone expr)])
[driver _unit expr]
[:quarter (in-report-zone driver expr)])

(defmethod sql.qp/date [:presto-jdbc :year]
[_driver _unit expr]
[:date_trunc (h2x/literal :year) (in-report-zone expr)])
[driver _unit expr]
[:date_trunc (h2x/literal :year) (in-report-zone driver expr)])

(defmethod sql.qp/date [:presto-jdbc :year-of-era]
[_driver _unit expr]
[:year (in-report-zone expr)])
[driver _unit expr]
[:year (in-report-zone driver expr)])

(defmethod sql.qp/unix-timestamp->honeysql [:presto-jdbc :seconds]
[_driver _unit expr]
(let [report-zone (qp.timezone/report-timezone-id-if-supported :presto-jdbc (lib.metadata/database (qp.store/metadata-provider)))]
[driver _unit expr]
(let [report-zone (qp.timezone/report-timezone-id-if-supported driver (lib.metadata/database (qp.store/metadata-provider)))]
[:from_unixtime expr (h2x/literal (or report-zone "UTC"))]))

(defmethod sql.qp/unix-timestamp->honeysql [:presto-jdbc :milliseconds]
[_driver _unit expr]
[driver _unit expr]
;; from_unixtime doesn't support milliseconds directly, but we can add them back in
(let [report-zone (qp.timezone/report-timezone-id-if-supported :presto-jdbc (lib.metadata/database (qp.store/metadata-provider)))
(let [report-zone (qp.timezone/report-timezone-id-if-supported driver (lib.metadata/database (qp.store/metadata-provider)))
millis [::mod expr [:inline 1000]]
expr [:from_unixtime [:/ expr [:inline 1000]] (h2x/literal (or report-zone "UTC"))]]
(date-add :millisecond millis expr)))
Expand Down Expand Up @@ -717,12 +721,12 @@
(sql.params.substitution/make-stmt-subs "from_iso8601_timestamp(?)" [ts-str]))

(defmethod sql.params.substitution/->prepared-substitution [:presto-jdbc ZonedDateTime]
[_ ^ZonedDateTime t]
[driver ^ZonedDateTime t]
;; for native query parameter substitution, in order to not conflict with the `PrestoConnection` session time zone
;; (which was set via report time zone), it is necessary to use the `from_iso8601_timestamp` function on the string
;; representation of the `ZonedDateTime` instance, but converted to the report time zone
#_(date-time->substitution (.format (t/offset-date-time (t/local-date-time t) (t/zone-offset 0)) DateTimeFormatter/ISO_OFFSET_DATE_TIME))
(let [report-zone (qp.timezone/report-timezone-id-if-supported :presto-jdbc (lib.metadata/database (qp.store/metadata-provider)))
(let [report-zone (qp.timezone/report-timezone-id-if-supported driver (lib.metadata/database (qp.store/metadata-provider)))
^ZonedDateTime ts (if (str/blank? report-zone) t (t/with-zone-same-instant t (t/zone-id report-zone)))]
;; the `from_iso8601_timestamp` only accepts timestamps with an offset (not a zone ID), so only format with offset
(date-time->substitution (.format ts DateTimeFormatter/ISO_OFFSET_DATE_TIME))))
Expand Down
4 changes: 0 additions & 4 deletions modules/drivers/redshift/src/metabase/driver/redshift.clj
Original file line number Diff line number Diff line change
Expand Up @@ -458,10 +458,6 @@
" */ "
(qp.util/default-query->remark query)))

(defmethod sql-jdbc.execute/set-parameter [:redshift java.time.ZonedDateTime]
[driver ps i t]
(sql-jdbc.execute/set-parameter driver ps i (t/sql-timestamp (t/with-zone-same-instant t (t/zone-id "UTC")))))
Comment on lines -461 to -463
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is actually unused as of a few days ago but I forgot to remove it


(defmethod driver/upload-type->database-type :redshift
[_driver upload-type]
(case upload-type
Expand Down
24 changes: 18 additions & 6 deletions modules/drivers/snowflake/src/metabase/driver/snowflake.clj
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,8 @@
:connection-impersonation-requires-role true
:convert-timezone true
:datetime-diff true
:now true}]
:now true
:set-timezone true}]
(defmethod driver/database-supports? [:snowflake feature] [_driver _feature _db] supported?))

(defmethod driver/humanize-connection-error-message :snowflake
Expand Down Expand Up @@ -149,10 +150,7 @@
;; other SESSION parameters
;; not 100% sure why we need to do this but if we don't set the connection to UTC our report timezone
;; stuff doesn't work, even though we ultimately override this when we set the session timezone
:timezone "UTC"
;; tell Snowflake to use the same start of week that we have set for the
;; [[metabase.public-settings/start-of-week]] Setting.
:week_start (start-of-week-setting->snowflake-offset)}
:timezone "UTC"}
(-> details
;; original version of the Snowflake driver incorrectly used `dbname` in the details fields instead
;; of `db`. If we run across `dbname`, correct our behavior
Expand Down Expand Up @@ -613,13 +611,27 @@

(defmethod sql.qp/current-datetime-honeysql-form :snowflake [_] :%current_timestamp)

(defmethod sql-jdbc.execute/set-timezone-sql :snowflake [_] "ALTER SESSION SET TIMEZONE = %s;")
(defmethod sql-jdbc.execute/do-with-connection-with-options :snowflake
[driver db-or-id-or-spec {:keys [session-timezone], :as options} f]
(let [parent-method (get-method sql-jdbc.execute/do-with-connection-with-options :sql-jdbc)]
(parent-method
driver
db-or-id-or-spec
options
(fn [^Connection conn]
(let [session-timezone (or session-timezone "UTC")]
(with-open [stmt (.createStatement conn)]
(.addBatch stmt (format "ALTER SESSION SET TIMEZONE = '%s';" session-timezone))
(.addBatch stmt (format "ALTER SESSION SET WEEK_START = %d;" (start-of-week-setting->snowflake-offset)))
(.executeBatch stmt)))
(f conn)))))

(defmethod driver/db-default-timezone :snowflake
[driver database]
(sql-jdbc.execute/do-with-connection-with-options
driver database nil
(fn [^java.sql.Connection conn]
;; is this actually the SYSTEM default timezone? Not SESSION? I thought Snowflake's default was US/Pacific.
(with-open [stmt (.prepareStatement conn "show parameters like 'TIMEZONE' in user;")
rset (.executeQuery stmt)]
(when (.next rset)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -577,10 +577,10 @@
(mt/defdataset dst-change
[["dst_tz_test" [{:field-name "dtz"
:base-type :type/DateTimeWithTZ}]
[["2023-09-30 23:59:59 +1000"] ; September
["2023-10-01 00:00:00 +1000"] ; October before DST starts
["2023-10-01 05:00:00 +1100"] ; October after DST starts
["2023-11-01 05:00:00 +1100"]]]]) ; November
[["2023-09-30 23:59:59 +1000"] ; September
["2023-10-01 00:00:00 +1000"] ; October before DST starts
["2023-10-01 05:00:00 +1100"] ; October after DST starts
["2023-11-01 05:00:00 +1100"]]]]) ; November

(deftest date-bucketing-test
(testing "#37065"
Expand All @@ -593,8 +593,8 @@
(lib/breakout dtz)
(lib/aggregate (lib/count)))]
(mt/with-native-query-testing-context query
(is (= [["2023-09-30T00:00:00+10:00" 1]
["2023-10-01T00:00:00+10:00" 2]
["2023-11-01T00:00:00+11:00" 1]]
(mt/with-temporary-setting-values [report-timezone "Australia/Sydney"]
(mt/with-temporary-setting-values [report-timezone "Australia/Sydney"]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mt/with-temporary-setting-values around the is gives you more meaningful testing context when the test fails.

(is (= [["2023-09-30T00:00:00+10:00" 1]
["2023-10-01T00:00:00+10:00" 2]
["2023-11-01T00:00:00+11:00" 1]]
(mt/rows (qp/process-query query)))))))))))
24 changes: 20 additions & 4 deletions src/metabase/driver.clj
Original file line number Diff line number Diff line change
Expand Up @@ -751,16 +751,32 @@
;; TODO -- shouldn't this be called `notify-database-updated!`, since the expectation is that it is done for side
;; effects? issue: https://github.com/metabase/metabase/issues/39367
(defmulti notify-database-updated
"Notify the driver that the attributes of a `database` have changed, or that `database was deleted. This is
specifically relevant in the event that the driver was doing some caching or connection pooling; the driver should
release ALL related resources when this is called."
"Notify the driver that the attributes of a `database` have changed. This is specifically relevant in the event that
the driver was doing some caching or connection pooling; the driver should check whether connection details have
changed in a relevant way and purge connection pools if needed."
{:added "0.32.0" :arglists '([driver database])}
dispatch-on-initialized-driver
:hierarchy #'hierarchy)

(defmethod notify-database-updated ::driver [_ _]
(defmethod notify-database-updated ::driver
[_driver _database]
nil) ; no-op

(defmulti notify-database-will-be-deleted!
"Notify the driver that `database` is about to be deleted permanently. The driver should release any cached resources
or connection pools associated with the Database.

Prior to 0.50.0, [[notify-database-updated]] was called whenever a Database was updated *or* deleted. Thus the
default implementation of this calls [[notify-database-updated]] for backward compatibility. You can implement this
method separately if you want different behavior in the two situations."
{:added "0.50.0", :arglists '([driver database])}
dispatch-on-initialized-driver
:hierarchy #'hierarchy)

(defmethod notify-database-will-be-deleted! ::driver
[driver database]
(notify-database-updated driver database))

(defmulti sync-in-context
"Drivers may provide this function if they need to do special setup before a sync operation such as
`sync-database!`. The sync operation itself is encapsulated as the lambda `f`, which must be called with no arguments.
Expand Down
8 changes: 7 additions & 1 deletion src/metabase/driver/common.clj
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,10 @@
(:require
[clojure.string :as str]
[metabase.driver :as driver]
[metabase.lib.metadata.protocols :as lib.metadata.protocols]
[metabase.models.setting :as setting]
[metabase.public-settings :as public-settings]
[metabase.query-processor.store :as qp.store]
[metabase.util.i18n :refer [deferred-tru]]
[metabase.util.log :as log]
[metabase.util.malli :as mu]))
Expand Down Expand Up @@ -306,7 +308,11 @@
`0` (`:monday`) to `6` (`:sunday`). This is guaranteed to return a value."
{:added "0.42.0"}
[]
(.indexOf days-of-week (or *start-of-week* (setting/get-value-of-type :keyword :start-of-week))))
(.indexOf days-of-week
(or *start-of-week*
(when (qp.store/initialized?)
(lib.metadata.protocols/setting (qp.store/metadata-provider) :start-of-week))
(setting/get-value-of-type :keyword :start-of-week))))

(defn start-of-week-offset-for-day
"Like [[start-of-week-offset]] but takes a `start-of-week` keyword like `:sunday` rather than ` driver`. Returns the
Expand Down
6 changes: 5 additions & 1 deletion src/metabase/driver/sql_jdbc.clj
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,11 @@
(sql-jdbc.execute/execute-reducible-query driver query context respond))

(defmethod driver/notify-database-updated :sql-jdbc
[_ database]
[_driver database]
(sql-jdbc.conn/invalidate-pool-for-db-if-needed! database))

(defmethod driver/notify-database-will-be-deleted! :sql-jdbc
[_driver database]
(sql-jdbc.conn/invalidate-pool-for-db! database))

(defmethod driver/dbms-version :sql-jdbc
Expand Down
12 changes: 12 additions & 0 deletions src/metabase/driver/sql_jdbc/connection.clj
Original file line number Diff line number Diff line change
Expand Up @@ -220,6 +220,18 @@ For setting the maximum, see [MB_APPLICATION_DB_MAX_CONNECTION_POOL_SIZE](#mb_ap
[database]
(set-pool! (u/the-id database) nil nil))

(defn invalidate-pool-for-db-if-needed!
"Invalidate the connection pool for the given database if the connection details have changed."
[database]
(when-let [old-hash (get @database-id->jdbc-spec-hash (u/the-id database))]
(let [new-hash (jdbc-spec-hash database)]
(log/tracef "Maybe invalidating connection pool for %s... old hash: %s, new hash: %s"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it should log "Invalidating" / "Not invalidating" based on the when condition.

(pr-str (select-keys database [:id :name :engine]))
(pr-str old-hash)
(pr-str new-hash))
(when-not (= old-hash new-hash)
(invalidate-pool-for-db! database)))))

(defn- log-ssh-tunnel-reconnect-msg! [db-id]
(log/warn (u/format-color :red "ssh tunnel for database %s looks closed; marking pool invalid to reopen it" db-id))
nil)
Expand Down
6 changes: 3 additions & 3 deletions src/metabase/driver/sql_jdbc/execute.clj
Original file line number Diff line number Diff line change
Expand Up @@ -181,7 +181,7 @@
;; default method for Postgres not covered by any [driver jdbc-type] methods
(defmethod read-column-thunk :postgres
...)"
{:added "0.35.0", :arglists '([driver ^java.sql.ResultSet rs ^java.sql.ResultSetMetaData rsmeta i])}
{:added "0.35.0", :arglists '([driver ^java.sql.ResultSet rs ^java.sql.ResultSetMetaData rsmeta ^Long i])}
(fn [driver _rs ^ResultSetMetaData rsmeta ^Long col-idx]
[(driver/dispatch-on-initialized-driver driver) (.getColumnType rsmeta col-idx)])
:hierarchy #'driver/hierarchy)
Expand Down Expand Up @@ -571,14 +571,14 @@
(execute-prepared-statement! driver st))))

(defmethod read-column-thunk :default
[driver ^ResultSet rs rsmeta ^long i]
[driver ^ResultSet rs rsmeta ^Long i]
(let [driver-default-method (get-method read-column-thunk driver)]
(if-not (= driver-default-method (get-method read-column-thunk :default))
^{:name (format "(read-column-thunk %s)" driver)} (driver-default-method driver rs rsmeta i)
^{:name (format "(.getObject rs %d)" i)} (fn []
(.getObject rs i)))))

(defn- get-object-of-class-thunk [^ResultSet rs, ^long i, ^Class klass]
(defn- get-object-of-class-thunk [^ResultSet rs ^Long i ^Class klass]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also unrelated but ^long is wrong here since I'm 99% sure we're passing around boxed Longs rather than raw 64-bit integers

^{:name (format "(.getObject rs %d %s)" i (.getCanonicalName klass))}
(fn []
(.getObject rs i klass)))
Expand Down