pandas to_sql is slow with fast_executemany when SQL trace/extended events is running #1215
-
Posting this here since I think it might be relevant (please close this if not). We use pandas `to_sql()` with `fast_executemany` to load data into SQL Server. We've started encountering an issue where inserts become dramatically slower whenever SQL Sentry is capturing events, but I can reproduce the slowdown with a simple trace even without SQL Sentry.
Why would a trace have such a huge effect (>100x slower)? Is it because of the large number of events being captured? I'll talk to the DBAs about what events they're capturing on prod and whether they can reduce the overhead; it's an all-day monitoring suite. I also see somewhat different results when I have a trace running and also use the `chunksize` argument in `to_sql`.
-
This really sounds like something you should be asking the SQL Sentry people about, but one thing you might try would be to use this `.to_sql()` method https://gist.github.com/gordthompson/1fb0f1c3f5edbf6192e596de8350f205 (to avoid `.to_sql()` using `.executemany()`) and see if it makes any difference.
-
That isn't really surprising - tracing always has overhead and it's often very, very large, especially if it's logging all the data too. Even ODBC trace will cause applications to become many times slower.