I am trying to batch-insert data into the database, but I never achieve rates faster than ~2k rows per second. How can I improve insertion performance?
I narrowed it down to a minimal test case with a table created like this:
execute(conn, NANODBC_TEXT("create unlogged table rowset_iteration (i int, d float);"));
Then I generated batch_size integers and floats in std::vectors and inserted them like this:
For batch_size I tried values of 500 and up, but it doesn't make much difference.
I observed the network traffic with Wireshark and realised that every row is being sent synchronously, one at a time... that seems to be the problem here. I thought a batch insert would send the data in one go rather than row by row? Any ideas?
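The insertion code itself didn't survive the scrape, but the usual nanodbc batch pattern looks roughly like this (a sketch, not the poster's exact code, assuming the table from the example above; bind() takes a pointer to the first element plus the row count, and transact() runs the batch inside a transaction):

```cpp
#include <nanodbc/nanodbc.h>
#include <vector>

// Sketch of nanodbc array binding (assumed, not the original code):
// bind whole vectors once, then execute the batch with one transact().
void batch_insert(nanodbc::connection& conn,
                  const std::vector<int>& integers,
                  const std::vector<double>& floats)
{
    const std::size_t batch_size = integers.size();
    nanodbc::statement stmt(conn);
    nanodbc::prepare(stmt, NANODBC_TEXT(
        "insert into rowset_iteration (i, d) values (?, ?);"));
    stmt.bind(0, integers.data(), batch_size);  // column i
    stmt.bind(1, floats.data(), batch_size);    // column d
    nanodbc::transact(stmt, static_cast<long>(batch_size));
}
```

Whether those batch_size rows actually leave in one network round trip is up to the ODBC driver, which is exactly what the Wireshark capture calls into question.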
From what I can see, you're doing everything correctly here. A transaction is required for batch operations, and increasing the batch size should set the SQL_ATTR_PARAMSET_SIZE attribute on the statement handle via SQLSetStmtAttr(). Note that this requires ODBC 3.0 or newer, if that matters. It's not clear to me why you are seeing this problem.
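For reference, at the raw ODBC level this is what the driver should end up receiving for a true batch insert (a sketch of the standard ODBC 3.x parameter-array calls, not nanodbc internals; hstmt and the arrays are assumed to be set up elsewhere):

```cpp
#include <sql.h>
#include <sqlext.h>

// Sketch of ODBC 3.x parameter-array setup for the two-column table
// from the example; n is the number of rows in the batch.
void set_paramset(SQLHSTMT hstmt, SQLINTEGER* ints, SQLDOUBLE* dbls, SQLULEN n)
{
    // Tell the driver how many parameter sets (rows) follow.
    SQLSetStmtAttr(hstmt, SQL_ATTR_PARAMSET_SIZE, (SQLPOINTER)n, 0);
    // Bind whole arrays, one per parameter marker.
    SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_SLONG, SQL_INTEGER,
                     0, 0, ints, 0, nullptr);
    SQLBindParameter(hstmt, 2, SQL_PARAM_INPUT, SQL_C_DOUBLE, SQL_DOUBLE,
                     0, 0, dbls, 0, nullptr);
    // A single SQLExecute() then covers all n rows; whether that becomes
    // one network round trip is still driver-dependent.
}
```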
Thank you for confirming my approach! That's really helpful.
Unfortunately I didn't get it to work. I use a PostgreSQL database and the latest psqlODBC driver on Windows 10. I assume it supports the required ODBC version...
However, I was able to achieve what I wanted by creating a multi-value insert string in a for loop. Not the most beautiful solution, but it works... I ended up with something like this:
statement stmt(conn);
std::stringstream query;
query << "insert into rowset_iteration (i, d) values ";
for (int i = 0; i < batch_size; i++) {
    query << "(" << integers[i] << "," << floats[i] << ")";
    if (i != batch_size - 1)
        query << ",";
    else
        query << ";";
}
prepare(stmt, query.str());
transact(stmt, 1);
With batch_size = 1e6 I send about 65 MB in one go, which works out to up to 100k inserts/sec... good enough for me.
If someone has any idea why the original approach didn't work, I'd still be interested...
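The string building above is easy to sanity-check without a database. Here is the same loop wrapped in a small helper (a hypothetical function, not part of the original post) so the generated SQL can be inspected for a tiny batch:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper mirroring the loop above: builds one multi-value
// INSERT statement for the whole batch.
std::string build_multivalue_insert(const std::vector<int>& integers,
                                    const std::vector<double>& floats)
{
    std::stringstream query;
    query << "insert into rowset_iteration (i, d) values ";
    for (std::size_t i = 0; i < integers.size(); i++) {
        query << "(" << integers[i] << "," << floats[i] << ")";
        query << (i != integers.size() - 1 ? "," : ";");
    }
    return query.str();
}
```

For integers {1, 2} and floats {0.5, 1.5} this produces `insert into rowset_iteration (i, d) values (1,0.5),(2,1.5);`. Note this only stays safe because the values are numeric; string data would need escaping or parameter binding.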