Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MySqlBulkCopy should match columns by name #1375

Open
askids opened this issue Sep 27, 2023 · 6 comments
Open

MySqlBulkCopy should match columns by name #1375

askids opened this issue Sep 27, 2023 · 6 comments

Comments

@askids
Copy link

askids commented Sep 27, 2023

Enhancement request (see comment): fall back to matching columns by name instead of by ordinal.

Software versions
MySqlConnector version: 2.27 and 2.3.0-beta3
Server type (MySQL, MariaDB, Aurora, etc.) and version: Singlestore 7.8
.NET version: 6.0

Describe the bug
I am using MySqlBulkCopy class to load data from a datatable (created from DB2 source) to Singlestore. The source and target table have same number of columns (~180) and data type definition. Out of the 180 column, only 4 columns have NOT NULL definition. Rest all columns are defined as nullable. When I try to load the table with only a subset of columns (mandatory columns + few nullable columns), I am getting error referencing the column which is not even included in my datatable used for loading. I get the error "Data too long for column 'COLABC'", when column COLABC is not even in my datatable. Moreover COLABC is a nullable column. So not sure, from where its getting this invalid value from. I am not using any column mapping as data table column names matches with target. To generate the datatable definition, I run a dummy query SELECT * FROM MYTABLE WHERE 1 = 0, and from datareader, get the schema table and build the target table definition. Then I remove any columns not in my actual query. So not sure, from where does this unused column come up in the error.

If I include few more nullable columns and 1 date column, I see Invalid DATE/TIME exception, but the error message doesn't say which column has invalid date time. There is only 1 DATE column among the subset of columns included. So when I set breakpoint, I see that the date column has DBNULL.Value assigned to it which should be valid value as column is defined as nullable.

So currently, I am not able to figure out a) why I get an error referencing column not in my datatable used for bulkcopy. b) if there is indeed issue with invalid date/time, it would atleast help to know which column has invalid date/time. Eventually, we have some 30 columns defined with either DATE or TIMESTAMP(6) definition among this 180 columns. So it will be hard to figure out which column has invalid date.

Right now, I have restricted data to just 1 row in datatable and still see this error. I have allow local file load to true on connection string. Please let me know if any further information is needed.

Exception

{"@t":"2023-09-27T20:43:42.6754899Z","@mt":"ErrorMethod: {@ErrorMethod}, ErrorMessage: {@ErrorMessage}","@l":"Error","@x":"MySqlConnector.MySqlException (0x80004005): Data too long for column 'COLABC'
   at MySqlConnector.Core.ServerSession.ReceiveReplyAsyncAwaited(ValueTask`1 task) in /_/src/MySqlConnector/Core/ServerSession.cs:line 954
   at MySqlConnector.Core.ResultSet.ReadResultSetHeaderAsync(IOBehavior ioBehavior) in /_/src/MySqlConnector/Core/ResultSet.cs:line 42
   at MySqlConnector.MySqlDataReader.ActivateResultSet(CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlDataReader.cs:line 130
   at MySqlConnector.MySqlDataReader.InitAsync(CommandListPosition commandListPosition, ICommandPayloadCreator payloadCreator, IDictionary`2 cachedProcedures, IMySqlCommand command, CommandBehavior behavior, Activity activity, IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlDataReader.cs:line 486
   at MySqlConnector.Core.CommandExecutor.ExecuteReaderAsync(IReadOnlyList`1 commands, ICommandPayloadCreator payloadCreator, CommandBehavior behavior, Activity activity, IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/Core/CommandExecutor.cs:line 56
   at MySqlConnector.MySqlCommand.ExecuteNonQueryAsync(IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlCommand.cs:line 300
   at MySqlConnector.MySqlBulkLoader.LoadAsync(IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlBulkLoader.cs:line 212
   at MySqlConnector.MySqlBulkCopy.WriteToServerAsync(IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlBulkCopy.cs:line 337
   at MySqlConnector.MySqlBulkCopy.WriteToServerAsync(DataTable dataTable, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlBulkCopy.cs:line 131
"@t":"2023-09-26T22:57:08.8358120Z","@mt":"ErrorMethod: {@ErrorMethod}, ErrorMessage: {@ErrorMessage}","@l":"Error","@x":"MySqlConnector.MySqlException (0x80004005): Invalid DATE/TIME in type conversion
   at MySqlConnector.Core.ServerSession.ReceiveReplyAsyncAwaited(ValueTask`1 task) in /_/src/MySqlConnector/Core/ServerSession.cs:line 954
   at MySqlConnector.Core.ResultSet.ReadResultSetHeaderAsync(IOBehavior ioBehavior) in /_/src/MySqlConnector/Core/ResultSet.cs:line 42
   at MySqlConnector.MySqlDataReader.ActivateResultSet(CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlDataReader.cs:line 130
   at MySqlConnector.MySqlDataReader.InitAsync(CommandListPosition commandListPosition, ICommandPayloadCreator payloadCreator, IDictionary`2 cachedProcedures, IMySqlCommand command, CommandBehavior behavior, Activity activity, IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlDataReader.cs:line 486
   at MySqlConnector.Core.CommandExecutor.ExecuteReaderAsync(IReadOnlyList`1 commands, ICommandPayloadCreator payloadCreator, CommandBehavior behavior, Activity activity, IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/Core/CommandExecutor.cs:line 56
   at MySqlConnector.MySqlCommand.ExecuteNonQueryAsync(IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlCommand.cs:line 300
   at MySqlConnector.MySqlBulkLoader.LoadAsync(IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlBulkLoader.cs:line 212
   at MySqlConnector.MySqlBulkCopy.WriteToServerAsync(IOBehavior ioBehavior, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlBulkCopy.cs:line 337
   at MySqlConnector.MySqlBulkCopy.WriteToServerAsync(DataTable dataTable, CancellationToken cancellationToken) in /_/src/MySqlConnector/MySqlBulkCopy.cs:line 131

Expected behavior
Invalid Date/Time exception should say which column has invalid data and show the invalid content.

Additional context
I also tried using Singlestore's forked version of MySqlConnector. Ran into same problem (invalid date) with that package as well.

Thanks!

@bgrainger
Copy link
Member

Can you grab a decrypted packet capture of the MySqlBulkCopy command and subsequent error?

@askids
Copy link
Author

askids commented Sep 28, 2023

Can you grab a decrypted packet capture of the MySqlBulkCopy command and subsequent error?

Can you please provide the process to do that?

BTW, I have an update. I was able to resolve these issues after adding the bulkcopy column mapping. So looks like the issue was due to connector incorrectly deriving the auto column mapping (may be because of the assumptions it makes about column order).

It makes sense now that error was referring to some non-existent column in the datatable as it was using column ordinal/name from target table definition rather than the using column ordinal/name from the data table. So when I added the column mapping based on the built datatable, it resolved the issue,

May be a separate enhancement can be requested so that connector does not assume that column order in datatable used for bulkcopy is same as target table. This feature is available in other RDBMS connectors. So it will be good to have this feature in MySql Connector as well.

Thanks!

@askids
Copy link
Author

askids commented Sep 28, 2023

Also, the invalid date/time error message should definitely be enhanced to say which column has the issue.

@bgrainger
Copy link
Member

The need to use a column mapping for your scenario is in the documentation:

If the columns being copied from the data source line up one-to-one with the columns in the destination table then populating this collection is unnecessary. Otherwise, this should be filled with a collection of MySqlBulkCopyColumnMapping objects specifying how source columns are to be mapped onto destination columns.

@bgrainger
Copy link
Member

May be a separate enhancement can be requested so that connector does not assume that column order in datatable used for bulkcopy is same as target table.

Are you asking it to fall back to matching columns by name? Something else?

This feature is available in other RDBMS connectors.

Which ones? Links to documentation would be helpful (to assess how these other connectors work).

@askids
Copy link
Author

askids commented Sep 28, 2023

Are you asking it to fall back to matching columns by name? Something else?

Yes. Fallback to using column ordinal and column name from the provided data table to create the column mapping. May be current behavior can be retained as default behavior. But give a config option which tells the connector to get the column mapping from the input datatable, instead of assuming that datatable has all columns and in the same order as target table.

I simply looped through the datacolumn collection of input datatable and created an extension method to add to the column mapping collection.

@bgrainger bgrainger changed the title MySqlBulkCopy throws unrelated errors MySqlBulkCopy should match columns by name Nov 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

2 participants