
[Bug][Flink-table-sink] Why add unique key to the pk set when generating sink table schema #181

Open
itinycheng opened this issue Apr 19, 2022 · 2 comments
Labels
good first issue Good for newcomers type/enhancement New feature or request

Comments

itinycheng (Contributor)

The flink-tidb-connector doesn't work as expected when I use upsert mode to sink data into TiDB (Flink version: 1.13.x); the reason is as in the title. Is there any reason for making the unique keys part of the primary key set?

Code block:

private String[] getKeyFields(Context context, ReadableConfig config, String databaseName,
      String tableName) {
    // check write mode
    TiDBWriteMode writeMode = TiDBWriteMode.fromString(config.get(WRITE_MODE));
    String[] keyFields = null;
    if (writeMode == TiDBWriteMode.UPSERT) {
      try (ClientSession clientSession = ClientSession.create(
          new ClientConfig(context.getCatalogTable().toProperties()))) {
        Set<String> set = ImmutableSet.<String>builder()
            .addAll(clientSession.getUniqueKeyColumns(databaseName, tableName)) // Why add all unique keys to pk set?
            .addAll(clientSession.getPrimaryKeyColumns(databaseName, tableName))
            .build();
        keyFields = set.size() == 0 ? null : set.toArray(new String[0]);
      } catch (Exception e) {
        throw new IllegalStateException(e);
      }
    }
    return keyFields;
  }

The issue:
flink-tidb-connector uses the official flink-jdbc-connector to write data to TiDB.
In upsert mode, the records in the flink-jdbc-connector buffer are deduplicated by the configured key fields, and the batch flushed via executeBatch is unordered because the buffer is a HashMap; refer to TableBufferReducedStatementExecutor.java.
Together these can leave multiple records with the same actual primary key in one batch, which are then written to TiDB out of order.
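To make the failure mode concrete, here is a minimal sketch (not the connector's actual code) of a reduce buffer keyed by keyFields, like the one in TableBufferReducedStatementExecutor. When keyFields is the primary key plus the unique keys, two rows that share the real primary key but differ in a unique-key column are treated as distinct keys, so both survive deduplication and land in the same batch:

```java
import java.util.*;

public class DedupSketch {
    // Deduplicate rows by the columns at keyIdx, as a HashMap-backed reduce
    // buffer does: a later row with the same key replaces the earlier one.
    static Map<List<Object>, List<Object>> reduce(List<List<Object>> rows, int[] keyIdx) {
        Map<List<Object>, List<Object>> buffer = new HashMap<>();
        for (List<Object> row : rows) {
            List<Object> key = new ArrayList<>();
            for (int i : keyIdx) {
                key.add(row.get(i));
            }
            buffer.put(key, row);
        }
        return buffer;
    }

    public static void main(String[] args) {
        // Columns: (id PRIMARY KEY, email UNIQUE, name).
        // Two updates to the same row: the second changes the unique column.
        List<List<Object>> rows = Arrays.asList(
            Arrays.asList(1, "a@x.com", "Alice"),
            Arrays.asList(1, "b@x.com", "Alice"));
        // keyFields = {id}: the second row replaces the first -> one upsert.
        System.out.println(reduce(rows, new int[]{0}).size());
        // keyFields = {id, email}: both rows survive -> same batch, flushed
        // in unspecified HashMap order, so TiDB may see them out of order.
        System.out.println(reduce(rows, new int[]{0, 1}).size());
    }
}
```

Printing the buffer sizes shows 1 entry for primary-key-only deduplication and 2 entries once the unique key is added to keyFields.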

xuanyu66 added the type/enhancement (New feature or request) and good first issue (Good for newcomers) labels on May 16, 2022
xuanyu66 (Collaborator) commented May 16, 2022

@itinycheng Sorry to have missed this issue.
You are right; it's better to use only the primary key as keyFields.
And we can use a SQL hint if someone needs to customize keyFields.

itinycheng (Author)

OK, got it.

xuanyu66 reopened this May 31, 2022