Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

同一个库下不同表进行同步的时候,如果参数SkipCreateDbTable未配置,会由于MDL锁hang住整个库 #1060

Open
dibrother opened this issue May 25, 2023 · 3 comments

Comments

@dibrother
Copy link

dibrother commented May 25, 2023

Description

同一个库下不同表进行同步的时候,如果参数SkipCreateDbTable未配置,会由于MDL锁hang住整个库

复制结构:
实例:MySQL-A -- > MySQL-A
库:yqtest -- > yqtest
表:sbtest2 --> sbtest2_new

Steps to reproduce the issue

  1. 表结构相关
mysql> show create table sbtest2\G
*************************** 1. row ***************************
       Table: sbtest2
Create Table: CREATE TABLE `sbtest2` (
  `id` int NOT NULL AUTO_INCREMENT,
  `k` int NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_2` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

mysql> show create table sbtest2_new\G
*************************** 1. row ***************************
       Table: sbtest2_new
Create Table: CREATE TABLE `sbtest2_new` (
  `id` int NOT NULL AUTO_INCREMENT,
  `k` int NOT NULL DEFAULT '0',
  `c` varchar(120) DEFAULT NULL,
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_3` (`k`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

2.DTLE迁移配置信息如下:

{
        "Job": {
                "ID": "db_sync_test",
                "Datacenters": ["dc1"],
                "TaskGroups": [{
                        "Name": "src",
                        "Tasks": [{
                                "Name": "src",
                                "Driver": "dtle",
                                "Config": {
                                        "Gtid": "",
                                        "ChunkSize": 5000,
                                        "ReplicateDoDb": [{
                                                "TableSchema": "yqtest",
                                                "Tables": [{
                                                        "TableName": "sbtest2",
                                                        "TableRename": "sbtest2_new"
                                                }]
                                        }],
                                        "SrcConnectionConfig": {
                                                "Host": "192.168.60.156",
                                                "Port": 3306,
                                                "User": "root",
                                                "Password": "pass"
                                        },
                                        "DestConnectionConfig": {
                                                "Host": "192.168.60.156",
                                                "Port": 3306,
                                                "User": "root",
                                                "Password": "pass"
                                        }
                                }
                        }]
                }, {
                        "Name": "dest",
                        "Tasks": [{
                                "Name": "dest",
                                "Driver": "dtle",
                                "Config": {
                                        "DestType": "mysql"
                                }
                        }]
                }]
        }
}

3.启动job

curl -XPOST 127.0.0.1:4646/v1/jobs -d @db_sync_test.json| jq

Describe the results you received

  • 数据未同步
  • 数据库夯住,被ID 225 执行的语句
  • ID 225 执行了语句:START TRANSACTION/START TRANSACTION WITH CONSISTENT SNAPSHOT,详细看下面详细日志
mysql> select * from information_schema.processlist where command not in ('Daemon','Binlog Dump GTID') and id != connection_id();
+-----+------+----------------------+--------+---------+------+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| ID  | USER | HOST                 | DB     | COMMAND | TIME | STATE                            | INFO                                                                                                                                                      |
+-----+------+----------------------+--------+---------+------+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| 224 | root | 192.168.60.155:53826 | NULL   | Sleep   |    3 |                                  | NULL                                                                                                                                                      |
| 225 | root | 192.168.60.155:53828 | NULL   | Sleep   |    2 |                                  | NULL                                                                                                                                                      |
| 220 | root | 192.168.60.155:53814 | NULL   | Sleep   |    3 |                                  | NULL                                                                                                                                                      |
| 197 | root | 192.168.60.156:60726 | yqtest | Sleep   |  264 |                                  | NULL                                                                                                                                                      |
| 221 | root | 192.168.60.155:53816 | NULL   | Query   |    3 | Waiting for schema metadata lock | CREATE DATABASE /*!32312 IF NOT EXISTS*/ `yqtest` /*!40100 DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci */ /*!80016 DEFAULT ENCRYPTION='N' */ |
| 222 | root | 192.168.60.155:53818 | NULL   | Sleep   |    3 |                                  | NULL                                                                                                                                                      |
| 198 | root | 192.168.60.129:60840 | yqtest | Sleep   |   39 |                                  | NULL                                                                                                                                                      |
| 199 | root | 192.168.60.129:60845 | NULL   | Sleep   | 1929 |                                  | NULL                                                                                                                                                      |
| 223 | root | 192.168.60.155:53822 | NULL   | Sleep   |    3 |                                  | NULL                                                                                                                                                      |
+-----+------+----------------------+--------+---------+------+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
9 rows in set (0.00 sec)

Output of ./dtle version:**

版本:4.23.04.2-4.23.04.x-25ab378

Additional information

MySQL general_log

2023-05-25T09:21:43.160614Z       220 Connect   root@192.168.60.155 on  using TCP/IP
2023-05-25T09:21:43.161319Z       221 Connect   root@192.168.60.155 on  using TCP/IP
2023-05-25T09:21:43.161736Z       220 Query     SELECT @@max_allowed_packet
2023-05-25T09:21:43.162356Z       220 Query     SET NAMES utf8mb4
2023-05-25T09:21:43.162640Z       221 Query     SELECT @@max_allowed_packet
2023-05-25T09:21:43.162941Z       220 Query     SET autocommit=true,time_zone='+00:00'
2023-05-25T09:21:43.163294Z       221 Query     SET NAMES utf8mb4
2023-05-25T09:21:43.163590Z       220 Query     show grants for current_user()
2023-05-25T09:21:43.164018Z       221 Query     SET autocommit=true,time_zone='+00:00'
2023-05-25T09:21:43.164308Z       220 Query     SELECT @@GTID_MODE
2023-05-25T09:21:43.164600Z       221 Query     select @@wait_timeout
2023-05-25T09:21:43.164965Z       220 Query     select @@log_bin, @@binlog_format
2023-05-25T09:21:43.165160Z       221 Query     SET @@session.foreign_key_checks = 0
2023-05-25T09:21:43.165546Z       220 Query     select @@binlog_row_image
2023-05-25T09:21:43.167205Z       222 Connect   root@192.168.60.155 on  using TCP/IP
2023-05-25T09:21:43.167822Z       223 Connect   root@192.168.60.155 on  using TCP/IP
2023-05-25T09:21:43.168093Z       222 Query     SELECT @@max_allowed_packet
2023-05-25T09:21:43.168645Z       223 Query     SELECT @@max_allowed_packet
2023-05-25T09:21:43.168880Z       222 Query     SET NAMES utf8mb4
2023-05-25T09:21:43.169184Z       223 Query     SET NAMES utf8mb4
2023-05-25T09:21:43.169412Z       222 Query     SET autocommit=true,time_zone='+00:00'
2023-05-25T09:21:43.169676Z       223 Query     SET autocommit=true,time_zone='+00:00'
2023-05-25T09:21:43.169836Z       222 Query     select @@version, @@time_zone, @@lower_case_table_names, @@net_write_timeout
2023-05-25T09:21:43.170241Z       223 Query     select @@version, @@time_zone, @@lower_case_table_names, @@net_write_timeout
2023-05-25T09:21:43.170542Z       222 Query     show grants for current_user()
2023-05-25T09:21:43.170980Z       223 Query     show variables where Variable_name IN ('character_set_server','collation_server')
2023-05-25T09:21:43.174375Z       222 Query     SELECT @@SERVER_UUID /*dtle*/
2023-05-25T09:21:43.174978Z       222 Query     CREATE DATABASE IF NOT EXISTS dtle
2023-05-25T09:21:43.177944Z       222 Query     SHOW TABLES FROM dtle LIKE 'gtid_executed_temp_%'
2023-05-25T09:21:43.178984Z       223 Query     select @@sql_mode
2023-05-25T09:21:43.180261Z       222 Query     SHOW TABLES FROM dtle LIKE 'gtid_executed_%'
2023-05-25T09:21:43.180533Z       221 Query     SET character_set_server = utf8mb4, collation_server = utf8mb4_0900_ai_ci
2023-05-25T09:21:43.181043Z       221 Query     SET @@session.sql_mode = 'ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION'
2023-05-25T09:21:43.181503Z       221 Query     set @@session.foreign_key_checks = 0 /*dtle*/
2023-05-25T09:21:43.182262Z       221 Query     SET character_set_server = utf8mb4, collation_server = utf8mb4_0900_ai_ci
2023-05-25T09:21:43.182309Z       224 Connect   root@192.168.60.155 on  using TCP/IP
2023-05-25T09:21:43.182775Z       221 Query     SET @@session.sql_mode = 'ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION'
2023-05-25T09:21:43.183364Z       224 Query     SELECT @@max_allowed_packet
2023-05-25T09:21:43.183701Z       221 Prepare   delete from dtle.gtid_executed_v4 where job_name = ? and hex(source_uuid) = ?
2023-05-25T09:21:43.183993Z       224 Query     SET NAMES utf8mb4
2023-05-25T09:21:43.184174Z       221 Prepare   replace into dtle.gtid_executed_v4 (job_name,source_uuid,gtid,gtid_set) values (?, ?, ?, null)
2023-05-25T09:21:43.184539Z       224 Query     SET autocommit=true,time_zone='+00:00',transaction_isolation='REPEATABLE-READ'
2023-05-25T09:21:43.185053Z       224 Query     show master status /*dtle*/
2023-05-25T09:21:43.185095Z       222 Prepare   SELECT source_uuid,gtid,gtid_set FROM dtle.gtid_executed_v4 where job_name=?
2023-05-25T09:21:43.185624Z       222 Execute   SELECT source_uuid,gtid,gtid_set FROM dtle.gtid_executed_v4 where job_name='db_sync_columns'
2023-05-25T09:21:43.186651Z       225 Connect   root@192.168.60.155 on  using TCP/IP
2023-05-25T09:21:43.186998Z       222 Close stmt
2023-05-25T09:21:43.187449Z       225 Query     SELECT @@max_allowed_packet
2023-05-25T09:21:43.188004Z       225 Query     SET NAMES utf8mb4
2023-05-25T09:21:43.188487Z       225 Query     SET time_zone='+00:00',transaction_isolation='REPEATABLE-READ',autocommit=true
2023-05-25T09:21:43.188950Z       225 Query     START TRANSACTION
2023-05-25T09:21:43.189416Z       225 Query     START TRANSACTION WITH CONSISTENT SNAPSHOT
2023-05-25T09:21:43.189934Z       225 Query     show master status /*dtle*/
2023-05-25T09:21:43.190480Z       225 Query     select @@wait_timeout
2023-05-25T09:21:43.191036Z       225 Query     SHOW DATABASES
2023-05-25T09:21:43.192767Z       225 Query     SHOW CREATE SCHEMA IF NOT EXISTS `yqtest`
2023-05-25T09:21:43.193303Z       225 Query     SHOW TABLES IN `yqtest`
2023-05-25T09:21:43.195004Z       220 Query     show table status from `yqtest` like 'sbtest2'
2023-05-25T09:21:43.198492Z       220 Query     show columns from `yqtest`.`sbtest2`
2023-05-25T09:21:43.202941Z       220 Prepare   SELECT UNIQUES.INDEX_NAME, UNIQUES.COLUMN_NAMES, LOCATE('auto_increment', EXTRA) > 0 as is_auto_increment, has_nullable
FROM INFORMATION_SCHEMA.COLUMNS
     INNER JOIN
     (SELECT TABLE_SCHEMA, TABLE_NAME, INDEX_NAME,
             GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX ASC) AS COLUMN_NAMES,
             SUBSTRING_INDEX(GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX ASC), ',', 1) AS FIRST_COLUMN_NAME,
             SUM(NULLABLE='YES') > 0 AS has_nullable
      FROM INFORMATION_SCHEMA.STATISTICS
      WHERE NON_UNIQUE=0 AND TABLE_SCHEMA = ? AND TABLE_NAME = ?
      GROUP BY TABLE_SCHEMA,TABLE_NAME,INDEX_NAME) AS UNIQUES
     ON (COLUMNS.TABLE_SCHEMA = UNIQUES.TABLE_SCHEMA
         AND COLUMNS.TABLE_NAME = UNIQUES.TABLE_NAME
         AND COLUMNS.COLUMN_NAME = UNIQUES.FIRST_COLUMN_NAME)
WHERE COLUMNS.TABLE_SCHEMA = ? AND COLUMNS.TABLE_NAME = ?
2023-05-25T09:21:43.203075Z       220 Execute   SELECT UNIQUES.INDEX_NAME, UNIQUES.COLUMN_NAMES, LOCATE('auto_increment', EXTRA) > 0 as is_auto_increment, has_nullable
FROM INFORMATION_SCHEMA.COLUMNS
     INNER JOIN
     (SELECT TABLE_SCHEMA, TABLE_NAME, INDEX_NAME,
             GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX ASC) AS COLUMN_NAMES,
             SUBSTRING_INDEX(GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX ASC), ',', 1) AS FIRST_COLUMN_NAME,
             SUM(NULLABLE='YES') > 0 AS has_nullable
      FROM INFORMATION_SCHEMA.STATISTICS
      WHERE NON_UNIQUE=0 AND TABLE_SCHEMA = 'yqtest' AND TABLE_NAME = 'sbtest2'
      GROUP BY TABLE_SCHEMA,TABLE_NAME,INDEX_NAME) AS UNIQUES
     ON (COLUMNS.TABLE_SCHEMA = UNIQUES.TABLE_SCHEMA
         AND COLUMNS.TABLE_NAME = UNIQUES.TABLE_NAME
         AND COLUMNS.COLUMN_NAME = UNIQUES.FIRST_COLUMN_NAME)
WHERE COLUMNS.TABLE_SCHEMA = 'yqtest' AND COLUMNS.TABLE_NAME = 'sbtest2'
2023-05-25T09:21:43.205785Z       220 Close stmt
2023-05-25T09:21:43.206602Z       220 Prepare   select
                                *
                        from
                                information_schema.columns
                        where
                                table_schema=?
                                and table_name=?
2023-05-25T09:21:43.206935Z       220 Execute   select
                                *
                        from
                                information_schema.columns
                        where
                                table_schema='yqtest'
                                and table_name='sbtest2'
2023-05-25T09:21:43.208219Z       220 Close stmt
2023-05-25T09:21:43.208451Z       225 Query     show create table `yqtest`.`sbtest2`
2023-05-25T09:21:43.211033Z       221 Query     set @@session.foreign_key_checks = 0 /*dtle*/
2023-05-25T09:21:43.211814Z       221 Query     CREATE DATABASE /*!32312 IF NOT EXISTS*/ `yqtest` /*!40100 DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci */ /*!80016 DEFAULT ENCRYPTION='N' */
2023-05-25T09:21:43.212127Z       225 Prepare   select table_rows from information_schema.tables where table_schema = ? and table_name = ?
2023-05-25T09:21:43.212444Z       225 Execute   select table_rows from information_schema.tables where table_schema = 'yqtest' and table_name = 'sbtest2'
2023-05-25T09:21:43.213252Z       225 Close stmt
2023-05-25T09:21:43.213391Z       225 Query     show create table `yqtest`.`sbtest2`
2023-05-25T09:21:43.214950Z       225 Query     SELECT * FROM `yqtest`.`sbtest2` where (true) and (true) order by `id` asc LIMIT 5000
2023-05-25T09:21:43.235875Z       225 Query     SELECT * FROM `yqtest`.`sbtest2` where (((`id` > '5000'))) and (true) order by `id` asc LIMIT 5000
2023-05-25T09:21:43.272544Z       225 Query     SELECT * FROM `yqtest`.`sbtest2` where (((`id` > '10000'))) and (true) order by `id` asc LIMIT 5000
...
2023-05-25T09:21:44.457690Z       225 Query     SELECT * FROM `yqtest`.`sbtest2` where (((`id` > '245000'))) and (true) order by `id` asc LIMIT 5000
2023-05-25T09:21:46.021098Z       165 Query     select * from information_schema.processlist where command not in ('Daemon','Binlog Dump GTID') and id != connection_id()
2023-05-25T09:21:54.476231Z       225 Query     select 1
2023-05-25T09:22:04.478720Z       225 Query     select 1
2023-05-25T09:22:14.475864Z       225 Query     select 1

关键在于开启了事务,还在执行中执行了DDL

2023-05-25T09:21:43.188950Z       225 Query     START TRANSACTION
2023-05-25T09:21:43.189416Z       225 Query     START TRANSACTION WITH CONSISTENT SNAPSHOT
2023-05-25T09:21:43.189934Z       225 Query     show master status /*dtle*/

疑问:

既然在同步初始阶段就开启了 Binlog Dump GTID,而回放机制是 replace into ,还有没有必要开启 SNAPSHOT 获取一致性读

@ghost
Copy link

ghost commented May 26, 2023

源端和目标端为同一MySQL实例时有该问题。

snapshot事务从长期来看是可以去除的。但尚需考虑各种情况。

@ghost
Copy link

ghost commented Jun 9, 2023

MySQL 8.0.32 为复现该问题, 可能是8.0允许 snapshot存在是执行此类DDL.

Case 2 但触发如下逻辑导致增量不复制

[DEBUG] client.driver_mgr.dtle: skipping a binlogEntry with the same sid as target.: driver=dtle @module=dtle.applier job=a1 sid=00003307-1111-1111-1111-111111111111 timestamp="2023-06-09T18:30:38.958+0800"

简单移除sid检测机制会导致无限循环.

	if txSid == a.MySQLServerUuid {
		a.logger.Debug("skipping a binlogEntry with the same sid as target.", "sid", txSid)
		skipEntry = true // *HERE*
	} else if a.fwdExtractor != nil {

现状

  • 凡是源MySQL的TX, 即时不在复制范围, 也会在目标端形成一个空TX
  • 当TX sid和目标端为同一sid时, 跳过TX
  • 如果源端和目标端是同一MySQL, 若不进行上述跳过, 则会形成循环复制

方案1

  • 当TX sid和目标端为同一sid, 且含有dtle.gtid_executed事务/或DDL dtle tag时, 才跳过TX

问题

  • 会形成gtid断点

gtid断点方案: 填补in-memory gtid断点, 上传consul (非实时更新)

  • 任务重启时, 从consul gtid继续
  • 之前被跳过的gtid, 再次跳过

@ghost
Copy link

ghost commented Jun 26, 2023

任务初始建表会造成循环复制. 方案

  • 先建表, 等目标端建表完成后, 再获取gtid
  • 需要机制: dest task建表完成后通知src task
    • by consul, 或
    • by nats
  • 考虑使用consul kv/全量进度来表示建表完成 Pause / resume for full copy #365

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant