---
title: Data Migration Task Configuration Options
summary: This document introduces the configuration options that apply to Data Migration tasks.
category: tools
---

# Data Migration Task Configuration Options

This document introduces the configuration options that apply to Data Migration tasks.

## task-mode

  • String (full/incremental/all)

  • The task mode of data migration to be executed

  • Default value: all

    • full: Only makes a full backup of the upstream database and then imports the full data to the downstream database.
    • incremental: Only synchronizes the incremental data of the upstream database to the downstream database using the binlog.
    • all: full + incremental. Makes a full backup of the upstream database, imports the full data to the downstream database, and then uses the binlog to perform incremental synchronization to the downstream database, starting from the binlog position/GTID exported during the full backup.

## Routing rule

```yaml
# `schema-pattern`/`table-pattern` uses the wildcard matching rule
schema level:
    schema-pattern: "test_*"
    target-schema: "test"

table level:
    schema-pattern: "test_*"
    table-pattern: "t_*"
    target-schema: "test"
    target-table: "t"
```

Description: Synchronizes the upstream table data that matches schema-pattern/table-pattern to the downstream target-schema/target-table. You can set the routing rule at the schema/table level.

Taking the above code block as an example:

  • Schema level: Synchronizes all the upstream tables that match the test_* schema to the downstream test schema.

    For example, schema: test_1 - tables [a, b, c] => schema:test - tables [a, b, c]

  • Table level: Synchronizes the upstream tables that match the t_* pattern in the test_* matched schemas to the table t in the downstream schema test (see the matching sketch after the notes below).

Notes:

  • The table level rule has a higher priority than the schema level rule.
  • At most one routing rule can be set at each level.
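
Taking the table-level rule above as an example, the following Go sketch shows how a matched upstream table is renamed. It is illustrative only: the routeRule type and route function are assumptions made for this example, and path.Match is used as a stand-in for DM's wildcard matcher.

```go
package main

import (
	"fmt"
	"path"
)

// routeRule mirrors the table-level routing rule above (illustrative, not DM's internal type).
type routeRule struct {
	schemaPattern, tablePattern string
	targetSchema, targetTable   string
}

// route returns the downstream schema/table for an upstream table.
func route(r routeRule, schema, table string) (string, string, bool) {
	if ok, _ := path.Match(r.schemaPattern, schema); !ok {
		return schema, table, false // no rule applies
	}
	if r.tablePattern == "" {
		// Schema-level rule: only the schema is renamed.
		return r.targetSchema, table, true
	}
	if ok, _ := path.Match(r.tablePattern, table); !ok {
		return schema, table, false
	}
	return r.targetSchema, r.targetTable, true
}

func main() {
	rule := routeRule{"test_*", "t_*", "test", "t"}
	schema, table, matched := route(rule, "test_1", "t_2")
	fmt.Println(schema, table, matched) // test t true
}
```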

## Black and white lists filtering rule

```yaml
instance:
    do-dbs: ["~^test.*", "do"]         # Starts with "~", indicating it is a regular expression
    ignore-dbs: ["mysql", "ignored"]
    do-tables:
    - db-name: "~^test.*"
      tbl-name: "~^t.*"
    - db-name: "do"
      tbl-name: "do"
    ignore-tables:
    - db-name: "do"
      tbl-name: "do"
```

The black and white lists filtering rule of the upstream database instances is similar to MySQL replication-rules-db/replication-rules-table.

The filter process is as follows:

  1. Filter at the schema level:

    • If do-dbs is not empty, judge whether a matched schema exists in do-dbs.

      • If yes, continue to filter at the table level.
      • If not, ignore it and exit.
    • If do-dbs is empty and ignore-dbs is not empty, judge whether a matched schema exists in ignore-dbs.

      • If yes, ignore it and exit.
      • If not, continue to filter at the table level.
    • If both do-dbs and ignore-dbs are empty, continue to filter at the table level.

  2. Filter at the table level:

    1. If do-tables is not empty, judge whether a matched rule exists in do-tables.

      • If yes, exit and execute the statement.
      • If not, continue to the next step.
    2. If ignore-tables is not empty, judge whether a matched rule exists in ignore-tables.

      • If yes, ignore it and exit.
      • If not, continue to the next step.
    3. If do-tables is not empty, ignore it and exit. Otherwise, exit and execute the statement.
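
A minimal Go sketch of this two-level decision follows. It is illustrative only: the filterRules type, matchName, and keep are assumptions made for this example (not DM's actual code), and matchName implements the convention above that a pattern starting with "~" is a regular expression.

```go
package main

import (
	"fmt"
	"regexp"
)

// tableRule and filterRules mirror the configuration block above (illustrative types).
type tableRule struct{ dbName, tblName string }

type filterRules struct {
	doDBs, ignoreDBs       []string
	doTables, ignoreTables []tableRule
}

// matchName treats a pattern starting with "~" as a regular expression;
// any other pattern is compared literally.
func matchName(pattern, name string) bool {
	if len(pattern) > 0 && pattern[0] == '~' {
		ok, _ := regexp.MatchString(pattern[1:], name)
		return ok
	}
	return pattern == name
}

func matchAny(patterns []string, name string) bool {
	for _, p := range patterns {
		if matchName(p, name) {
			return true
		}
	}
	return false
}

func matchTable(rules []tableRule, db, tbl string) bool {
	for _, r := range rules {
		if matchName(r.dbName, db) && matchName(r.tblName, tbl) {
			return true
		}
	}
	return false
}

// keep reports whether a statement on db.tbl passes the filter,
// following the schema-level and table-level steps described above.
func keep(f filterRules, db, tbl string) bool {
	// 1. Filter at the schema level.
	switch {
	case len(f.doDBs) > 0:
		if !matchAny(f.doDBs, db) {
			return false
		}
	case len(f.ignoreDBs) > 0:
		if matchAny(f.ignoreDBs, db) {
			return false
		}
	}
	// 2. Filter at the table level (do-tables is checked before ignore-tables).
	if len(f.doTables) > 0 && matchTable(f.doTables, db, tbl) {
		return true
	}
	if len(f.ignoreTables) > 0 && matchTable(f.ignoreTables, db, tbl) {
		return false
	}
	return len(f.doTables) == 0
}

func main() {
	f := filterRules{
		doDBs:        []string{"~^test.*", "do"},
		ignoreDBs:    []string{"mysql", "ignored"},
		doTables:     []tableRule{{"~^test.*", "~^t.*"}, {"do", "do"}},
		ignoreTables: []tableRule{{"do", "do"}},
	}
	fmt.Println(keep(f, "test_1", "t_1")) // true: matches do-dbs and do-tables
	fmt.Println(keep(f, "do", "do"))      // true: do-tables is matched before ignore-tables
	fmt.Println(keep(f, "mysql", "user")) // false: do-dbs is not empty and "mysql" does not match it
}
```

Running it against the example configuration above shows, in particular, that a table matched by both do-tables and ignore-tables (such as do.do) is still replicated, because do-tables is evaluated first.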

## Filtering rules of binlog events

```yaml
# table level
user-filter-1:
    schema-pattern: "test_*"     # `schema-pattern`/`table-pattern` uses the wildcard matching rule.
    table-pattern: "t_*"
    events: ["truncate table", "drop table"]
    sql-pattern: ["^DROP\\s+PROCEDURE", "^CREATE\\s+PROCEDURE"]
    action: Ignore

# schema level
user-filter-2:
    schema-pattern: "test_*"
    events: ["All DML"]
    action: Do
```

Description: Configures the filtering rules for binlog events and DDL SQL statements of the upstream tables that match schema-pattern/table-pattern.

  • events: the binlog event array

    | Events | Type | Description |
    | :--- | :--- | :--- |
    | all | | Includes all the events below |
    | all dml | | Includes all DML events below |
    | all ddl | | Includes all DDL events below |
    | none | | Includes none of the events below |
    | none ddl | | Includes none of the DDL events below |
    | none dml | | Includes none of the DML events below |
    | insert | DML | The INSERT DML event |
    | update | DML | The UPDATE DML event |
    | delete | DML | The DELETE DML event |
    | create database | DDL | The CREATE DATABASE DDL event |
    | drop database | DDL | The DROP DATABASE DDL event |
    | create table | DDL | The CREATE TABLE DDL event |
    | create index | DDL | The CREATE INDEX DDL event |
    | drop table | DDL | The DROP TABLE DDL event |
    | truncate table | DDL | The TRUNCATE TABLE DDL event |
    | rename table | DDL | The RENAME TABLE DDL event |
    | drop index | DDL | The DROP INDEX DDL event |
    | alter table | DDL | The ALTER TABLE DDL event |
  • sql-pattern

    • Filters a specific DDL SQL statement.
    • The matching rule supports using a regular expression, for example, "^DROP\\s+PROCEDURE".

Note: If sql-pattern is empty, no filtering operation is performed. For the filtering rules, see the action description.

  • action

    • String (Do/Ignore)

    • For the binlog events or DDL statements of tables that match schema-pattern/table-pattern, judge whether the event or statement matches the events or sql-pattern of the rule, and then apply action, as shown in the sketch after this list:

      • Black list (action = Ignore): if it matches, execute Ignore; otherwise, execute Do.
      • White list (action = Do): if it matches, execute Do; otherwise, execute Ignore.
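
The following Go sketch shows how action turns a match into a black-list or white-list decision. It is illustrative only: the binlogFilterRule type and replicate function are assumptions made for this example, and the all/none shortcuts from the events table are not expanded here.

```go
package main

import (
	"fmt"
	"regexp"
)

// binlogFilterRule mirrors the rule format above (illustrative, not DM's actual API).
type binlogFilterRule struct {
	events     []string // event names, compared literally in this sketch
	sqlPattern []string // regular expressions, e.g. `^DROP\s+PROCEDURE`
	action     string   // "Do" (white list) or "Ignore" (black list)
}

// replicate reports whether an event/statement of a table that already matched
// schema-pattern/table-pattern should be replicated downstream.
func replicate(r binlogFilterRule, event, sql string) bool {
	matched := false
	for _, e := range r.events {
		if e == event {
			matched = true
		}
	}
	for _, p := range r.sqlPattern {
		if ok, _ := regexp.MatchString(p, sql); ok {
			matched = true
		}
	}
	if r.action == "Ignore" {
		return !matched // black list: matched statements are filtered out
	}
	return matched // white list (Do): only matched statements are replicated
}

func main() {
	rule := binlogFilterRule{
		events:     []string{"truncate table", "drop table"},
		sqlPattern: []string{`^DROP\s+PROCEDURE`, `^CREATE\s+PROCEDURE`},
		action:     "Ignore",
	}
	fmt.Println(replicate(rule, "drop table", "DROP TABLE t_1"))     // false: black-listed event
	fmt.Println(replicate(rule, "create table", "CREATE TABLE t_1")) // true: not matched, so it is kept
}
```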

## Column mapping rule

```yaml
instance-1:
    schema-pattern: "test_*"    # `schema-pattern`/`table-pattern` uses the wildcard matching rule
    table-pattern: "t_*"
    expression: "partition id"
    source-column: "id"
    target-column: "id"
    arguments: ["1", "test_", "t_"]
instance-2:
    schema-pattern: "test_*"
    table-pattern: "t_*"
    expression: "partition id"
    source-column: "id"
    target-column: "id"
    arguments: ["2", "test_", "t_"]
```

Description: the rules for mapping the column values of the schema-pattern/table-pattern matched tables in upstream database instances. It is used to resolve the conflicts between auto-increment primary keys of sharded tables.

  • source-column, target-column: expression takes the data of source-column as its input and writes the computed result to target-column.

  • expression: The expression used to convert the column data. Currently, only the following built-in expression is supported:

    • partition id

      • You need to set arguments to [instance_id, prefix of schema, prefix of table].

        • schema name = arguments[1] + schema ID; the schema ID is the numeric suffix of the schema name
        • table name = arguments[2] + table ID; the table ID is the numeric suffix of the table name
        • If arguments[0] == "", the partition ID takes up 0 bits in the figure below; otherwise, it takes up 4 bits (by default)
        • If arguments[1] == "", the schema ID takes up 0 bits in the figure below; otherwise, it takes up 7 bits (by default)
        • If arguments[2] == "", the table ID takes up 0 bits in the figure below; otherwise, it takes up 8 bits (by default)
        • The origin ID is the value of the auto-increment ID column of a row in the table (see the sketch at the end of this section)

        (Figure: partition ID bit allocation)

      • Restrictions:

        • It is only applicable to bigint columns.
        • The instance ID value must be in the range [0, 15] (4 bits by default).
        • The schema ID value must be in the range [0, 127] (7 bits by default).
        • The table ID value must be in the range [0, 255] (8 bits by default).
        • The origin ID value must be in the range [0, 17592186044415] (44 bits by default).
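
Assuming none of the arguments is empty (so every part keeps its default width) and the bit layout described above (from high bits to low: 1 sign bit, 4-bit instance ID, 7-bit schema ID, 8-bit table ID, 44-bit origin ID), the partition id expression can be sketched in Go as follows. The partitionID function is an assumption made for this example, not DM's actual implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// partitionID composes a new bigint value from the instance ID, the numeric
// suffixes of the schema and table names, and the original auto-increment ID.
func partitionID(instanceID int64, schemaPrefix, schema, tablePrefix, table string, originID int64) (int64, error) {
	schemaID, err := strconv.ParseInt(strings.TrimPrefix(schema, schemaPrefix), 10, 64)
	if err != nil {
		return 0, err
	}
	tableID, err := strconv.ParseInt(strings.TrimPrefix(table, tablePrefix), 10, 64)
	if err != nil {
		return 0, err
	}
	// origin ID: bits 0-43, table ID: bits 44-51, schema ID: bits 52-58,
	// instance ID: bits 59-62; bit 63 (the sign bit) stays 0.
	return instanceID<<59 | schemaID<<52 | tableID<<44 | originID, nil
}

func main() {
	// arguments: ["1", "test_", "t_"]; upstream row: test_2.t_3 with auto-increment ID 45.
	id, err := partitionID(1, "test_", "test_2", "t_", "t_3", 45)
	fmt.Println(id, err) // the composed value is unique across instances, schemas, and tables
}
```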