Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use multiStatement to apply DML #1462

Merged
merged 6 commits into from
Oct 25, 2024
Merged

Use multiStatement to apply DML #1462

merged 6 commits into from
Oct 25, 2024

Conversation

meiji163
Copy link
Contributor

@meiji163 meiji163 commented Oct 23, 2024

Description

This PR changes the Applier to batch together DML statements into multi-statements in order to reduce the number of network trips in ApplyDMLEventQueries.

During testing on our --test-on-replica setup this provided around 10% increase in rows/second throughput

Rows Modified (rate); dml-batch-size=10
rows modified (rate); dml-batch-size=10

In case this PR introduced Go code changes:

  • contributed code is using same conventions as original code
  • script/cibuild returns with no formatting errors, build errors or unit test errors.

Comment on lines +1222 to +1225
tx, err := conn.BeginTx(ctx, nil)
if err != nil {
return err
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could have the BEGIN statement be the first in the list of the multistatement and avoid the network roundtrip for this.

SET SESSION can happen before the BEGIN now. The reason why it was done inside of the transaction was to ensure that it happens on the same connection, but we're now explicitly checking out the connection out of the database pool.

// each DML is either a single insert (delta +1), update (delta +0) or delete (delta -1).
// multiplying by the rows actually affected (either 0 or 1) will give an accurate row delta for this DML event
for i, rowsAffected := range mysqlRes.AllRowsAffected() {
totalDelta += rowDeltas[i] * rowsAffected
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we add BEGIN to the multi statement, this calculation here will need to be updated accordingly.

@meiji163 meiji163 marked this pull request as ready for review October 24, 2024 19:05
@arthurschreiber
Copy link
Member

arthurschreiber commented Oct 25, 2024

Screenshot 2024-10-25 at 12 01 51

Another screenshot showing the improvement over a longer time period.

@meiji163 Great work!

@arthurschreiber arthurschreiber merged commit 7c30fb0 into master Oct 25, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants