Automatic Recovery after Extended Primary/Standby Replication Delay

2024-06-24 09:59:13

Scenario Description

The primary/standby replication delay of a MySQL instance is long, keeps increasing for a period of time, and is then automatically recovered during off-peak hours.

Possible Causes

Check whether the database primary/standby replication delay increases in a certain period and keeps increasing for a period of time, and then is automatically recovered during off-peak hours. This scenario may be caused by a large transaction, peak hours, or DDL operations. View the slow logs from the console, analyze slow SQL statements, and observe transaction records to see if there are large transactions or DDL operations.

Check the console and analyze the slow query logs and statements. When using the index to optimize the SQL statement, consider the service peaks. After checking the event records on the console, it was found that the service was at its peak hours with a great number of insert operations, while DDL operations were being performed such as adding or deleting the indexes. The table contains tens of millions of data and the execution took a long time. As a result, the replication delay kept increasing when the DDL operation was replayed on the slave node. After the replay was complete, the replication delay returned to normal.

Solution

• This issue is not an exception. Wait until the DDL operation is complete and the delay will be automatically restored.

• It is recommended that you add and delete indexes during off-peak hours because index changes involve the entire table.

Relational Database Service (MySQL)

Automatic Recovery after Extended Primary/Standby Replication Delay

Scenario Description

Possible Causes

Solution