I need to delete a large amount of data, roughly 100GB, from a production database. I'd like to minimize downtime if possible.
My deletion criteria would be something like:
DELETE FROM RELEASES WHERE USER.ID = 5 AND UPDATED_AT < 100
What is the best way to delete it?
- Build an index?
- Write a sequential script that pages through the table and deletes 1,000 rows at a time? (A rough sketch of this approach is below.)
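For reference, a minimal sketch of the paged-delete approach, assuming MySQL and a user_id column standing in for the USER.ID criterion above (the names are illustrative, not from a real schema):

# Run repeatedly until it reports "0 rows affected"; each pass deletes at
# most 1000 rows, so any locks it takes are short-lived.
DELETE FROM RELEASES
 WHERE user_id = 5
   AND UPDATED_AT < 100
 LIMIT 1000;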
Answer 1
You can minimize almost all of the downtime by keeping the table from being locked.
Use INSERT ... SELECT to migrate the data you want to keep into a temporary table.
Rename the tables.
You will need to account for any INSERTs that hit the original table while the INSERT ... SELECT is running.
Finally, drop the table containing the data you don't want.
Note that if innodb_file_per_table is not enabled in your configuration, the allocated disk space will not be released.
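You can check whether it is enabled with a standard variables query:

SHOW VARIABLES LIKE 'innodb_file_per_table';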
You will also need to make special considerations for your own application and situation.
Here is the procedure I wrote using this method.
# Temp table, recreating working_table_
#
CREATE TABLE working_table_temp_ LIKE working_table_;
# Set the AUTO_INCREMENT ahead if there's an auto_increment field.
#
ALTER TABLE working_table_temp_ AUTO_INCREMENT = 15000000;
# Testing inserts and selects while this query was running succeeded.
# Verified no active lock.
#
# $NUM is the id cutoff: rows with id > $NUM are the data being kept.
#
INSERT INTO working_table_temp_ SELECT * FROM working_table_ WHERE id > $NUM;
RENAME TABLE working_table_ TO working_table_old_;
RENAME TABLE working_table_temp_ TO working_table_;
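# Note (not part of the original procedure): MySQL can also do both renames
# in a single atomic statement, which avoids the brief moment where
# working_table_ does not exist:
#
#   RENAME TABLE working_table_ TO working_table_old_,
#                working_table_temp_ TO working_table_;
#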
# Verify that all rows were caught
#
# The last row < 15000000 in both tables should be identical.
#
SELECT * FROM working_table_ where id < 15000000 order by id desc limit 5;
SELECT * FROM working_table_old_ where id < 15000000 order by id desc limit 5;
# If not, we need to move them!
#
# This query will need to be modified; use the highest id that actually made it into the new table.
#
INSERT INTO working_table_ SELECT * FROM working_table_old_ WHERE id > 138376577;
# Verify application functionality, if necessary.
#
# LAST CHANCE FOR BACKOUT !!!
#
# Once verified, get rid of the old data to free up some space.
#
DROP TABLE working_table_old_;
# Exit the mysql client and confirm from the shell that the space was reclaimed.
#
^D
df -h
## BACKOUT ##
RENAME TABLE working_table_ TO working_table_new_;
RENAME TABLE working_table_old_ TO working_table_;