Skip to content

数据库维护

本文档介绍 Ape-Volo-Admin 项目的数据库日常维护策略,包括备份恢复、性能监控、数据清理、定期维护等。


� 目录


�🔧 数据库维护

1. 备份与恢复

1.1 全量备份脚本

MySQL 全量备份(Linux/Mac)

bash
#!/bin/bash
# backup_full.sh - MySQL 全量备份脚本

# 配置参数
DB_MAIN="ape_volo_admin_main"     # 业务主库
DB_LOG="ape_volo_admin_log"       # 日志库
DB_USER="root"
DB_PASS="your_password"
BACKUP_DIR="/backup/mysql"
DATE=$(date +"%Y%m%d_%H%M%S")

# 删除3天前的增量备份
find $BACKUP_DIR -name "incremental_*.sql.gz" -mtime +3 -delete

echo "增量备份完成: incremental_$DATE.sql.gz"

**启用 Binlog(my.cnf/my.ini)**

expire_logs_days=7        # 自动删除7天前的binlog

1.3 数据库恢复

全量恢复

bash
# Linux/Mac
gunzip < ape_volo_admin_main_full_20250107_120000.sql.gz | mysql -uroot -p ape_volo_admin_main

# Windows PowerShell
Expand-Archive -Path "ape_volo_admin_main_full_20250107_120000.zip" -DestinationPath "."
& "C:\Program Files\MySQL\MySQL Server 8.0\bin\mysql.exe" -uroot -p ape_volo_admin_main < ape_volo_admin_main_full_20250107_120000.sql

增量恢复(基于 binlog)

bash
# 先恢复全量备份,再恢复增量
mysql -uroot -p ape_volo_admin_main < full_backup.sql
mysqlbinlog incremental_20250107_120000.sql | mysql -uroot -p

指定时间点恢复(PITR)

bash
# 恢复到指定时间点(如发现误删数据前)
mysqlbinlog --start-datetime="2025-01-07 10:00:00" \
  --stop-datetime="2025-01-07 11:30:00" \
  mysql-bin.000001 mysql-bin.000002 | mysql -uroot -p

1.4 SQL Server 备份脚本

sql
-- 全量备份
BACKUP DATABASE ape_volo_admin_main
TO DISK = 'E:\Backup\ape_volo_admin_main_full_20250107.bak'
WITH FORMAT, COMPRESSION, STATS = 10;

-- 差异备份
BACKUP DATABASE ape_volo_admin_main
TO DISK = 'E:\Backup\ape_volo_admin_main_diff_20250107.bak'
WITH DIFFERENTIAL, COMPRESSION, STATS = 10;

-- 事务日志备份
BACKUP LOG ape_volo_admin_main
TO DISK = 'E:\Backup\ape_volo_admin_main_log_20250107.trn'
WITH COMPRESSION, STATS = 10;

-- 恢复数据库
RESTORE DATABASE ape_volo_admin_main
FROM DISK = 'E:\Backup\ape_volo_admin_main_full_20250107.bak'
WITH NORECOVERY;

RESTORE DATABASE ape_volo_admin_main
FROM DISK = 'E:\Backup\ape_volo_admin_main_diff_20250107.bak'
WITH RECOVERY;

2. 性能监控

2.1 慢查询监控

开启慢查询日志

sql
-- MySQL 配置慢查询
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 2;  -- 超过2秒的查询记录

-- 查看慢查询日志路径
SHOW VARIABLES LIKE 'slow_query_log_file';

-- 查看当前配置
SHOW VARIABLES LIKE '%slow_query%';
SHOW VARIABLES LIKE 'long_query_time';

慢查询统计分析

sql
-- 查看最慢的10条查询(按平均耗时排序)
SELECT
    SCHEMA_NAME AS '数据库',
    LEFT(DIGEST_TEXT, 100) AS 'SQL摘要',
    COUNT_STAR AS '执行次数',
    ROUND(AVG_TIMER_WAIT/1000000000, 2) AS '平均耗时(秒)',
    ROUND(MAX_TIMER_WAIT/1000000000, 2) AS '最大耗时(秒)',
    ROUND(SUM_TIMER_WAIT/1000000000, 2) AS '总耗时(秒)',
    ROUND(SUM_ROWS_EXAMINED/COUNT_STAR, 0) AS '平均扫描行数',
    ROUND(SUM_ROWS_SENT/COUNT_STAR, 0) AS '平均返回行数'
FROM performance_schema.events_statements_summary_by_digest
WHERE SCHEMA_NAME IN ('ape_volo_admin_main', 'ape_volo_admin_log')
    AND DIGEST_TEXT IS NOT NULL
ORDER BY AVG_TIMER_WAIT DESC
LIMIT 10;

慢查询日志分析工具

bash
# Linux/Mac: 使用 mysqldumpslow 分析
mysqldumpslow -s at -t 10 /var/log/mysql/slow-query.log

# 参数说明:
# -s at: 按平均查询时间排序
# -s c: 按查询次数排序
# -s r: 按返回记录数排序
# -t 10: 显示前10条

# Windows PowerShell
Get-Content "C:\ProgramData\MySQL\MySQL Server 8.0\Data\slow-query.log" -Tail 50

2.2 连接池监控

MySQL 连接监控

sql
-- 查看当前连接数
SHOW STATUS LIKE 'Threads_connected';
SHOW STATUS LIKE 'Max_used_connections';

-- 查看连接详情
SELECT
    ID AS '会话ID',
    USER AS '用户',
    HOST AS '主机',
    DB AS '数据库',
    COMMAND AS '命令',
    TIME AS '耗时(秒)',
    STATE AS '状态',
    LEFT(INFO, 100) AS 'SQL语句'
FROM information_schema.PROCESSLIST
WHERE DB IN ('ape_volo_admin_main', 'ape_volo_admin_log')
ORDER BY TIME DESC;

-- 查看长时间运行的查询
SELECT
    ID,
    USER,
    HOST,
    DB,
    TIME AS '运行时间(秒)',
    LEFT(INFO, 200) AS 'SQL语句'
FROM information_schema.PROCESSLIST
WHERE COMMAND != 'Sleep'
    AND TIME > 30  -- 超过30秒的查询
ORDER BY TIME DESC;

-- 终止慢查询(谨慎操作)
KILL <会话ID>;

连接池配置建议

json
// appsettings.json - SqlSugar 连接配置
{
  "DataConnection": {
    "DbType": "MySql",
    "ConnectionString": "server=localhost;database=ape_volo_admin_main;uid=root;pwd=password;charset=utf8mb4;SslMode=None;Max Pool Size=100;Min Pool Size=5;Connection Timeout=30;",
    "IsAutoCloseConnection": true
  }
}

连接池参数说明

| 参数 | 说明 | 推荐值 | | Max Pool Size | 最大连接数 | 100-200 | | Min Pool Size | 最小连接数 | 5-10 | | Connection Timeout | 连接超时(秒) | 30 | | Connection Lifetime | 连接生命周期(秒) | 0(不限制) | | IsAutoCloseConnection | SqlSugar 自动释放连接 | true |

2.3 数据库性能指标

MySQL 性能监控

sql
-- 查看数据库大小
SELECT
    table_schema AS '数据库',
    ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) AS '大小(MB)',
    ROUND(SUM(data_length) / 1024 / 1024, 2) AS '数据大小(MB)'
FROM information_schema.tables
WHERE table_schema IN ('ape_volo_admin_main', 'ape_volo_admin_log')
GROUP BY table_schema;

-- 查看表大小(TOP 20)
SELECT
    table_schema AS '数据库',
    table_name AS '表名',
    table_rows AS '行数',
    ROUND(data_length / 1024 / 1024, 2) AS '数据大小(MB)',
    ROUND((data_length + index_length) / 1024 / 1024, 2) AS '总大小(MB)'
FROM information_schema.tables
WHERE table_schema IN ('ape_volo_admin_main', 'ape_volo_admin_log')
ORDER BY (data_length + index_length) DESC
LIMIT 20;

-- 查看缓冲池命中率
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
-- 计算命中率 = (1 - Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests) * 100
-- 目标:>= 99%

-- 查看QPS/TPS
SHOW GLOBAL STATUS LIKE 'Questions';
SHOW GLOBAL STATUS LIKE 'Com_commit';
SHOW GLOBAL STATUS LIKE 'Com_rollback';

3. 数据清理

3.1 日志表归档策略

按月分表的日志清理

sql
-- 查看当前所有日志分表
SELECT
    TABLE_NAME AS '表名',
    TABLE_ROWS AS '行数',
    ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 2) AS '大小(MB)',
    CREATE_TIME AS '创建时间'
WHERE TABLE_SCHEMA = 'ape_volo_admin_log'
    AND TABLE_NAME LIKE 'log_operate_%'
ORDER BY TABLE_NAME DESC;

-- 删除6个月前的日志分表(谨慎操作,建议先备份)
-- 示例:删除 2024年6月之前的操作日志
DROP TABLE IF EXISTS log_operate_202401;
DROP TABLE IF EXISTS log_operate_202402;
DROP TABLE IF EXISTS log_operate_202403;
DROP TABLE IF EXISTS log_operate_202404;
DROP TABLE IF EXISTS log_operate_202405;
DROP TABLE IF EXISTS log_operate_202406;

-- 或使用循环删除(生成SQL脚本)
SELECT CONCAT('DROP TABLE IF EXISTS ', TABLE_NAME, ';') AS 'SQL脚本'
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'ape_volo_admin_log'
    AND TABLE_NAME LIKE 'log_operate_%'
    AND TABLE_NAME < 'log_operate_202407';  -- 2024年7月之前

归档脚本(Linux/Mac)

bash
#!/bin/bash
# archive_logs.sh - 日志表归档脚本

DB_LOG="ape_volo_admin_log"
DB_USER="root"
DB_PASS="your_password"
ARCHIVE_DIR="/backup/mysql/archive"
ARCHIVE_MONTHS=6  # 归档6个月前的数据

# 创建归档目录
mkdir -p $ARCHIVE_DIR
# 计算6个月前的年月(格式:202401)
CUTOFF_DATE=$(date -d "$ARCHIVE_MONTHS months ago" +"%Y%m")

        $DB_LOG $TABLE > $ARCHIVE_DIR/${TABLE}.sql

    # 压缩
    gzip $ARCHIVE_DIR/${TABLE}.sql

    # 删除原表(谨慎操作,建议先验证归档文件)
    # mysql -u$DB_USER -p$DB_PASS -D$DB_LOG -e "DROP TABLE IF EXISTS $TABLE"

    echo "  - 已归档到: ${TABLE}.sql.gz"
done

echo "归档完成"

3.2 操作日志清理

清理指定天数的操作日志

sql
-- 清理30天前的操作日志(当前月份表)
DELETE FROM log_operate_202501
WHERE CreateTime < DATE_SUB(NOW(), INTERVAL 30 DAY);

-- 批量清理(生成SQL脚本)
SELECT CONCAT(
    'DELETE FROM ', TABLE_NAME,
    ' WHERE CreateTime < ''', DATE_SUB(NOW(), INTERVAL 30 DAY), ''';'
) AS 'SQL脚本'
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'ape_volo_admin_log'
    AND TABLE_NAME LIKE 'log_operate_%'
    AND TABLE_NAME >= CONCAT('log_operate_', DATE_FORMAT(DATE_SUB(NOW(), INTERVAL 6 MONTH), '%Y%m'));

3.3 软删除数据清理

清理软删除的数据(谨慎操作)

sql
-- 查看软删除数据统计
SELECT
    'sys_user' AS '表名',
    COUNT(*) AS '软删除记录数',
    MIN(UpdateTime) AS '最早删除时间',
    MAX(UpdateTime) AS '最近删除时间'
FROM sys_user
WHERE IsDeleted = 1

UNION ALL

SELECT
    'sys_role' AS '表名',
    COUNT(*) AS '软删除记录数',
    MIN(UpdateTime) AS '最早删除时间',
    MAX(UpdateTime) AS '最近删除时间'
FROM sys_role
WHERE IsDeleted = 1

UNION ALL

SELECT
    'sys_menu' AS '表名',
    COUNT(*) AS '软删除记录数',
    MIN(UpdateTime) AS '最早删除时间',
    MAX(UpdateTime) AS '最近删除时间'
FROM sys_menu
WHERE IsDeleted = 1;

-- 清理90天前软删除的用户数据
DELETE FROM sys_user
WHERE IsDeleted = 1
    AND UpdateTime < DATE_SUB(NOW(), INTERVAL 90 DAY);

-- 清理90天前软删除的角色数据
DELETE FROM sys_role
WHERE IsDeleted = 1
    AND UpdateTime < DATE_SUB(NOW(), INTERVAL 90 DAY);

-- 清理90天前软删除的菜单数据
DELETE FROM sys_menu
WHERE IsDeleted = 1
    AND UpdateTime < DATE_SUB(NOW(), INTERVAL 90 DAY);

3.4 Token 黑名单清理

sql
-- 清理过期的 Token 黑名单(已过期的 Token 无需保留)
DELETE FROM sys_token_blacklist
WHERE AccessTokenExpiresTime < NOW();

-- 定期清理(建议每天执行)
CREATE EVENT IF NOT EXISTS clean_expired_tokens
ON SCHEDULE EVERY 1 DAY
STARTS CURRENT_TIMESTAMP
DO
    DELETE FROM sys_token_blacklist
    WHERE AccessTokenExpiresTime < NOW();

3.5 文件记录清理

sql
-- 清理已删除文件的记录(假设文件已从存储中删除)
DELETE FROM sys_file_record
WHERE IsDeleted = 1
    AND UpdateTime < DATE_SUB(NOW(), INTERVAL 180 DAY);

-- 清理临时文件记录(根据业务逻辑调整)
DELETE FROM sys_file_record
WHERE FileName LIKE 'temp_%'
    AND CreateTime < DATE_SUB(NOW(), INTERVAL 7 DAY);

4. 定期维护任务

4.1 每日维护任务

Linux Crontab 配置

bash
# 编辑定时任务
crontab -e

# 每天凌晨2点执行全量备份
0 2 * * * /backup/scripts/backup_full.sh >> /var/log/mysql_backup.log 2>&1

# 每小时执行增量备份
0 * * * * /backup/scripts/backup_incremental.sh >> /var/log/mysql_backup.log 2>&1

# 每天凌晨3点清理过期Token黑名单
0 3 * * * mysql -uroot -pYourPassword -e "DELETE FROM ape_volo_admin_main.sys_token_blacklist WHERE AccessTokenExpiresTime < NOW();"

# 每天凌晨4点分析表统计信息
0 4 * * * mysql -uroot -pYourPassword ape_volo_admin_main -e "ANALYZE TABLE sys_user, sys_role, sys_menu, sys_department, sys_job;"

Windows 任务计划程序

powershell
# 创建每日备份任务
$Action = New-ScheduledTaskAction -Execute "PowerShell.exe" -Argument "-File E:\Backup\Scripts\backup_full.ps1"
$Trigger = New-ScheduledTaskTrigger -Daily -At 2am
$Principal = New-ScheduledTaskPrincipal -UserId "SYSTEM" -LogonType ServiceAccount -RunLevel Highest
Register-ScheduledTask -TaskName "MySQL Full Backup" -Action $Action -Trigger $Trigger -Principal $Principal

4.2 每周维护任务

bash
# 每周日凌晨1点优化表
0 1 * * 0 /backup/scripts/optimize_tables.sh >> /var/log/mysql_maintenance.log 2>&1

optimize_tables.sh

bash
#!/bin/bash
# optimize_tables.sh - 每周优化表脚本

DB_USER="root"
DB_PASS="your_password"

echo "$(date '+%Y-%m-%d %H:%M:%S') - 开始优化表"

# 优化业务主库表
mysql -u$DB_USER -p$DB_PASS -D ape_volo_admin_main -e "
    OPTIMIZE TABLE sys_user, sys_role, sys_menu, sys_department, sys_job,
                   sys_dict, sys_quartz_job, sys_setting, sys_app_secret,
                   sys_file_record, sys_tenant, sys_token_blacklist,
                   queued_email, email_account, email_message_template;
"

echo "$(date '+%Y-%m-%d %H:%M:%S') - 优化完成"

4.3 每月维护任务

bash
# 每月1号凌晨2点归档日志
0 2 1 * * /backup/scripts/archive_logs.sh >> /var/log/mysql_archive.log 2>&1

# 每月1号凌晨5点清理软删除数据
0 5 1 * * /backup/scripts/clean_soft_deleted.sh >> /var/log/mysql_cleanup.log 2>&1

clean_soft_deleted.sh

bash
#!/bin/bash
# clean_soft_deleted.sh - 清理软删除数据

DB_USER="root"
DB_PASS="your_password"

echo "$(date '+%Y-%m-%d %H:%M:%S') - 开始清理软删除数据"

# 清理90天前的软删除数据
mysql -u$DB_USER -p$DB_PASS -D ape_volo_admin_main -e "
    DELETE FROM sys_user WHERE IsDeleted = 1 AND UpdateTime < DATE_SUB(NOW(), INTERVAL 90 DAY);
    DELETE FROM sys_role WHERE IsDeleted = 1 AND UpdateTime < DATE_SUB(NOW(), INTERVAL 90 DAY);
    DELETE FROM sys_menu WHERE IsDeleted = 1 AND UpdateTime < DATE_SUB(NOW(), INTERVAL 90 DAY);
    DELETE FROM sys_department WHERE IsDeleted = 1 AND UpdateTime < DATE_SUB(NOW(), INTERVAL 90 DAY);
    DELETE FROM sys_job WHERE IsDeleted = 1 AND UpdateTime < DATE_SUB(NOW(), INTERVAL 90 DAY);
"

echo "$(date '+%Y-%m-%d %H:%M:%S') - 清理完成"

4.4 MySQL Event 定时任务

sql
-- 启用事件调度器
SET GLOBAL event_scheduler = ON;

-- 每天清理过期Token黑名单
CREATE EVENT IF NOT EXISTS clean_expired_tokens
ON SCHEDULE EVERY 1 DAY
STARTS CURRENT_TIMESTAMP
DO
    DELETE FROM sys_token_blacklist
    WHERE AccessTokenExpiresTime < NOW();

-- 每周日优化表
CREATE EVENT IF NOT EXISTS weekly_optimize_tables
ON SCHEDULE EVERY 1 WEEK
STARTS (TIMESTAMP(CURRENT_DATE) + INTERVAL 1 WEEK - INTERVAL DAYOFWEEK(CURRENT_DATE) DAY + INTERVAL 1 HOUR)
DO
BEGIN
    OPTIMIZE TABLE sys_user, sys_role, sys_menu;
    ANALYZE TABLE sys_user, sys_role, sys_menu;
END;

-- 查看所有事件
SHOW EVENTS FROM ape_volo_admin_main;

-- 禁用事件
ALTER EVENT clean_expired_tokens DISABLE;

-- 删除事件
DROP EVENT IF EXISTS clean_expired_tokens;

5. 故障排查

5.1 常见问题诊断

问题 1:数据库连接超时

sql
-- 检查当前连接数
SHOW STATUS LIKE 'Threads_connected';
SHOW VARIABLES LIKE 'max_connections';

-- 查看等待锁的会话
SELECT
    r.trx_id AS '等待事务ID',
    r.trx_mysql_thread_id AS '等待线程ID',
    r.trx_query AS '等待SQL',
    b.trx_id AS '阻塞事务ID',
    b.trx_mysql_thread_id AS '阻塞线程ID',
    b.trx_query AS '阻塞SQL'
FROM information_schema.innodb_lock_waits w
INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;

-- 终止阻塞会话
KILL <阻塞线程ID>;

解决方案

  • 增加 max_connections 配置(默认 151,建议 300-500)
  • 检查连接池配置(Max Pool Size
  • 优化慢查询,减少长事务
  • 检查是否有死锁或锁等待

问题 2:磁盘空间不足

sql
-- 查看数据库大小
SELECT
    table_schema AS '数据库',
    ROUND(SUM(data_length + index_length) / 1024 / 1024 / 1024, 2) AS '大小(GB)'
FROM information_schema.tables
GROUP BY table_schema
ORDER BY SUM(data_length + index_length) DESC;

-- 查看binlog大小
SELECT
    ROUND(SUM(file_size) / 1024 / 1024 / 1024, 2) AS 'Binlog总大小(GB)'
FROM information_schema.binary_logs;

-- 清理binlog(谨慎操作,确保已备份)
PURGE BINARY LOGS BEFORE DATE_SUB(NOW(), INTERVAL 7 DAY);

解决方案

  • 清理历史日志分表
  • 清理过期的 binlog
  • 优化表碎片(OPTIMIZE TABLE
  • 归档历史数据到冷存储

问题 3:查询性能下降

sql
-- 检查慢查询
SELECT * FROM mysql.slow_log
ORDER BY start_time DESC
LIMIT 10;

-- 检查表统计信息是否过期
SHOW TABLE STATUS FROM ape_volo_admin_main LIKE 'sys_user';

解决方案

  • 分析慢查询日志,优化查询
  • 更新统计信息(ANALYZE TABLE
  • 检查是否有锁等待
  • 优化查询 SQL(避免函数、OR、隐式转换)

问题 4:主从复制延迟

sql
-- 查看主从复制状态
SHOW SLAVE STATUS\G

-- 关键字段:
-- Seconds_Behind_Master: 延迟秒数(NULL表示未同步)
-- Slave_IO_Running: IO线程状态(Yes表示正常)
-- Slave_SQL_Running: SQL线程状态(Yes表示正常)
-- Last_Error: 最后一个错误信息

解决方案

  • 检查网络延迟
  • 优化从库硬件(SSD、增加内存)
  • 开启并行复制(MySQL 5.7+)
  • 检查主库写入压力,考虑读写分离

5.2 性能调优参数

MySQL 配置优化(my.cnf/my.ini)

ini
[mysqld]
# 基础配置
port=3306
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock

# 字符集
character-set-server=utf8mb4
collation-server=utf8mb4_unicode_ci

# InnoDB 配置
innodb_buffer_pool_size=4G          # 设置为物理内存的50-70%
innodb_log_file_size=256M           # 事务日志大小
innodb_log_buffer_size=16M          # 日志缓冲区
innodb_flush_log_at_trx_commit=2    # 性能优先(1=安全,2=性能,0=高性能)
innodb_flush_method=O_DIRECT        # 避免双重缓冲
innodb_file_per_table=ON            # 独立表空间
innodb_io_capacity=2000             # IOPS能力(SSD建议2000-4000)
innodb_read_io_threads=8            # 读IO线程
innodb_write_io_threads=8           # 写IO线程

# 连接配置
max_connections=500                 # 最大连接数
max_connect_errors=1000             # 最大连接错误次数
connect_timeout=30                  # 连接超时
wait_timeout=28800                  # 等待超时(8小时)
interactive_timeout=28800           # 交互超时

# 查询缓存(MySQL 8.0已移除)
# query_cache_type=1
# query_cache_size=128M

# 临时表
tmp_table_size=64M
max_heap_table_size=64M

# 排序缓冲
sort_buffer_size=2M
join_buffer_size=2M
read_buffer_size=2M
read_rnd_buffer_size=4M

# 慢查询日志
slow_query_log=ON
slow_query_log_file=/var/log/mysql/slow-query.log
long_query_time=2

# Binlog 配置
log-bin=mysql-bin
binlog_format=ROW
expire_logs_days=7
max_binlog_size=100M
sync_binlog=1                       # 安全优先(0=高性能,1=安全)

# 其他
max_allowed_packet=64M              # 最大数据包
table_open_cache=4000               # 表缓存
open_files_limit=65535              # 打开文件限制

性能监控指标目标

指标目标值说明
缓冲池命中率≥ 99%内存命中率,越高越好
QPS< 10000每秒查询数,根据业务调整
TPS< 5000每秒事务数,根据业务调整
慢查询占比< 1%慢查询数/总查询数
平均查询耗时< 100ms越低越好
连接使用率< 80%当前连接数/最大连接数
锁等待率< 0.1%锁等待次数/总查询次数
临时表使用率< 10%使用临时表的查询占比

5.3 应急处理流程

步骤 1:快速定位问题

sql
-- 检查当前运行的查询
SHOW FULL PROCESSLIST;

-- 检查锁等待
SELECT * FROM information_schema.innodb_lock_waits;

-- 检查慢查询
SELECT * FROM mysql.slow_log ORDER BY start_time DESC LIMIT 10;

步骤 2:临时缓解措施

sql
-- 终止慢查询(记录会话ID和SQL用于后续分析)
KILL <会话ID>;

-- 临时禁用某个定时任务(如有问题的Quartz任务)
UPDATE sys_quartz_job SET IsEnable = 0 WHERE Name = 'ProblematicJob';

-- 临时增加连接数(重启后失效)
SET GLOBAL max_connections = 1000;

步骤 3:根因分析

  1. 分析慢查询日志,识别问题 SQL
  2. 查看执行计划(EXPLAIN
  3. 检查表统计信息是否过期
  4. 检查业务代码逻辑(N+1 查询、大事务等)

步骤 4:永久解决

  1. 优化 SQL 查询
  2. 拆分大事务
  3. 调整配置参数
  4. 扩容硬件资源

5.4 监控告警配置

Prometheus + Grafana 监控

yaml
# prometheus.yml
scrape_configs:
  - job_name: "mysql"
    static_configs:
      - targets: ["localhost:9104"] # mysqld_exporter
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        replacement: "ape-volo-mysql"

告警规则示例

yaml
# mysql_alerts.yml
groups:
  - name: mysql
    interval: 30s
    rules:
      - alert: MySQLDown
        expr: mysql_up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "MySQL实例宕机"
          description: "{{ $labels.instance }} MySQL实例已宕机超过1分钟"

      - alert: MySQLSlowQueries
        expr: rate(mysql_global_status_slow_queries[5m]) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "慢查询过多"
          description: "{{ $labels.instance }} 最近5分钟慢查询速率超过10/秒"

      - alert: MySQLConnectionsHigh
        expr: mysql_global_status_threads_connected / mysql_global_variables_max_connections > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "连接数过高"
          description: "{{ $labels.instance }} 连接使用率超过80%"

      - alert: MySQLReplicationLag
        expr: mysql_slave_status_seconds_behind_master > 60
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "主从复制延迟"
          description: "{{ $labels.instance }} 主从延迟超过60秒"

📚 相关文档


🔖 维护检查清单

每日检查

  • [ ] 检查数据库备份是否成功
  • [ ] 检查慢查询日志,优化 TOP 10 慢查询
  • [ ] 检查磁盘空间使用率(< 80%)
  • [ ] 检查连接数使用率(< 80%)
  • [ ] 检查是否有长时间运行的查询(> 30 秒)

每周检查

  • [ ] 优化表碎片(碎片率 > 20%)
  • [ ] 更新表统计信息(ANALYZE TABLE
  • [ ] 检查 binlog 大小,清理过期 binlog
  • [ ] 检查数据库错误日志

每月检查

  • [ ] 归档 6 个月前的日志分表
  • [ ] 清理 90 天前的软删除数据
  • [ ] 检查数据库大小增长趋势
  • [ ] 检查主从复制状态(如有)
  • [ ] 审查并优化定时任务配置

每季度检查

  • [ ] 审查数据库备份恢复流程
  • [ ] 审查数据库安全配置(用户权限、密码强度)
  • [ ] 性能压测与容量规划
  • [ ] 审查并更新数据库文档
  • [ ] 数据库版本升级评估

版权所有 © 2021-2026 ApeVolo-Team