PostgreSQL大数量异步插入性能远低于MySQL的原因分析及CPU利用率优化咨询

阿华AIGC实验室

2026-4-21

嘿，这个场景我之前帮不少开发者排查过，PostgreSQL在单条大数量异步插入下的表现确实容易比MySQL慢，而且CPU上不去的问题也很典型，咱们一步步来拆解原因和优化方案：

一、为啥PostgreSQL性能拉胯还没跑满CPU？

1. 单条插入的事务开销差异

PostgreSQL对单条INSERT的事务处理开销天生比MySQL更高——默认情况下，每一条插入都会单独开启并提交一个事务，频繁的事务启停会带来额外的IO和锁开销。而MySQL的InnoDB引擎在类似场景下有更高效的自动组提交优化，能把多个小事务合并处理。更关键的是，你的客户端代码里每个插入都是单独异步执行，完全没做批量处理，导致PostgreSQL一直在处理零散的小事务，根本没法集中火力利用CPU。

2. 客户端并发模型的适配问题

你用forEach遍历执行异步插入的写法，其实会瞬间抛出大量并发请求，但pg-pool的max设置为10，意味着同时只有10个连接在工作。虽然MySQL的连接池配置也是10，但mysql2驱动的连接调度更激进，加上MySQL本身对小事务的处理更高效，所以能把CPU跑起来。而PostgreSQL这边，单连接处理小事务的开销更大，再加上默认配置的保守性，直接导致CPU处于“等活干”的状态，利用率上不去。

3. PostgreSQL默认配置太保守

PostgreSQL的出厂配置是为通用场景设计的，非常谨慎，不会主动占用过多系统资源：

shared_buffers默认只占系统内存的1/16（你的8G服务器就是512M），缓存不足导致大量数据要刷到磁盘，CPU大部分时间在等IO
synchronous_commit默认是on，每个事务都要等WAL日志写入磁盘才返回，延迟很高，吞吐量上不去
其他参数比如work_mem、max_worker_processes的默认值也没针对批量插入场景做优化

二、怎么让PostgreSQL跑满CPU，提升插入性能？

1. 先改客户端代码：批量插入替代单条插入

这是最立竿见影的优化！把多条INSERT合并成一个批量语句，一次插几十上百条，大幅减少事务开销，让PostgreSQL能集中处理数据。

修改你的插入函数为批量版本：

async function insert_account_day_setting_batch(insertDataArray){
  // 生成批量插入的占位符，比如($1,$2,...$10), ($11,$12,...$20)...
  const valuesStr = insertDataArray.map((_, idx) => {
    const start = idx * 10 + 1;
    return `(${Array.from({length:10}, (_, i) => `$${start + i}`).join(',')})`;
  }).join(',');
  const allData = insertDataArray.flat();
  return query(`INSERT INTO user_account_day_setting (column1, column2, column3, column4, column5, column6, column7, column8, column9, column10) VALUES ${valuesStr}`, allData);
}

然后调整测试函数，改成分批处理，同时控制并发数（避免一下子压垮数据库）：

async function testing_insert(){
  await pool.query('TRUNCATE TABLE user_account_day_setting')
  console.log('truncate table before insert done')
  const rows = await query(define_sql.select_user_data);
  const hourlyArray = utils.createDayHourArray();
  const insert_total_count = rows.length * hourlyArray.length;
  const t1 = new Date().getTime();

  // 先把所有要插入的数据整理好
  const allInsertData = [];
  rows.forEach((row) => {
    const insertArray = organize_account_day_setting_data(row, hourlyArray);
    allInsertData.push(...insertArray);
  });

  // 分成每50条一批（可以根据服务器性能调整）
  const batchSize = 50;
  const batches = [];
  for(let i=0; i<allInsertData.length; i+=batchSize){
    batches.push(allInsertData.slice(i, i+batchSize));
  }

  // 控制并发数，比如同时处理5个批次
  const concurrency = 5;
  let finish_update_count = 0;
  for(let i=0; i<batches.length; i+=concurrency){
    const currentBatches = batches.slice(i, i+concurrency);
    await Promise.all(currentBatches.map(batch => {
      finish_update_count += batch.length;
      return insert_account_day_setting_batch(batch);
    }));
    console.log(`已完成 ${finish_update_count}/${insert_total_count} 条插入`);
  }

  const t2 = new Date().getTime();
  console.log(`${insert_total_count} inserts and all insert finished after ${(t2-t1)/1000} secs`);
}

2. 调整PostgreSQL服务器配置

修改postgresql.conf文件（路径一般是/var/lib/postgresql/<版本>/main/postgresql.conf），针对你的8核8G服务器调整以下参数：

shared_buffers = 2GB：设置为系统内存的1/4，提升数据缓存能力，减少磁盘IO
work_mem = 32MB：增大排序、哈希操作的临时内存，如果表有索引，索引更新会更高效
maintenance_work_mem = 512MB：加快索引重建等维护操作的速度
synchronous_commit = off：临时关闭同步提交，事务不用等WAL写入磁盘就返回，大幅提升吞吐量（注意：服务器宕机可能丢失最后几个事务，适合可重跑的批量插入场景）
wal_buffers = 64MB：增大WAL缓冲区，减少写入频率
checkpoint_completion_target = 0.9：让检查点更平滑，避免IO峰值
max_worker_processes = 8：保持和CPU核数一致，允许更多后台进程处理任务

修改后重启PostgreSQL服务生效：sudo systemctl restart postgresql