How can I load CSV data from S3 into AWS RDS MySQL with something like Redshift's COPY command?

阿华AIGC实验室

2026-5-21

Hey there! Great question. Unlike Redshift, which has the COPY command built in, RDS MySQL (non-Aurora) doesn't have an exact equivalent out of the box, but there are two approaches worth knowing for getting CSV data from S3 into your tables. Let's break them down:

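Approach 1: Use LOAD DATA FROM S3 (Aurora MySQL)

This is the closest you'll get to Redshift's COPY command, since it pulls data directly from S3 without needing to download files to an intermediate server. One caveat up front: as far as I know, LOAD DATA FROM S3 is an Aurora MySQL statement and standard RDS MySQL does not support it, so if you're on plain RDS MySQL, skip to Approach 2. Here's how to set it up:

Check your engine and version: Run SELECT VERSION(); and SELECT @@aurora_version; in your MySQL client. The second query should only succeed on Aurora MySQL; on standard RDS MySQL it fails with an unknown system variable error.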
Attach an IAM role to your RDS instance:
1. Create an IAM role with permissions to read from your target S3 bucket. Use a minimal policy like this (restrict to your specific bucket/prefix for better security):
```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}
```
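2. In the RDS Console, open your instance (or cluster), find the Manage IAM roles section (on the Connectivity & security tab in current consoles), and add the role you just created. On Aurora MySQL you also need to point the cluster parameter aws_default_s3_role (or aurora_load_from_s3_role) at the role's ARN and grant the database user permission to load from S3; check the Aurora documentation for the exact grant for your version.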
Prepare your target table: Ensure the table schema matches your CSV's column order, data types, and constraints (like primary keys).
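
To compare the CSV's column order against the table, it can help to print the header one column per line. This is a minimal sketch, assuming you have a local copy (or a small sample) of the file and that no header name contains a comma; csv_header_columns is a hypothetical helper name, not part of any AWS or MySQL tooling.

```shell
# Hypothetical helper: print the CSV header, one column name per line.
# Assumptions: the first line is the header and no column name contains a comma.
csv_header_columns() {
  # Drop carriage returns and double quotes, then split on commas.
  head -n 1 "$1" | tr -d '\r"' | tr ',' '\n'
}
```

Run it as csv_header_columns ./sample.csv and compare the result with the output of SHOW COLUMNS FROM your_target_table; in your MySQL client.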

Run the load command:

LOAD DATA FROM S3 's3://your-bucket-name/path/to/your/data.csv'
INTO TABLE your_target_table
FIELDS TERMINATED BY ',' 
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS; -- Include this if your CSV has a header row

Adjust the FIELDS and LINES parameters to match your CSV's formatting (e.g., use '\r\n' for Windows-style line breaks).
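
If you're not sure which line terminator the file uses, a quick check on a local copy before writing the LINES clause might look like this. It's only a sketch: csv_line_ending is a hypothetical helper name, and it simply checks whether any line ends with a carriage return.

```shell
# Hypothetical helper: print the LINES TERMINATED BY value a CSV file likely needs.
# Assumption: a carriage return at the end of any line means Windows-style (\r\n) endings.
csv_line_ending() {
  cr=$(printf '\r')
  if LC_ALL=C grep -q "${cr}\$" "$1"; then
    printf '%s\n' '\r\n'
  else
    printf '%s\n' '\n'
  fi
}
```

For example, csv_line_ending ./sample.csv prints the literal text \r\n or \n, which you can paste between the quotes of LINES TERMINATED BY.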

Approach 2: Use LOAD DATA LOCAL INFILE + AWS CLI

This method works on any RDS MySQL version: first download the S3 file to a machine that can reach the database (your workstation or an EC2 instance), then load it into MySQL from there:

Enable local_infile on your RDS instance:
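1. Go to the RDS Console, open the parameter group attached to your instance, and set the local_infile parameter to 1. Default parameter groups can't be edited, so create a custom parameter group and attach it to the instance if you're still on the default one.
2. Reboot your RDS instance if the parameter's apply type is shown as "static" in the parameter group.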
Download the CSV from S3: Use the AWS CLI to pull the file to your local machine:
```
aws s3 cp s3://your-bucket-name/path/to/your/data.csv ./local-data.csv
```
Connect to MySQL with local file support: When launching your MySQL client, include the --local-infile=1 flag to allow local file loading:
```
mysql -h your-rds-endpoint.example.com -u your-username -p --local-infile=1
```

Run the local load command:

LOAD DATA LOCAL INFILE '/path/to/local-data.csv'
INTO TABLE your_target_table
FIELDS TERMINATED BY ',' 
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

Note: Ensure your local machine can reach your RDS instance (check security group rules to allow inbound traffic on port 3306 from your IP).
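
If you repeat this load often, the statement can be generated from a small script instead of typed by hand. The sketch below only builds the SQL text shown above; build_load_sql is a hypothetical helper name, and it assumes the file path and table name come from a trusted source, since they are inserted into the statement without escaping.

```shell
# Hypothetical helper: print a LOAD DATA LOCAL INFILE statement for a file and table.
# Assumptions: $1 (file path) and $2 (table name) are trusted values, the CSV is
# comma-separated with a header row, and lines end with \n.
build_load_sql() {
  cat <<EOF
LOAD DATA LOCAL INFILE '$1'
INTO TABLE $2
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
EOF
}
```

One way to use it is to pipe the output into the client from the earlier step, for example build_load_sql ./local-data.csv your_target_table | mysql -h your-rds-endpoint.example.com -u your-username -p --local-infile=1 your_database, where your_database is a placeholder for your schema name.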

Pro Tips

Test first: Load a small sample of your CSV into a temporary table to validate formatting and schema matches before loading the full dataset.
Optimize performance: For large datasets, disable autocommit (SET autocommit = 0;) before loading and run COMMIT; once the load completes. If your instance class has memory to spare, also consider increasing innodb_buffer_pool_size in your parameter group.
Security best practices: For Approach 1, restrict the IAM role's S3 permissions to only the necessary bucket/prefix. For Approach 2, only use local_infile in trusted environments, as it can pose security risks if misused.
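
For the "Test first" tip, one simple way to cut a sample from a local copy of the CSV is sketched below. make_csv_sample is a hypothetical helper name, and the sketch assumes no quoted field contains an embedded newline, so that one line equals one row.

```shell
# Hypothetical helper: copy the header plus the first N data rows into a sample file.
# Assumption: no quoted field contains an embedded newline (one line == one row).
make_csv_sample() {
  src=$1
  dest=$2
  rows=${3:-100}   # number of data rows to keep, default 100
  head -n "$((rows + 1))" "$src" > "$dest"
}
```

For example, make_csv_sample ./local-data.csv ./sample.csv 100 writes the header and 100 rows to sample.csv, which you can then load into a scratch table with the same LOAD DATA statement.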

This question comes from Stack Exchange; it was asked by Arnold.