How to Automatically Restore Daily RDS (MySQL) Backups to Another RDS Instance in AWS

阿华AIGC实验室

2026-5-13

Automated RDS Snapshot Restore to Another Instance (AWS Native)

Hey there! You’re absolutely right that this can be fully automated using AWS’s native services—let me walk you through the most straightforward, no-external-tools approach:

Core Services to Leverage

Amazon EventBridge: Triggers the automation on a daily schedule (or the moment a new snapshot is ready)
AWS Lambda: Runs the actual snapshot lookup and restore logic via AWS APIs
AWS IAM: Grants the necessary permissions for Lambda to interact with your RDS resources

Step-by-Step Implementation

1. Set Up an IAM Role for Lambda

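First, create an IAM role that Lambda can assume to perform RDS actions. Attach a policy like this (customize the resource ARNs to match your source/target instances and region). rds:DeleteDBInstance is optional and only needed if you replace an existing target instance; rds:DescribeDBInstances is needed because the function checks the target instance while waiting for its deletion. Depending on your setup, the restore may also need access to the subnet group, option group, and parameter group used by the target, so add those ARNs if the call is denied:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "rds:DescribeDBSnapshots",
                "rds:DescribeDBInstances",
                "rds:RestoreDBInstanceFromDBSnapshot",
                "rds:DeleteDBInstance"
            ],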
            "Resource": [
                "arn:aws:rds:your-region:your-account-id:db:source-db-name",
                "arn:aws:rds:your-region:your-account-id:snapshot:*",
                "arn:aws:rds:your-region:your-account-id:db:target-db-name"
            ]
        },
        {
            "Effect": "Allow",
            "Action": "logs:CreateLogGroup",
            "Resource": "arn:aws:logs:your-region:your-account-id:*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "arn:aws:logs:your-region:your-account-id:log-group:/aws/lambda/your-lambda-name:*"
        }
    ]
}
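
For Lambda to assume the role at all, the role also needs a trust policy that names the Lambda service. The following is a minimal sketch of that step, assuming boto3 is installed and credentials are configured; the role name `rds-restore-lambda-role` and both function names are hypothetical placeholders, not part of the original setup.

```python
import json


def build_lambda_trust_policy():
    """Trust policy that lets the Lambda service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_lambda_role(role_name="rds-restore-lambda-role"):
    """Create the role and return its ARN (role_name is a hypothetical placeholder)."""
    import boto3  # imported lazily so the policy builder works without boto3

    iam = boto3.client("iam")
    response = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(build_lambda_trust_policy()),
    )
    return response["Role"]["Arn"]


if __name__ == "__main__":
    # Print the trust policy so it can be pasted into the console instead
    print(json.dumps(build_lambda_trust_policy(), indent=4))
```

The permissions policy above is then attached to this role, either as an inline policy or as a managed policy.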

2. Build the Lambda Function

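Write a Lambda function (Python is a popular choice here) that handles the snapshot lookup and restore. Here’s a simplified snippet to get you started. Raise the function timeout well above the 3-second default (the maximum is 15 minutes), because waiting for the old target instance to be deleted can take several minutes: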

import boto3

rds = boto3.client('rds')

# Replace these with your actual resource details
SOURCE_DB_ID = "your-source-rds-instance"
TARGET_DB_ID = "your-target-rds-instance"
TARGET_INSTANCE_CLASS = "db.t3.medium"  # Match your app's resource needs
TARGET_SECURITY_GROUPS = ["sg-xxxxxxxx"]  # Your target VPC security groups

def lambda_handler(event, context):
    # Fetch all automated snapshots from the source instance (results are paginated)
    paginator = rds.get_paginator('describe_db_snapshots')
    snapshots = []
    for page in paginator.paginate(
        DBInstanceIdentifier=SOURCE_DB_ID,
        SnapshotType='automated'
    ):
        snapshots.extend(page['DBSnapshots'])

    # Only completed snapshots can be restored; skip any that are still being created
    available = [s for s in snapshots if s.get('Status') == 'available']
    if not available:
        raise RuntimeError(f"No available automated snapshot found for {SOURCE_DB_ID}")

    # Grab the newest snapshot by creation time
    latest_snapshot = max(available, key=lambda s: s['SnapshotCreateTime'])
    snapshot_id = latest_snapshot['DBSnapshotIdentifier']
    print(f"Found latest automated snapshot: {snapshot_id}")

    # Optional: Clean up existing target instance (skip if you want to retain old instances)
    try:
        rds.delete_db_instance(
            DBInstanceIdentifier=TARGET_DB_ID,
            SkipFinalSnapshot=True
        )
        # Wait for deletion to finish before restoring. 30 s x 25 attempts stays
        # under Lambda's 15-minute limit; the waiter raises WaiterError if it runs out.
        rds.get_waiter('db_instance_deleted').wait(
            DBInstanceIdentifier=TARGET_DB_ID,
            WaiterConfig={'Delay': 30, 'MaxAttempts': 25}
        )
        print("Old target instance deleted, proceeding to restore")
    except rds.exceptions.DBInstanceNotFoundFault:
        print("No existing target instance found, starting restore")

    # Restore the snapshot to the target instance configuration.
    # Add DBSubnetGroupName if the target must live in a specific VPC.
    rds.restore_db_instance_from_db_snapshot(
        DBInstanceIdentifier=TARGET_DB_ID,
        DBSnapshotIdentifier=snapshot_id,
        DBInstanceClass=TARGET_INSTANCE_CLASS,
        VpcSecurityGroupIds=TARGET_SECURITY_GROUPS,
        PubliclyAccessible=False,  # Adjust based on your security requirements
        MultiAZ=False  # Toggle if you need high availability for the target
    )

    # The call returns as soon as the restore starts; the instance is not available yet
    print(f"Restore initiated successfully for target instance: {TARGET_DB_ID}")
    return {"status": "success", "snapshot_used": snapshot_id}
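
The snapshot selection step is the part most worth checking offline, since it decides which data ends up in the target. Below is a small sketch that isolates it as a pure function; the helper name `pick_latest_available` and the sample records are hypothetical and hand-written, not real API output. It deliberately ignores snapshots that are not yet in the `available` state, so an in-progress snapshot is never chosen.

```python
from datetime import datetime, timezone


def pick_latest_available(snapshots):
    """Return the newest snapshot whose Status is 'available', or None if there is none."""
    available = [s for s in snapshots if s.get("Status") == "available"]
    if not available:
        return None
    return max(available, key=lambda s: s["SnapshotCreateTime"])


# Hand-written sample records shaped like describe_db_snapshots entries
sample = [
    {
        "DBSnapshotIdentifier": "rds:demo-db-2026-05-11-02-05",
        "Status": "available",
        "SnapshotCreateTime": datetime(2026, 5, 11, 2, 5, tzinfo=timezone.utc),
    },
    {
        "DBSnapshotIdentifier": "rds:demo-db-2026-05-12-02-05",
        "Status": "available",
        "SnapshotCreateTime": datetime(2026, 5, 12, 2, 5, tzinfo=timezone.utc),
    },
    {
        # Still being created, so it must be skipped
        "DBSnapshotIdentifier": "rds:demo-db-2026-05-13-02-05",
        "Status": "creating",
    },
]

latest = pick_latest_available(sample)
print(latest["DBSnapshotIdentifier"])  # prints rds:demo-db-2026-05-12-02-05
```

Keeping this logic separate from the boto3 calls makes it easy to test without touching any AWS resources.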

3. Schedule the Automation with EventBridge

Create an EventBridge rule to trigger your Lambda function on your desired schedule:

Head to EventBridge > Rules > Create rule
Choose "Schedule" as the rule type, then set a cron expression (e.g., cron(0 2 * * ? *) for daily 2 AM UTC)
Select your Lambda function as the target
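
If you would rather script the schedule than click through the console, the same three steps can be sketched with boto3. This is one possible approach, not the only one: the function name `schedule_daily_restore`, the rule name `daily-rds-restore`, and the target ID are hypothetical, and the clients are passed in so the function can be exercised with stubs.

```python
def schedule_daily_restore(events_client, lambda_client, function_arn,
                           rule_name="daily-rds-restore",
                           schedule="cron(0 2 * * ? *)"):
    """Create a scheduled rule, let it invoke the function, and attach the target."""
    # 1. Create (or update) the scheduled rule
    rule_arn = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule,
        State="ENABLED",
    )["RuleArn"]

    # 2. EventBridge needs explicit permission to invoke the function
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    # 3. Point the rule at the Lambda function
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "restore-lambda", "Arn": function_arn}],
    )
    return rule_arn


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   schedule_daily_restore(
#       boto3.client("events"),
#       boto3.client("lambda"),
#       "arn:aws:lambda:your-region:your-account-id:function:your-lambda-name",
#   )
```

The console adds the invoke permission for you when you pick the Lambda target there; when scripting, the `add_permission` call has to be made explicitly, and it fails if a statement with the same ID already exists.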

Alternatively, you can trigger the rule the moment a new automated snapshot is created:

Choose "Event pattern" instead of schedule
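Select "AWS services" as the source, pick "RDS" as the service, then "RDS DB Snapshot Event" as the event type
Add filters so that only automated snapshot creation events for your source DB instance match. Snapshot events identify the snapshot rather than the instance, so filter on the snapshot identifier, which for automated snapshots starts with rds:your-source-instance-id-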
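
As a rough illustration of those filters, the event pattern could look like the one built below. Treat the details as assumptions to confirm against the RDS event reference: the event ID `RDS-EVENT-0091` (automated snapshot created) and the `rds:<instance-id>-` prefix on `SourceIdentifier` are what I would expect, and the helper name `build_snapshot_event_pattern` is hypothetical.

```python
import json


def build_snapshot_event_pattern(source_db_id):
    """Event pattern for 'automated snapshot created' events of one DB instance."""
    return {
        "source": ["aws.rds"],
        "detail-type": ["RDS DB Snapshot Event"],
        "detail": {
            "SourceType": ["SNAPSHOT"],
            # Assumed ID for "automated DB snapshot created"; verify before relying on it
            "EventID": ["RDS-EVENT-0091"],
            # Automated snapshot IDs look like rds:<instance-id>-<timestamp>
            "SourceIdentifier": [{"prefix": f"rds:{source_db_id}-"}],
        },
    }


if __name__ == "__main__":
    # Paste the printed JSON into the rule's event pattern editor
    print(json.dumps(build_snapshot_event_pattern("your-source-rds-instance"), indent=4))
```

Note that a prefix match on `rds:mydb-` would also match snapshots of an instance named `mydb-2`, so pick instance names with that in mind or add a check inside the Lambda function.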

4. Test and Monitor

Run the Lambda function manually first to validate it works as expected
Check CloudWatch Logs for Lambda to debug any errors during execution
Set up CloudWatch Alarms to alert you if the restore process fails (e.g., if Lambda throws an unhandled exception)
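
For the alarm, one option is to watch the function's built-in `Errors` metric. The sketch below only builds the arguments for CloudWatch's `put_metric_alarm`; the helper name `build_error_alarm`, the alarm name, and the SNS topic ARN you pass in are hypothetical choices, and the thresholds are just a starting point.

```python
def build_error_alarm(function_name, sns_topic_arn):
    """Arguments for an alarm that fires when the function reports any error."""
    return {
        "AlarmName": f"{function_name}-errors",
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 300,              # evaluate in 5-minute windows
        "EvaluationPeriods": 1,
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no invocations means no alarm
        "AlarmActions": [sns_topic_arn],
    }


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(
#       **build_error_alarm("your-lambda-name", "arn:aws:sns:your-region:your-account-id:your-topic")
#   )
```

This only catches failures of the Lambda invocation itself. Because the restore call returns as soon as the restore starts, a restore that fails later will not show up in this metric and needs a separate check on the target instance status.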

Alternative: AWS Step Functions (For Complex Workflows)

If you need more control—like waiting for the target instance to become fully available, sending status notifications via SNS, or handling retries—use Step Functions to orchestrate the workflow. You can define states for: