How to automatically restore a daily RDS (MySQL) backup to another RDS instance in AWS
Hey there! You’re absolutely right that this can be fully automated using AWS’s native services—let me walk you through the most straightforward, no-external-tools approach:
Core Services to Leverage
- Amazon EventBridge: Triggers the automation on a daily schedule (or as soon as a new snapshot is ready)
- AWS Lambda: Runs the actual snapshot lookup and restore logic via AWS APIs
- AWS IAM: Grants the necessary permissions for Lambda to interact with your RDS resources
Step-by-Step Implementation
1. Set Up an IAM Role for Lambda
First, create an IAM role that Lambda can assume to perform RDS actions. Attach a policy like this (customize the resource ARNs to match your source/target instances and region):
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "rds:DescribeDBSnapshots",
        "rds:DescribeDBInstances",
        "rds:RestoreDBInstanceFromDBSnapshot",
        "rds:DeleteDBInstance"
      ],
      "Resource": [
        "arn:aws:rds:your-region:your-account-id:db:source-db-name",
        "arn:aws:rds:your-region:your-account-id:snapshot:*",
        "arn:aws:rds:your-region:your-account-id:db:target-db-name"
      ]
    },
    {
      "Effect": "Allow",
      "Action": "logs:CreateLogGroup",
      "Resource": "arn:aws:logs:your-region:your-account-id:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:your-region:your-account-id:log-group:/aws/lambda/your-lambda-name:*"
    }
  ]
}
```
Two notes on this policy: `rds:DeleteDBInstance` is optional and only needed if you replace an existing target instance, and `rds:DescribeDBInstances` is there because the function below polls the target instance while it is being deleted. IAM policy documents are strict JSON, so keep comments out of the policy itself.
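The step above says Lambda must be able to assume the role, which also needs a trust policy in addition to the permissions policy. Here is a minimal sketch of how that could look with boto3. The role name, the inline policy name `rds-restore-permissions`, and the helper names are placeholders I made up for illustration; `iam_client` is assumed to be `boto3.client("iam")`.

```python
import json


def lambda_trust_policy():
    """Trust policy that lets the Lambda service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_restore_role(iam_client, role_name, permissions_policy):
    """Create the role and attach the permissions policy inline.

    iam_client is assumed to be boto3.client("iam"); role_name and the
    inline policy name are hypothetical placeholders.
    """
    role = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(lambda_trust_policy()),
    )
    iam_client.put_role_policy(
        RoleName=role_name,
        PolicyName="rds-restore-permissions",
        PolicyDocument=json.dumps(permissions_policy),
    )
    return role["Role"]["Arn"]
```

You would pass the policy document shown above (loaded as a dict) as `permissions_policy` and use the returned ARN as the function's execution role.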
2. Build the Lambda Function
Write a Lambda function (Python is a popular choice here) that handles the snapshot lookup and restore. Here’s a simplified, working snippet to get you started:
```python
import time

import boto3

rds = boto3.client('rds')

# Replace these with your actual resource details
SOURCE_DB_ID = "your-source-rds-instance"
TARGET_DB_ID = "your-target-rds-instance"
TARGET_INSTANCE_CLASS = "db.t3.medium"  # Match your app's resource needs
TARGET_SECURITY_GROUPS = ["sg-xxxxxxxx"]  # Your target VPC security groups


def lambda_handler(event, context):
    # Fetch all automated snapshots from the source instance
    snapshots = rds.describe_db_snapshots(
        DBInstanceIdentifier=SOURCE_DB_ID,
        SnapshotType='automated'
    )['DBSnapshots']

    # Only completed snapshots can be restored; fail clearly if there are none
    available = [s for s in snapshots if s['Status'] == 'available']
    if not available:
        raise RuntimeError(f"No available automated snapshots for {SOURCE_DB_ID}")

    # Grab the newest snapshot by creation time
    latest_snapshot = max(available, key=lambda s: s['SnapshotCreateTime'])
    snapshot_id = latest_snapshot['DBSnapshotIdentifier']
    print(f"Found latest automated snapshot: {snapshot_id}")

    # Optional: Clean up existing target instance (skip if you want to retain old instances)
    try:
        rds.delete_db_instance(
            DBInstanceIdentifier=TARGET_DB_ID,
            SkipFinalSnapshot=True
        )
        # Wait for deletion to finish before restoring.
        # Deletion can take several minutes and Lambda stops after at most
        # 15 minutes, so set the function timeout generously.
        while True:
            try:
                rds.describe_db_instances(DBInstanceIdentifier=TARGET_DB_ID)
                time.sleep(30)
            except rds.exceptions.DBInstanceNotFoundFault:
                print("Old target instance deleted, proceeding to restore")
                break
    except rds.exceptions.DBInstanceNotFoundFault:
        print("No existing target instance found, starting restore")

    # Restore the snapshot to the target instance configuration
    rds.restore_db_instance_from_db_snapshot(
        DBInstanceIdentifier=TARGET_DB_ID,
        DBSnapshotIdentifier=snapshot_id,
        DBInstanceClass=TARGET_INSTANCE_CLASS,
        VpcSecurityGroupIds=TARGET_SECURITY_GROUPS,
        PubliclyAccessible=False,  # Adjust based on your security requirements
        MultiAZ=False  # Toggle if you need high availability for the target
    )
    print(f"Restore initiated successfully for target instance: {TARGET_DB_ID}")
    return {"status": "success", "snapshot_used": snapshot_id}
```
3. Schedule the Automation with EventBridge
Create an EventBridge rule to trigger your Lambda function on your desired schedule:
- Head to EventBridge > Rules > Create rule
- Choose "Schedule" as the rule type, then set a cron expression (e.g., `cron(0 2 * * ? *)` for daily 2 AM UTC)
- Select your Lambda function as the target
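If you would rather script the schedule than click through the console, the same rule can be sketched with boto3. This is only an outline under some assumptions: the rule name `daily-rds-restore`, the target ID, and the helper name are invented here, and the two clients are expected to be `boto3.client("events")` and `boto3.client("lambda")`. Unlike the console, the API does not grant EventBridge permission to invoke the function for you, so the sketch adds it explicitly.

```python
def schedule_daily_restore(events_client, lambda_client, function_arn,
                           rule_name="daily-rds-restore",
                           schedule="cron(0 2 * * ? *)"):
    """Create a scheduled rule, let it invoke the function, and attach the target.

    rule_name and the target Id are hypothetical placeholders.
    """
    rule_arn = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule,  # daily at 02:00 UTC by default
        State="ENABLED",
    )["RuleArn"]

    # EventBridge needs explicit permission to invoke the function.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "restore-lambda", "Arn": function_arn}],
    )
    return rule_arn
```

Running it a second time with the same rule name would fail at `add_permission` because the statement ID already exists, so treat it as a one-time setup helper.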
Alternatively, you can trigger the rule the moment a new automated snapshot is created:
- Choose "Event pattern" instead of schedule
- Select "AWS services" as the source, pick "RDS" as the service, then "DB Snapshot Event" as the event type
- Add filters for the `automated` snapshot type and your source DB instance ID
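For the event-driven variant, a rough sketch of the pattern and a matching check inside the handler might look like the following. Please treat the event details as assumptions to verify against a real event from your account: I am assuming the `detail` carries an `EventID` of `RDS-EVENT-0091` for a finished automated snapshot and a `SourceIdentifier` of the form `rds:<instance-id>-YYYY-MM-DD-hh-mm`. The function names are made up for this example.

```python
import re

# Assumed event ID for "automated DB snapshot created"; confirm it in the RDS event list.
AUTOMATED_SNAPSHOT_CREATED = "RDS-EVENT-0091"


def event_pattern():
    """Event pattern for the EventBridge rule (field names are assumptions)."""
    return {
        "source": ["aws.rds"],
        "detail-type": ["RDS DB Snapshot Event"],
        "detail": {"EventID": [AUTOMATED_SNAPSHOT_CREATED]},
    }


def snapshot_for_source(event, source_db_id):
    """Return the snapshot ID if the event is a finished automated snapshot
    of source_db_id, otherwise None."""
    detail = event.get("detail", {})
    if detail.get("EventID") != AUTOMATED_SNAPSHOT_CREATED:
        return None
    snapshot_id = detail.get("SourceIdentifier", "")
    # Match the full timestamp suffix so "prod-db" does not also match "prod-db-replica".
    pattern = rf"rds:{re.escape(source_db_id)}-\d{{4}}-\d{{2}}-\d{{2}}-\d{{2}}-\d{{2}}"
    return snapshot_id if re.fullmatch(pattern, snapshot_id) else None
```

With this in place the handler can restore exactly the snapshot named in the event instead of looking up the newest one, and simply return early when the function yields `None`.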
4. Test and Monitor
- Run the Lambda function manually first to validate it works as expected
- Check CloudWatch Logs for Lambda to debug any errors during execution
- Set up CloudWatch Alarms to alert you if the restore process fails (e.g., if Lambda throws an unhandled exception)
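As one way to wire up the alarm mentioned above, here is a small sketch that alerts on the function's `Errors` metric. The alarm name, the 5-minute period, and the SNS topic are choices I picked for illustration, and `cloudwatch_client` is assumed to be `boto3.client("cloudwatch")`.

```python
def create_failure_alarm(cloudwatch_client, function_name, sns_topic_arn):
    """Alarm when the Lambda function reports at least one error in 5 minutes.

    The alarm name and thresholds are illustrative, not prescriptive.
    """
    alarm_name = f"{function_name}-errors"
    cloudwatch_client.put_metric_alarm(
        AlarmName=alarm_name,
        Namespace="AWS/Lambda",
        MetricName="Errors",
        Dimensions=[{"Name": "FunctionName", "Value": function_name}],
        Statistic="Sum",
        Period=300,
        EvaluationPeriods=1,
        Threshold=1,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        # No invocations means no data points; do not treat that as a failure.
        TreatMissingData="notBreaching",
        AlarmActions=[sns_topic_arn],
    )
    return alarm_name
```

Keep in mind this only catches failures of the Lambda invocation itself (exceptions and timeouts). A restore that is accepted by the API but fails later on the RDS side would not trip it.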
Alternative: AWS Step Functions (For Complex Workflows)
If you need more control—like waiting for the target instance to become fully available, sending status notifications via SNS, or handling retries—use Step Functions to orchestrate the workflow. You can define states for:
- Fetching the latest snapshot
- Deleting the old target instance
- Restoring the snapshot
- Waiting for the target instance to reach `available` status
- Sending success/failure alerts to your team
This makes the workflow more transparent and easier to modify as your needs change.
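To give a feel for it, the states listed above could be laid out roughly like this as an Amazon States Language definition built in Python. It is a sketch under several assumptions: the four Lambda ARNs refer to hypothetical functions that each do one step and pass their input through, the status-check function is assumed to return a `status` field, and a similar wait loop after the delete step is left out for brevity.

```python
import json


def _task(resource, next_state):
    """Lambda task with a simple retry and a catch-all route to the failure alert."""
    return {
        "Type": "Task",
        "Resource": resource,
        "Retry": [{"ErrorEquals": ["States.TaskFailed"], "IntervalSeconds": 30, "MaxAttempts": 2}],
        "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "NotifyFailure"}],
        "Next": next_state,
    }


def _notify(topic_arn, message, **transition):
    """Publish a message to SNS via the Step Functions SNS integration."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::sns:publish",
        "Parameters": {"TopicArn": topic_arn, "Message": message},
        **transition,
    }


def restore_state_machine(fetch_arn, delete_arn, restore_arn, status_arn, topic_arn):
    """Build the workflow definition; all ARNs are placeholders supplied by the caller."""
    return {
        "Comment": "Daily RDS snapshot restore (illustrative sketch)",
        "StartAt": "FetchLatestSnapshot",
        "States": {
            "FetchLatestSnapshot": _task(fetch_arn, "DeleteOldTarget"),
            "DeleteOldTarget": _task(delete_arn, "RestoreSnapshot"),
            "RestoreSnapshot": _task(restore_arn, "WaitForTarget"),
            "WaitForTarget": {"Type": "Wait", "Seconds": 60, "Next": "CheckTargetStatus"},
            "CheckTargetStatus": _task(status_arn, "IsTargetAvailable"),
            "IsTargetAvailable": {
                "Type": "Choice",
                "Choices": [
                    {"Variable": "$.status", "StringEquals": "available", "Next": "NotifySuccess"}
                ],
                "Default": "WaitForTarget",  # not ready yet: wait and check again
            },
            "NotifySuccess": _notify(topic_arn, "RDS restore finished", End=True),
            "NotifyFailure": _notify(topic_arn, "RDS restore failed", Next="RestoreFailed"),
            "RestoreFailed": {"Type": "Fail", "Error": "RestoreFailed"},
        },
    }
```

`json.dumps(restore_state_machine(...))` gives you the definition string to paste into the console or pass to the Step Functions API. Because the waiting happens in a `Wait` state rather than inside a function, the 15-minute Lambda limit no longer constrains how long the restore may take.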
This question comes from Stack Exchange; asked by Ross.




