How do I build a CI/CD pipeline with a GitHub source and S3 deployment using Python (Boto3)? Should I choose Boto3 or CDK?
Hey there! Let's break down your questions step by step—first, choosing between Boto3 and AWS CDK, then diving into working examples for both tools.
Tool Choice: AWS CDK vs Boto3
First off, AWS CDK is almost always the better choice for building and maintaining a CI/CD pipeline like the one you described. Here's why:
- CDK lets you define your infrastructure in Python (your preferred language) using a declarative, human-readable syntax. It automatically handles dependencies between resources (like ensuring IAM roles have the right permissions) and generates CloudFormation templates under the hood, so you don't have to manually stitch together every API call.
- Boto3 is a low-level SDK that requires you to manually configure every single detail of your pipeline—from IAM roles to stage actions to artifact storage. The official docs are sparse on complete examples, which makes it easy to miss critical configurations (like correct permissions for GitHub or S3 access).
That said, Boto3 can be useful for one-off scripts or simple pipeline modifications, but for building a full GitHub-to-S3 pipeline from scratch, CDK will save you hours of troubleshooting and make your pipeline easier to maintain long-term.
AWS CDK Example: GitHub → S3 Pipeline
Let's walk through a complete, working CDK example. First, make sure you have the CDK CLI installed, then set up your project:
- Initialize a new CDK project:
    mkdir github-s3-pipeline && cd github-s3-pipeline
    cdk init app --language python
    source .venv/bin/activate
    # CDK v2 ships the CodePipeline, IAM, and S3 modules inside aws-cdk-lib,
    # which the generated requirements.txt already lists (along with constructs)
    pip install -r requirements.txt
- Replace the code in github_s3_pipeline_stack.py with this:
    from aws_cdk import (
        RemovalPolicy,
        SecretValue,
        Stack,
        aws_codepipeline as codepipeline,
        aws_codepipeline_actions as actions,
        aws_s3 as s3,
    )
    from constructs import Construct


    class GithubS3PipelineStack(Stack):
        def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
            super().__init__(scope, construct_id, **kwargs)

            # 1. Create or reference your deployment S3 bucket
            # Replace "your-unique-deployment-bucket" with a unique bucket name
            deploy_bucket = s3.Bucket(
                self, "DeploymentBucket",
                bucket_name="your-unique-deployment-bucket",
                versioned=True,
                # Retain the bucket (and its contents) when the stack is deleted
                removal_policy=RemovalPolicy.RETAIN,
            )

            # 2. Create the CodePipeline instance
            pipeline = codepipeline.Pipeline(
                self, "GithubToS3Pipeline",
                pipeline_name="GitHub-To-S3-Deployment-Pipeline",
            )

            # 3. Add the Source Stage (GitHub)
            source_artifact = codepipeline.Artifact()
            source_stage = pipeline.add_stage(stage_name="Source")
            source_stage.add_action(actions.GitHubSourceAction(
                action_name="Pull_From_GitHub",
                owner="your-github-username",  # Replace with your GitHub username
                repo="your-repository-name",   # Replace with your repo name
                branch="main",                 # Replace with your target branch
                # Store your GitHub OAuth token in AWS Secrets Manager first
                oauth_token=SecretValue.secrets_manager("github-oauth-token"),
                output=source_artifact,
            ))

            # 4. Add the Deployment Stage (S3)
            deploy_stage = pipeline.add_stage(stage_name="Deploy_To_S3")
            deploy_stage.add_action(actions.S3DeployAction(
                action_name="Deploy_To_Bucket",
                bucket=deploy_bucket,
                input=source_artifact,
                extract=True,  # Automatically unzips the source artifact into the bucket
            ))
- Deploy the pipeline:
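- First, store your GitHub OAuth token in AWS Secrets Manager as a plaintext secret named github-oauth-token. The token needs the repo scope, plus admin:repo_hook for the webhook trigger that GitHubSourceAction uses by default.
- Run these commands:
    cdk bootstrap  # Only needed if you're deploying to this AWS account and region for the first time
    cdk deploy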
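The token-storing step can also be scripted instead of done in the console. Below is a minimal sketch, assuming the token is stored as a plain string under the same secret name; `store_github_token` is a hypothetical helper name, and the Secrets Manager client is passed in as an argument so the function itself does not import Boto3.

```python
def store_github_token(client, token, secret_name="github-oauth-token"):
    """Create the secret, or add a new version if it already exists.

    `client` is expected to be a boto3 Secrets Manager client.
    Returns the secret ARN reported by Secrets Manager.
    """
    try:
        response = client.create_secret(Name=secret_name, SecretString=token)
    except client.exceptions.ResourceExistsException:
        # The secret is already there, so store the token as its new value.
        response = client.put_secret_value(SecretId=secret_name, SecretString=token)
    return response["ARN"]


# Example usage (requires boto3 and configured AWS credentials):
#   import boto3, getpass
#   arn = store_github_token(boto3.client("secretsmanager"),
#                            getpass.getpass("GitHub token: "))
```

Reading the token with `getpass` in the usage example keeps it out of your shell history and source files.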
Boto3 Example: Create GitHub → S3 Pipeline
If you still need to use Boto3, here's a complete example. Note that you'll need to handle IAM roles and permissions manually:
    import json
    import time

    import boto3

    # Initialize clients
    codepipeline_client = boto3.client('codepipeline')
    iam_client = boto3.client('iam')
    secrets_manager_client = boto3.client('secretsmanager')

    # 1. Create or retrieve the CodePipeline service role
    role_name = "CodePipeline-Service-Role"
    try:
        role = iam_client.get_role(RoleName=role_name)
    except iam_client.exceptions.NoSuchEntityException:
        # Create role with trust policy for CodePipeline
        role = iam_client.create_role(
            RoleName=role_name,
            AssumeRolePolicyDocument=json.dumps({
                "Version": "2012-10-17",
                "Statement": [{
                    "Effect": "Allow",
                    "Principal": {"Service": "codepipeline.amazonaws.com"},
                    "Action": "sts:AssumeRole"
                }]
            })
        )
        # Attach necessary policies. These managed policies are broad; scope them
        # down for production. AWSCodePipeline_FullAccess replaces the deprecated
        # AWSCodePipelineFullAccess policy.
        iam_client.attach_role_policy(
            RoleName=role_name,
            PolicyArn="arn:aws:iam::aws:policy/AWSCodePipeline_FullAccess"
        )
        iam_client.attach_role_policy(
            RoleName=role_name,
            PolicyArn="arn:aws:iam::aws:policy/AmazonS3FullAccess"
        )
        # A newly created role can take a few seconds to become usable
        time.sleep(10)
    role_arn = role['Role']['Arn']

    # 2. Retrieve GitHub OAuth token from Secrets Manager (don't hardcode this!)
    github_token = secrets_manager_client.get_secret_value(
        SecretId="github-oauth-token"
    )['SecretString']

    # 3. Define and create the pipeline
    pipeline_response = codepipeline_client.create_pipeline(
        pipeline={
            "name": "GitHub-To-S3-Pipeline-Boto3",
            "roleArn": role_arn,
            "artifactStore": {
                "type": "S3",
                "location": "your-artifact-storage-bucket"  # Replace with a bucket for pipeline artifacts
            },
            "stages": [
                {
                    "name": "Source",
                    "actions": [{
                        "name": "GitHub_Source",
                        "actionTypeId": {
                            "category": "Source",
                            "owner": "ThirdParty",
                            "provider": "GitHub",
                            "version": "1"
                        },
                        "runOrder": 1,
                        "configuration": {
                            "Owner": "your-github-username",
                            "Repo": "your-repository-name",
                            "Branch": "main",
                            "OAuthToken": github_token
                        },
                        "outputArtifacts": [{"name": "SourceArtifact"}]
                    }]
                },
                {
                    "name": "Deploy",
                    "actions": [{
                        "name": "S3_Deploy",
                        "actionTypeId": {
                            "category": "Deploy",
                            "owner": "AWS",
                            "provider": "S3",
                            "version": "1"
                        },
                        "runOrder": 1,
                        "configuration": {
                            "BucketName": "your-unique-deployment-bucket",
                            "Extract": "true"
                        },
                        "inputArtifacts": [{"name": "SourceArtifact"}]
                    }]
                }
            ]
        }
    )
    # The response echoes the pipeline definition, which has a name but no ARN
    print(f"Pipeline created successfully: {pipeline_response['pipeline']['name']}")
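Once the pipeline exists, you will probably want its ARN and a quick look at whether each stage ran. As far as I know, `create_pipeline` returns only the pipeline definition, while `get_pipeline` reports the ARN in its `metadata` block. Here is a minimal sketch; `summarize_pipeline` is a hypothetical helper name, and "NotRun" is a placeholder label I chose for stages that have no execution yet.

```python
def summarize_pipeline(client, name):
    """Return the pipeline ARN and the latest status of each stage.

    `client` is expected to be a boto3 CodePipeline client.
    """
    metadata = client.get_pipeline(name=name)["metadata"]
    state = client.get_pipeline_state(name=name)
    stages = {
        # latestExecution is absent until the stage has run at least once
        stage["stageName"]: stage.get("latestExecution", {}).get("status", "NotRun")
        for stage in state.get("stageStates", [])
    }
    return {"arn": metadata["pipelineArn"], "stages": stages}


# Example usage (requires boto3 and configured AWS credentials):
#   import boto3
#   print(summarize_pipeline(boto3.client("codepipeline"),
#                            "GitHub-To-S3-Pipeline-Boto3"))
```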
Key Notes for Boto3:
- Never hardcode your GitHub token—always use Secrets Manager to retrieve it securely.
- The artifactStore bucket is separate from your deployment bucket; it's used to store temporary pipeline artifacts.
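The artifact bucket has to exist before `create_pipeline` is called. Here is a rough sketch of creating it with Boto3, assuming it should live in the same region as the pipeline; `create_artifact_bucket` is a hypothetical helper name, and the S3 client is passed in as an argument.

```python
def create_artifact_bucket(client, bucket_name, region):
    """Create the S3 bucket that CodePipeline uses for artifacts.

    `client` is expected to be a boto3 S3 client for `region`.
    """
    params = {"Bucket": bucket_name}
    if region != "us-east-1":
        # Outside us-east-1, S3 requires an explicit location constraint;
        # in us-east-1 the constraint must be omitted.
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    client.create_bucket(**params)
    return bucket_name


# Example usage (requires boto3 and configured AWS credentials):
#   import boto3
#   create_artifact_bucket(boto3.client("s3", region_name="eu-west-1"),
#                          "your-artifact-storage-bucket", "eu-west-1")
```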
Where to Find More Examples
- For CDK: The official AWS CDK Developer Guide has tons of beginner-friendly examples for CodePipeline, including different source and deployment targets.
- For Boto3: While the official docs lack full examples, you can piece together code using the CodePipeline API reference, combined with IAM permission best practices for pipeline roles.
This question comes from Stack Exchange; asked by KrisTej.




