Parsing a JSON File with Python: Extracting Specific Fields from IAM Roles

Ahua AIGC Lab (阿华AIGC实验室)

2026-5-21

Parse IAM Role JSON from OneDrive & Extract Specific Fields with Python

Got it, let's walk through how to fetch your OneDrive-hosted JSON file, parse it, and extract the exact fields you need: RoleName, AttachedManagedPolicies, and Principal.Service (found inside each role's AssumeRolePolicyDocument). Here's a step-by-step solution:

1. Setup Dependencies

First, make sure you have the requests library installed (we'll use it to fetch the file from OneDrive):

pip install requests

2. Full Python Code

This script handles fetching the file, parsing the JSON, and extracting your target fields. It also includes basic error handling for common issues like network failures, invalid JSON, or missing fields:

import requests
import json

def extract_iam_role_fields(onedrive_url):
    try:
        # Fetch the JSON file from OneDrive
        response = requests.get(onedrive_url, timeout=30)  # Timeout so a stalled connection can't hang forever
        response.raise_for_status()  # Raise error for HTTP issues (4xx/5xx)
        
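        # Parse JSON content (expected: a list of role dictionaries).
        # json.loads is used instead of response.json() so that invalid JSON
        # reaches the json.JSONDecodeError handler below rather than the
        # RequestException handler.
        iam_roles = json.loads(response.content)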
        
        # List to store extracted data
        extracted_data = []
        
        # Iterate through each IAM role in the JSON
        for role in iam_roles:
            # Extract RoleName (handle case where it might be missing)
            role_name = role.get("RoleName", "N/A")
            
            # Extract AttachedManagedPolicies (default to empty list if missing)
            attached_policies = role.get("AttachedManagedPolicies", [])
            # Optional: Extract just the policy names if you don't need full objects
            # attached_policy_names = [policy.get("PolicyName") for policy in attached_policies]
            
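            # Extract Principal.Service from AssumeRolePolicyDocument
            principal_service = "N/A"
            assume_role_doc = role.get("AssumeRolePolicyDocument", {})
            statements = assume_role_doc.get("Statement", [])
            # IAM allows Statement to be a single object instead of a list
            if isinstance(statements, dict):
                statements = [statements]
            
            # Check each statement for Principal.Service
            for stmt in statements:
                principal = stmt.get("Principal", {})
                # Principal can also be the string "*", so check the type first
                if isinstance(principal, dict) and "Service" in principal:
                    # The value may be a single string or a list of strings
                    principal_service = principal["Service"]
                    # If you only need the first matching service, break here
                    break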
            
            # Add extracted data to the list
            extracted_data.append({
                "RoleName": role_name,
                "AttachedManagedPolicies": attached_policies,
                "PrincipalService": principal_service
            })
        
        return extracted_data
    
    except requests.exceptions.RequestException as e:
        print(f"Error fetching the file: {e}")
        return None
    except json.JSONDecodeError as e:
        print(f"Error parsing JSON: {e}")
        return None
    except Exception as e:
        print(f"Unexpected error: {e}")
        return None

# Your OneDrive JSON URL
onedrive_url = "https://1drv.ms/u/s!AizscpxS0QM4hJo5SnYOHAcjng-jww"

# Run the extraction
result = extract_iam_role_fields(onedrive_url)

# Print or process the result
if result:
    for item in result:
        print("\n--- Extracted Role Data ---")
        print(f"Role Name: {item['RoleName']}")
        print(f"Attached Managed Policies: {item['AttachedManagedPolicies']}")
        print(f"Principal Service: {item['PrincipalService']}")
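The loop above assumes the top level of the file is a list of role objects. The original file's layout isn't shown, so as a minimal sketch, the helper below accepts a few plausible shapes: a list, a single role object, or a wrapper object. The wrapper keys `RoleDetailList` and `Roles` are assumptions about this file, and the helper names and sample data are made up for illustration. It also shows the optional "policy names only" extraction mentioned in the code comment.

```python
def normalize_roles(data):
    """Return a list of role dicts from several plausible top-level shapes."""
    if isinstance(data, list):
        return data
    if isinstance(data, dict):
        # Assumed wrapper keys; adjust to whatever your file actually uses.
        for key in ("RoleDetailList", "Roles"):
            if isinstance(data.get(key), list):
                return data[key]
        # Otherwise treat the dict as a single role object.
        return [data]
    raise TypeError(f"Unsupported top-level JSON type: {type(data).__name__}")


def policy_names(role):
    """Return just the PolicyName of each attached managed policy."""
    return [p.get("PolicyName", "N/A") for p in role.get("AttachedManagedPolicies", [])]


# Made-up sample data for illustration only.
sample = {
    "RoleDetailList": [
        {
            "RoleName": "example-role",
            "AttachedManagedPolicies": [
                {
                    "PolicyName": "ExamplePolicy",
                    "PolicyArn": "arn:aws:iam::aws:policy/ExamplePolicy",
                }
            ],
        }
    ]
}

roles = normalize_roles(sample)
names = [(r.get("RoleName", "N/A"), policy_names(r)) for r in roles]
print(names)
```

Because these helpers work on already-parsed data, they can be tried against a local copy of the file (loaded with json.load) before any network fetching is involved.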

3. Key Notes

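Handling Missing Fields: The script uses .get() with fallback values (like "N/A" or an empty list) to avoid crashes if a field is missing from some roles.
Multiple Statements: If an IAM role's AssumeRolePolicyDocument has multiple Statement entries, the script grabs the first Principal.Service it finds. Removing the break alone is not enough to get all of them, since each match would overwrite the previous one; to keep every matching service, append each one to a list instead of assigning to a single variable. Note that the Service value itself can be either a single string or a list of strings.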
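OneDrive URL: A 1drv.ms sharing link may resolve to an HTML preview page rather than the raw file, in which case JSON parsing will fail. If that happens, use a direct-download link instead, and ensure the link is set to allow public access (or add authentication if the file is private).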
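To illustrate the "Multiple Statements" note, here is a minimal sketch that gathers every service principal from a trust policy rather than only the first. The function name `collect_principal_services` and the sample document are hypothetical; it assumes the document is already a parsed dictionary.

```python
def collect_principal_services(assume_role_doc):
    """Return all distinct Principal.Service values, in order of appearance."""
    services = []
    statements = assume_role_doc.get("Statement", [])
    # Statement may be a single object rather than a list.
    if isinstance(statements, dict):
        statements = [statements]
    for stmt in statements:
        principal = stmt.get("Principal", {})
        # Principal can be the string "*"; skip anything that is not a dict.
        if not isinstance(principal, dict):
            continue
        value = principal.get("Service", [])
        # Service may be one string or a list of strings.
        if isinstance(value, str):
            value = [value]
        for service in value:
            if service not in services:
                services.append(service)
    return services


# Made-up trust policy for illustration only.
sample_doc = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "elasticbeanstalk.amazonaws.com"},
            "Action": "sts:AssumeRole",
        },
        {
            "Effect": "Allow",
            "Principal": {"Service": ["ec2.amazonaws.com", "elasticbeanstalk.amazonaws.com"]},
            "Action": "sts:AssumeRole",
        },
        {"Effect": "Allow", "Principal": "*", "Action": "sts:AssumeRole"},
    ],
}

print(collect_principal_services(sample_doc))
# -> ['elasticbeanstalk.amazonaws.com', 'ec2.amazonaws.com']
```

Returning a list in every case (even for a single service) keeps downstream code from having to branch on string versus list.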

4. Output Example

For the role snippet you provided, the output would look like:

--- Extracted Role Data ---
Role Name: [Your Role's Name]
Attached Managed Policies: []  # Or list of policy objects if present
Principal Service: elasticbeanstalk.amazonaws.com

The question comes from Stack Exchange; asked by Milister.