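Help needed: fetching the EKS cluster CA certificate and JWT token fails after switching roles across AWS accounts

Unauthorized error when operating on EKS clusters after switching IAM roles across AWS accounts

I wrote a Bash script that assumes an IAM role in each AWS account, lists the EKS clusters, updates the kubeconfig, and fetches the JWT token and CA certificate. When it runs, it hits an Unauthorized error and cannot finish. The script and the error output are below: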

#!/bin/bash

# Set AWS region
AWS_REGION="us-east-1"
INPUT_FILE="inputfile"

# Function to assume role and get temporary credentials
assume_role() {
    local account_id=$1
    local account_name=$2
    local role_arn="arn:aws:iam::${account_id}:role/support-admin"
    local session_name="${account_id}"

    echo "Assuming role for account: ${account_name} (${account_id})"

    # Assume role and get temporary credentials
    credentials=$(aws sts assume-role --role-arn "${role_arn}" --role-session-name "${session_name}")

    if [ $? -ne 0 ]; then
        echo "Cannot assume role for account: ${account_id} (${account_name})"
        echo "--------------------------------------------------------------------"
        return
    fi

    export AWS_ACCESS_KEY_ID=$(echo "${credentials}" | jq -r '.Credentials.AccessKeyId')
    export AWS_SECRET_ACCESS_KEY=$(echo "${credentials}" | jq -r '.Credentials.SecretAccessKey')
    export AWS_SESSION_TOKEN=$(echo "${credentials}" | jq -r '.Credentials.SessionToken')

    list_eks_clusters "${account_name}"
    echo "--------------------------------------------------------------------"
}

# Function to list EKS clusters
list_eks_clusters() {
    local account_name=$1

    echo "Listing EKS clusters for account: ${account_name}"

    clusters=$(aws eks list-clusters --region "${AWS_REGION}" | jq -r '.clusters[]')

    if [ $? -ne 0 ]; then
        echo "Error listing EKS clusters for account: ${account_name}"
        return
    fi

    for cluster in ${clusters}; do
        echo "Cluster: ${cluster}"
        # describe_cluster "${cluster}"
    done
}

# Function to describe EKS cluster, update kubeconfig, and get CA certificate
describe_cluster() {
    local cluster_name=$1

    echo "Describing cluster: ${cluster_name}"

    cluster_info=$(aws eks describe-cluster --name "${cluster_name}" --region "${AWS_REGION}")

    if [ $? -ne 0 ]; then
        echo "Error describing cluster: ${cluster_name}"
        return
    fi

    cluster_endpoint=$(echo "${cluster_info}" | jq -r '.cluster.endpoint')
    echo "Cluster ${cluster_name} endpoint: ${cluster_endpoint}"
    
    # Update kubeconfig for the cluster and write to custom location
    CUSTOM_KUBECONFIG="kubeconfig"
    aws eks --region "${AWS_REGION}" update-kubeconfig --name "${cluster_name}" --kubeconfig "${CUSTOM_KUBECONFIG}"

    # Get the authentication token for the cluster

    export KUBECONFIG=kubeconfig
    kubectl config view --kubeconfig="${KUBECONFIG}"

    token=$(aws eks get-token --cluster-name "${cluster_name}" --region "${AWS_REGION}" | jq -r '.status.token')
    echo "Token: ${token}"
    kubectl config set-credentials arn:aws:eks:${AWS_REGION}:${account_id}:cluster/${cluster_name} --token="${token}"

    # Ensure the kubeconfig context is set correctly
    kubectl config use-context arn:aws:eks:${AWS_REGION}:${account_id}:cluster/${cluster_name}

    # Get all namespaces from the cluster using the token - works this here 
    kubectl --kubeconfig="${KUBECONFIG}" get namespaces

    
    # Get the CA certificate from the vault-token secret in the kube-system namespace - dont get exact ca cert of cluster selected
    KUBE_CA_CERT=$(kubectl get secret vault-token -n kube-system -o json --token="${token}" | jq -r '.data | ."ca.crt"' | base64 --decode)
    echo "KUBE_CA_CERT: ${KUBE_CA_CERT}"
    echo "--------------------------------------------------------------------"
}

# Read accounts from input file and assume role for each account
if [ ! -f "${INPUT_FILE}" ]; then
    echo "Input file not found: ${INPUT_FILE}"
    exit 1
fi

while IFS=, read -r account_id account_name; do
    assume_role "${account_id}" "${account_name}"
done < "${INPUT_FILE}"

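Error message

Switched to context "arn:aws:eks:us-east-1:***:cluster/eks7-".
E0213 16:15:03.105457 57722 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: the server has asked for the client to provide credentials"
E0213 16:15:03.188119 57722 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: the server has asked for the client to provide credentials"
E0213 16:15:03.290776 57722 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: the server has asked for the client to provide credentials"
error: You must be logged in to the server (Unauthorized)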


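Problem analysis and fix

Core issues

  1. Implicit variable scope: describe_cluster uses account_id, which is declared local in assume_role and never passed in. Bash's dynamic scoping makes it visible only while describe_cluster runs underneath assume_role in the call chain, so any other call path leaves it empty and the context name written to the kubeconfig carries the wrong account ID.
  2. Redundant kubeconfig configuration: aws eks update-kubeconfig already sets up the kubeconfig to fetch a token dynamically through the AWS CLI. Hardcoding a token by hand conflicts with that setup, and the token expires after a short time.
  3. Wrong source for the CA certificate: the ca.crt read from the vault-token secret is not the right source for the EKS cluster's CA certificate. The cluster CA certificate can be taken directly from the aws eks describe-cluster output, with no need to reach the cluster through kubectl.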

Key changes in the fix

1. Pass account_id to describe_cluster

Change list_eks_clusters to accept an account_id parameter and pass it on to describe_cluster:

# Function to list EKS clusters
list_eks_clusters() {
    local account_name=$1
    local account_id=$2

    echo "Listing EKS clusters for account: ${account_name}"

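    # Capture the raw response first so the exit status checked below is from aws, not jq
    cluster_json=$(aws eks list-clusters --region "${AWS_REGION}")

    if [ $? -ne 0 ]; then
        echo "Error listing EKS clusters for account: ${account_name}"
        return
    fi

    clusters=$(echo "${cluster_json}" | jq -r '.clusters[]')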

    for cluster in ${clusters}; do
        echo "Cluster: ${cluster}"
        describe_cluster "${cluster}" "${account_id}"
    done
}

Also update the call to list_eks_clusters inside assume_role:

list_eks_clusters "${account_name}" "${account_id}"
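
The explicit hand-off described above can be tried without touching AWS. The following is a minimal sketch: the three function names and the sample account ID and cluster name are hypothetical stand-ins, and the context format assumes the default cluster-ARN naming that aws eks update-kubeconfig uses when no alias is given.

```bash
# Hypothetical stand-ins for the real functions; no AWS calls are made.

# Builds the context name in cluster-ARN form (assumed default naming).
build_context_name() {
    local region=$1
    local account_id=$2
    local cluster_name=$3
    printf 'arn:aws:eks:%s:%s:cluster/%s\n' "${region}" "${account_id}" "${cluster_name}"
}

# Receives account_id as an explicit argument instead of relying on a
# variable that happens to be in scope in a caller.
describe_cluster_demo() {
    local cluster_name=$1
    local account_id=$2
    build_context_name "us-east-1" "${account_id}" "${cluster_name}"
}

# Forwards account_id to every describe_cluster_demo call.
list_clusters_demo() {
    local account_id=$1
    shift
    local cluster
    for cluster in "$@"; do
        describe_cluster_demo "${cluster}" "${account_id}"
    done
}
```

Calling list_clusters_demo with an account ID followed by cluster names prints one context name per cluster, and each function works the same no matter who calls it.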

2. Simplify the kubeconfig setup and drop the manual token

Delete the redundant code that fetches and sets a token by hand, and keep the automatic configuration done by aws eks update-kubeconfig:

# Remove the following block
# token=$(aws eks get-token --cluster-name "${cluster_name}" --region "${AWS_REGION}" | jq -r '.status.token')
# echo "Token: ${token}"
# kubectl config set-credentials arn:aws:eks:${AWS_REGION}:${account_id}:cluster/${cluster_name} --token="${token}"
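
To confirm that a generated kubeconfig really relies on the dynamic token lookup, a rough text check can help. This is only a sketch: the helper name kubeconfig_uses_exec_token is made up for illustration, and it assumes the user entry written by aws eks update-kubeconfig contains an exec: section whose arguments include get-token. It does not parse the YAML.

```bash
# Hypothetical helper: returns 0 when the kubeconfig file appears to fetch
# its token through an exec plugin that runs "get-token", 1 otherwise.
# This is a plain-text check, not a YAML parser.
kubeconfig_uses_exec_token() {
    local kubeconfig_path=$1
    grep -q 'exec:' "${kubeconfig_path}" && grep -q 'get-token' "${kubeconfig_path}"
}
```

A possible use inside describe_cluster would be to call it on "${CUSTOM_KUBECONFIG}" right after update-kubeconfig and print a warning when it returns non-zero.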

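3. Read the CA certificate directly from the describe-cluster output

Replace the logic that reads the CA from a secret and extract the cluster's own CA certificate instead:

# Get the cluster CA certificate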
KUBE_CA_CERT=$(echo "${cluster_info}" | jq -r '.cluster.certificateAuthority.data' | base64 --decode)
echo "KUBE_CA_CERT: ${KUBE_CA_CERT}"
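
Since jq -r prints the literal string null when a field is missing, it can be worth checking the decoded value before using it. The sketch below uses a hypothetical helper, decode_ca_cert, and assumes the decoded certificate is PEM text that starts with the usual BEGIN CERTIFICATE header.

```bash
# Hypothetical helper: decodes base64 certificate data and prints it only
# if the result looks like a PEM certificate. Returns 1 otherwise.
decode_ca_cert() {
    local ca_data=$1
    local pem
    # A decode failure (invalid base64) is treated as "no certificate".
    pem=$(printf '%s' "${ca_data}" | base64 --decode 2>/dev/null) || return 1
    case "${pem}" in
        "-----BEGIN CERTIFICATE-----"*) printf '%s\n' "${pem}" ;;
        *) return 1 ;;
    esac
}
```

In the script, this could wrap the value extracted with jq, for example by passing the output of jq -r '.cluster.certificateAuthority.data' as the argument and skipping the cluster when the helper returns non-zero.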

4. Make sure kubectl uses the intended kubeconfig

Pass the kubeconfig path explicitly when running kubectl, so that environment variables cannot interfere:

kubectl --kubeconfig="${CUSTOM_KUBECONFIG}" get namespaces

Complete fixed script

#!/bin/bash

# Set AWS region
AWS_REGION="us-east-1"
INPUT_FILE="inputfile"

# Function to assume role and get temporary credentials
assume_role() {
    local account_id=$1
    local account_name=$2
    local role_arn="arn:aws:iam::${account_id}:role/support-admin"
    local session_name="${account_id}"

    echo "Assuming role for account: ${account_name} (${account_id})"

    # Assume role and get temporary credentials
    credentials=$(aws sts assume-role --role-arn "${role_arn}" --role-session-name "${session_name}")

    if [ $? -ne 0 ]; then
        echo "Cannot assume role for account: ${account_id} (${account_name})"
        echo "--------------------------------------------------------------------"
        return
    fi

    export AWS_ACCESS_KEY_ID=$(echo "${credentials}" | jq -r '.Credentials.AccessKeyId')
    export AWS_SECRET_ACCESS_KEY=$(echo "${credentials}" | jq -r '.Credentials.SecretAccessKey')
    export AWS_SESSION_TOKEN=$(echo "${credentials}" | jq -r '.Credentials.SessionToken')

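    list_eks_clusters "${account_name}" "${account_id}"

    # Drop the temporary credentials so the next assume-role call starts from the base identity
    # (assumes the base credentials come from a profile or instance role, not from these variables)
    unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN
    echo "--------------------------------------------------------------------"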
}

# Function to list EKS clusters
list_eks_clusters() {
    local account_name=$1
    local account_id=$2

    echo "Listing EKS clusters for account: ${account_name}"

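    # Capture the raw response first so the exit status checked below is from aws, not jq
    cluster_json=$(aws eks list-clusters --region "${AWS_REGION}")

    if [ $? -ne 0 ]; then
        echo "Error listing EKS clusters for account: ${account_name}"
        return
    fi

    clusters=$(echo "${cluster_json}" | jq -r '.clusters[]')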

    for cluster in ${clusters}; do
        echo "Cluster: ${cluster}"
        describe_cluster "${cluster}" "${account_id}"
    done
}

# Function to describe EKS cluster, update kubeconfig, and get CA certificate
describe_cluster() {
    local cluster_name=$1
    local account_id=$2

    echo "Describing cluster: ${cluster_name}"

    cluster_info=$(aws eks describe-cluster --name "${cluster_name}" --region "${AWS_REGION}")

    if [ $? -ne 0 ]; then
        echo "Error describing cluster: ${cluster_name}"
        return
    fi

    cluster_endpoint=$(echo "${cluster_info}" | jq -r '.cluster.endpoint')
    echo "Cluster ${cluster_name} endpoint: ${cluster_endpoint}"
    
    # Update kubeconfig for the cluster and write to custom location
    CUSTOM_KUBECONFIG="kubeconfig_${cluster_name}"
    aws eks --region "${AWS_REGION}" update-kubeconfig --name "${cluster_name}" --kubeconfig "${CUSTOM_KUBECONFIG}"

    export KUBECONFIG="${CUSTOM_KUBECONFIG}"
    kubectl config view --kubeconfig="${KUBECONFIG}"

    # Ensure the kubeconfig context is set correctly
    kubectl config use-context arn:aws:eks:${AWS_REGION}:${account_id}:cluster/${cluster_name}

    # Get all namespaces from the cluster
    kubectl --kubeconfig="${KUBECONFIG}" get namespaces

    # Get the cluster CA certificate directly from describe-cluster result
    KUBE_CA_CERT=$(echo "${cluster_info}" | jq -r '.cluster.certificateAuthority.data' | base64 --decode)
    echo "KUBE_CA_CERT: ${KUBE_CA_CERT}"
    echo "--------------------------------------------------------------------"
}

# Read accounts from input file and assume role for each account
if [ ! -f "${INPUT_FILE}" ]; then
    echo "Input file not found: ${INPUT_FILE}"
    exit 1
fi

while IFS=, read -r account_id account_name; do
    assume_role "${account_id}" "${account_name}"
done < "${INPUT_FILE}"
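
A multi-account loop like this depends on every assume-role call starting from the base identity, so temporary credentials left over from the previous account must not linger in the environment. One way to guarantee that isolation is to confine the exports to a subshell. The sketch below uses a hypothetical helper, run_with_credentials, and placeholder credential values.

```bash
# Hypothetical helper: runs a command with temporary AWS credentials inside
# a subshell, so the exported variables never reach the calling shell.
# Usage: run_with_credentials ACCESS_KEY SECRET_KEY SESSION_TOKEN command [args...]
run_with_credentials() {
    local access_key=$1
    local secret_key=$2
    local session_token=$3
    shift 3
    (
        export AWS_ACCESS_KEY_ID="${access_key}"
        export AWS_SECRET_ACCESS_KEY="${secret_key}"
        export AWS_SESSION_TOKEN="${session_token}"
        "$@"
    )
}
```

In the script, the three values parsed from the assume-role response could be passed to such a helper together with list_eks_clusters and its arguments, since shell functions can be invoked through "$@" as well.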

This question comes from Stack Exchange; asked by Raghu.
