Kubernetes部署Presto集群连接S3报400 Bad Request错误求助
Presto Hive Connector S3 V4 Signature Issue:
AWSBadRequestException When Creating Schema 问题描述
我在Kubernetes上使用Starburst的Presto Operator部署了Presto集群,尝试通过兼容API连接S3存储桶。已经配置了Hive连接器参数,并添加了包含AWS_ACCESS_KEY_ID和AWS_SECRET_ACCESS_KEY的s3-secret。
但执行以下命令时:
CREATE SCHEMA ban WITH (LOCATION = 's3a://path/to/schema');
Hive元数据存储抛出org.apache.hadoop.fs.s3a.AWSBadRequestException错误(状态码400)。查询资料得知这是由于S3仅支持V4签名API,但客户端配置了默认端点,解决方案是设置fs.s3a.endpoint属性,不过该属性在Presto的Hive配置中似乎无效。
环境信息
- Kubernetes版本:1.18
- Presto Operator镜像:
starburstdata/presto-operator:341-e-k8s-0.35
完整错误堆栈
2020-09-29T12:20:34,120 ERROR [pool-7-thread-2] utils.MetaStoreUtils: Got exception: org.apache.hadoop.fs.s3a.AWSBadRequestException doesBucketExist on presto: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: txc7f0218a02574ba59ec91-005f732692; S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692), S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692:400 Bad Request: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: txc7f0218a02574ba59ec91-005f732692; S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692) org.apache.hadoop.fs.s3a.AWSBadRequestException: doesBucketExist on presto: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: txc7f0218a02574ba59ec91-005f732692; S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692), S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692:400 Bad Request: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: txc7f0218a02574ba59ec91-005f732692; S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692) at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:212) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:111) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:317) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:256) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:231) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:372) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:308) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3303) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:115) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:141) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:147) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] at org.apache.hadoop.hive.metastore.Warehouse.determineDatabasePath(Warehouse.java:190) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database_core(HiveMetaStore.java:1265) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:1425) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_232] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_232] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_232] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_232] at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] at com.sun.proxy.$Proxy26.create_database(Unknown Source) [?:?] at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_database.getResult(ThriftHiveMetastore.java:14861) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_database.getResult(ThriftHiveMetastore.java:14845) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:104) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_232] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_232] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232] Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: txc7f0218a02574ba59ec91-005f732692; S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692) at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) ~[aws-java-sdk-bundle-1.11.271.jar:?] at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) ~[aws-java-sdk-bundle-1.11.271.jar:?] at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056) ~[aws-java-sdk-bundle-1.11.271.jar:?] at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) ~[aws-java-sdk-bundle-1.11.271.jar:?] at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) ~[aws-java-sdk-bundle-1.11.271.jar:?] at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) ~[aws-java-sdk-bundle-1.11.271.jar:?] at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) ~[aws-java-sdk-bundle-1.11.271.jar:?] at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) ~[aws-java-sdk-bundle-1.11.271.jar:?] at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) ~[aws-java-sdk-bundle-1.11.271.jar:?] at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) ~[aws-java-sdk-bundle-1.11.271.jar:?] at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) ~[aws-java-sdk-bundle-1.11.271.jar:?] at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1337) ~[aws-java-sdk-bundle-1.11.271.jar:?] at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1277) ~[aws-java-sdk-bundle-1.11.271.jar:?] at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$1(S3AFileSystem.java:373) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?] at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?] ... 33 more
解决方案
我之前也碰到过一模一样的问题,其实核心原因是Presto的Hive连接器依赖Hadoop的s3a客户端,部分Hadoop配置参数无法直接在Presto的hive.properties中生效,必须通过指向Hadoop核心配置文件的方式传递。下面是具体的解决步骤:
1. 配置完整的S3签名与端点参数
首先创建一个包含Hadoop core-site.xml配置的ConfigMap,里面需要包含以下关键参数,而不是只加fs.s3a.endpoint:
<?xml version="1.0" encoding="UTF-8"?> <configuration> <property> <name>fs.s3a.endpoint</name> <value>s3.<你的S3区域>.amazonaws.com</value> <!-- 示例:s3.us-west-2.amazonaws.com --> </property> <property> <name>fs.s3a.signing-algorithm</name> <value>S3SignerType</value> <!-- 强制启用V4签名,解决旧签名版本不兼容问题 --> </property> <property> <name>fs.s3a.path.style.access</name> <value>false</value> <!-- AWS S3默认使用虚拟主机风格访问,设为false;如果是自建S3兼容存储可能需要设为true --> </property> <property> <name>fs.s3a.access.key</name> <value>${env:AWS_ACCESS_KEY_ID}</value> <!-- 通过环境变量引用已配置的secret --> </property> <property> <name>fs.s3a.secret.key</name> <value>${env:AWS_SECRET_ACCESS_KEY}</value> </property> </configuration>
2. 通过Presto Operator挂载配置
在你的Presto集群CRD配置中,添加以下配置来挂载这个ConfigMap到所有Presto节点和Hive metastore容器:
spec: coordinator: additionalVolumes: - name: hadoop-config configMap: name: hadoop-core-site additionalVolumeMounts: - name: hadoop-config mountPath: /etc/hadoop/conf workers: additionalVolumes: - name: hadoop-config configMap: name: hadoop-core-site additionalVolumeMounts: - name: hadoop-config mountPath: /etc/hadoop/conf hive: additionalVolumes:




