Help needed: Hive Testbench TPC-H/TPC-DS data generation fails on a Hadoop cluster
Let's break down the issues you're facing and walk through actionable troubleshooting steps:
Key Observations from Your Logs
- Both TPC-H and TPC-DS MapReduce/Tez jobs report "completed successfully" but have 0 counters, which is highly abnormal—this means no actual data was processed or written.
- The target HDFS temp directories remain empty despite job completion.
- TPC-DS fails during the table optimization step for store_sales and store_returns, which is a downstream effect of missing raw data.
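The "empty directory" observation can be confirmed from the command line rather than by eye. The sketch below assumes the hdfs client is on PATH and relies on `hdfs dfs -count`, whose output columns are DIR_COUNT, FILE_COUNT, CONTENT_SIZE and PATHNAME. The function names are hypothetical helpers, not part of the testbench.

```shell
# Print FILE_COUNT from one line of `hdfs dfs -count` output, which has
# the form: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
file_count_from_count_line() {
    printf '%s\n' "$1" | awk '{ print $2 }'
}

# Report whether a generation directory actually contains files.
# Assumption: the `hdfs` client is on PATH and can reach the cluster.
report_generated_files() {
    dir=$1
    line=$(hdfs dfs -count "$dir") || { echo "cannot read $dir"; return 2; }
    files=$(file_count_from_count_line "$line")
    if [ "$files" -eq 0 ]; then
        echo "$dir: no files, the generator wrote nothing here"
        return 1
    fi
    echo "$dir: $files file(s)"
}
```

For example, `report_generated_files /tmp/tpch-generate/10` right after the generation job finishes shows whether the job really produced nothing.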
Step 1: Verify Temp Directory Paths (Local vs HDFS)
The most common root cause here is a mismatch between the script's expected output location and where the data is actually being written:
- Open tpch-setup.sh and tpcds-setup.sh and look for the GENERATE_DIR variable (the variable that holds the generation directory; it may be named differently, e.g. DIR, in your copy of the scripts).
  - If it resolves to a local path, that's a problem: MapReduce/Tez tasks run on cluster nodes, so local data won't be accessible across the cluster or visible in your HDFS temp directory. Note that a scheme-less value like /tmp/tpch-generate is resolved against fs.defaultFS, so it is local only when the client's default filesystem is not HDFS.
  - Update it to an explicit HDFS path, e.g.:
        GENERATE_DIR="hdfs://192.168.10.15:8020/tmp/tpch-generate"
- Check if local temp directories on your cluster nodes have any generated data:
      ssh <cluster-node> ls /tmp/tpch-generate/10/
  If you see files here, the script is generating data locally but failing to upload it to HDFS. Look for hdfs dfs -put commands in the setup scripts and verify they're executing without errors.
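To see how a given directory value would be interpreted, a small helper can classify it. This is only a sketch: Hadoop clients normally resolve a scheme-less path against fs.defaultFS, which `hdfs getconf -confKey fs.defaultFS` prints. The function names are hypothetical.

```shell
# Classify a path the way a Hadoop client would see it:
#   hdfs://...  -> explicitly HDFS
#   file:...    -> explicitly local
#   no scheme   -> resolved against fs.defaultFS
classify_path() {
    case "$1" in
        hdfs://*) echo "hdfs" ;;
        file:*)   echo "local" ;;
        *://*)    echo "other" ;;
        *)        echo "default-fs" ;;
    esac
}

# Explain where the generation directory would end up.
# Assumption: the `hdfs` client is on PATH.
explain_generate_dir() {
    kind=$(classify_path "$1")
    if [ "$kind" = "default-fs" ]; then
        echo "$1 has no scheme; it resolves against fs.defaultFS=$(hdfs getconf -confKey fs.defaultFS)"
    else
        echo "$1 is an explicit $kind path"
    fi
}
```

If `explain_generate_dir /tmp/tpch-generate` reports a default filesystem starting with file:///, the data is going to local disk on whichever node runs the task.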
Step 2: Fix HDFS Permissions
Your user (rapids) needs full read/write access to the temp directories:
- Create and set permissions for the TPC-H temp directory:
      hdfs dfs -mkdir -p /tmp/tpch-generate/10
      hdfs dfs -chmod -R 777 /tmp/tpch-generate
- Repeat the same for TPC-DS:
      hdfs dfs -mkdir -p /tmp/tpcds-generate/10
      hdfs dfs -chmod -R 777 /tmp/tpcds-generate
- Test your user's HDFS access:
      hdfs dfs -touchz /tmp/test_rapids && hdfs dfs -rm /tmp/test_rapids
  If this fails, you need to adjust HDFS ACLs or add your user to the appropriate group.
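After changing permissions it helps to verify that they actually took effect. A minimal sketch, assuming the hdfs client is on PATH: it reads the permission string from `hdfs dfs -ls -d` (which lists the directory itself rather than its contents) and checks the write bit for "other" users. The helper names are made up for illustration.

```shell
# Succeed if a permission string such as drwxrwxrwx grants write access
# to "other" users (the 9th character is the other-write bit).
others_can_write() {
    case "$1" in
        ????????w*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Print the permissions of a directory and whether it is world-writable.
# Assumption: the `hdfs` client is on PATH.
check_dir_writable_by_all() {
    dir=$1
    # The first field of the listing line is the permission string.
    perms=$(hdfs dfs -ls -d "$dir" | awk '/^[d-]/ { print $1; exit }')
    if others_can_write "$perms"; then
        echo "$dir: $perms (world-writable)"
    else
        echo "$dir: $perms (not world-writable)"
        return 1
    fi
}
```

Running `check_dir_writable_by_all /tmp/tpch-generate` and the same for /tmp/tpcds-generate should report world-writable once the chmod commands above have succeeded.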
Step 3: Inspect Map Task Logs for Hidden Errors
Even though the job says "completed successfully", individual map tasks might have silently failed or skipped data processing. Here's how to check:
- Open the YARN WebUI at http://boray05:8088 (from your logs).
- Locate the relevant job (e.g., job_1514226810133_0050 for TPC-H).
- Click into the job, then view the logs for each map task. Look for:
  - Errors related to input file not found
  - Permission denied when writing output
  - Issues with Tez container initialization
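The same logs can also be pulled from the command line with `yarn logs -applicationId`, which is easier to search than the WebUI. The sketch below assumes YARN log aggregation is enabled on the cluster; the function names and the grep patterns are illustrative choices, not an exhaustive list of failure signatures.

```shell
# Convert a MapReduce job ID (job_<timestamp>_<seq>) into the matching
# YARN application ID (application_<timestamp>_<seq>).
job_to_application_id() {
    case "$1" in
        job_*) echo "application_${1#job_}" ;;
        *)     echo "not a job ID: $1" >&2; return 1 ;;
    esac
}

# Fetch the aggregated logs for a job and keep only lines that hint at
# the failure classes listed above.
# Assumption: `yarn` is on PATH and log aggregation is enabled.
scan_job_logs() {
    app_id=$(job_to_application_id "$1") || return 1
    yarn logs -applicationId "$app_id" |
        grep -i -E 'FileNotFoundException|Permission denied|AccessControlException|ERROR'
}
```

For the TPC-H job from your logs this would be `scan_job_logs job_1514226810133_0050`. An empty result means none of these patterns appear, in which case the full `yarn logs` output is worth reading.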
Step 4: Validate Tez Configuration Compatibility
Your Tez setup looks mostly correct, but let's confirm a few critical points:
- Ensure Hive is configured to use Tez. Check hive-site.xml for the hive.execution.engine property, or run:
      hive -e "set hive.execution.engine;"
  It should return hive.execution.engine=tez.
- Verify the Tez lib path exists in HDFS:
      hdfs dfs -ls hdfs://192.168.10.15:8020/apps/tez/
  You should see Tez jar files here. If not, re-deploy Tez to HDFS.
- Resolve SLF4J binding conflicts (optional but improves logging). Your logs show "SLF4J: Class path contains multiple SLF4J bindings". This is a warning rather than an error, but it can send log output through a binding you did not expect and hide messages. Remove the duplicate binding jars (for example slf4j-log4j12-*.jar or log4j-slf4j-impl-*.jar) from the Hive/Tez/Hadoop lib directories so that only one binding remains, and keep a single consistent slf4j-api version (e.g., slf4j-api-1.7.25.jar).
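Two of these checks are easy to script. The sketch below extracts the value from the `key=value` line that `hive -e "set ...;"` prints, and lists candidate SLF4J binding jars under a set of lib directories. The jar name patterns are assumptions covering the common Log4j bindings, and the function names are hypothetical.

```shell
# Extract the value from a key=value line such as
# hive.execution.engine=tez
setting_value() {
    printf '%s\n' "$1" | sed -n 's/^[^=]*=//p'
}

# List jars that look like SLF4J bindings under the given directories.
# Assumption: bindings follow the usual slf4j-log4j12-* or
# log4j-slf4j-impl-* naming; other bindings would need extra patterns.
find_slf4j_bindings() {
    find "$@" -type f \( -name 'slf4j-log4j12-*.jar' \
        -o -name 'log4j-slf4j-impl-*.jar' \) 2>/dev/null | sort
}
```

A possible invocation is `find_slf4j_bindings "$HIVE_HOME/lib" "$HADOOP_HOME/share/hadoop/common/lib"`, where both directory locations are assumptions about a typical install. More than one line of output means more than one binding is on the class path.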
Step 5: Check Version Compatibility
While your stack should be compatible (Hadoop 2.9.0 supports Hive 2.3.0 and Tez 0.9.0), double-check:
- Tez 0.9.0 requires Hadoop 2.7 or later (you're on 2.9.0, which is fine)
- Hive 2.3.0's Tez integration is stable with Tez 0.9.0
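The Hadoop side of this check can be automated. A minimal sketch, assuming `hadoop version` prints a first line of the form "Hadoop 2.9.0": the helper names are hypothetical, and the comparison only looks at the first three numeric components.

```shell
# Succeed if dotted version $1 is >= dotted version $2, comparing up to
# three numeric components (missing components count as 0).
version_at_least() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            if ((x[i] + 0) > (y[i] + 0)) exit 0
            if ((x[i] + 0) < (y[i] + 0)) exit 1
        }
        exit 0
    }'
}

# Print the version from the first line of `hadoop version`.
# Assumption: `hadoop` is on PATH.
hadoop_version() {
    hadoop version | awk 'NR == 1 { print $2 }'
}
```

With these, `version_at_least "$(hadoop_version)" 2.7 && echo "Hadoop is new enough for Tez 0.9.0"` confirms the first bullet on the node where the jobs are submitted.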
If you've gone through all these steps and still have issues, share the map task logs from YARN—they'll give us the specific error that's being hidden by the "success" job status.
The question comes from Stack Exchange; asked by robert.




