Questions about modifying, syncing, and restarting YARN configuration on an EMR 5.8.0 cluster
Great question! Let's break this down step by step for your EMR 5.8.0 cluster:
1. Do I only need to modify the yarn-site.xml on the Hadoop master node?
Short answer: No, not exactly. While you’ll make the initial edit on the master node (where the ResourceManager runs), the yarn-site.xml file needs to have identical configurations on all core and task nodes (where NodeManagers operate) for YARN to function consistently.
That said, manual config edits aren’t the most durable approach on EMR. Nodes added later by a resize, or brought in to replace a failed instance, are provisioned from the configuration the cluster was created with, so they will not carry your hand edits. For persistent changes, supply a yarn-site configuration classification when you create the cluster (the --configurations option of aws emr create-cluster). Reconfiguring a running cluster through the EMR API only became available in release 5.21.0, so it is not an option on 5.8.0. If you need a quick, one-off tweak on the existing cluster, the manual steps below will work.
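For the persistent route, a configuration file for cluster creation might look like the sketch below. The property name is a real YARN setting, but the value, the file name yarn-config.json, and the instance parameters in the commented command are assumptions chosen only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: declare YARN settings at cluster creation so every node receives them.
# ASSUMPTION: the property value below is a placeholder, not a recommendation.
cat > yarn-config.json <<'EOF'
[
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.nodemanager.resource.memory-mb": "12288"
    }
  }
]
EOF

# Pass the file when launching the cluster (not run here; instance type,
# count, and roles are illustrative and must be adapted to your account):
#   aws emr create-cluster --release-label emr-5.8.0 \
#     --applications Name=Hadoop \
#     --instance-type m4.large --instance-count 3 \
#     --use-default-roles \
#     --configurations file://yarn-config.json
```

Settings supplied this way are written into yarn-site.xml on each node as it is provisioned, including nodes added by a later resize.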
2. How to sync config changes to data nodes?
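pdsh (parallel shell) makes it easy to run one command on many nodes at once, but don’t assume it is ready to use: check that it is installed (which pdsh), and note that the -g core,task option used below only works if those groups are defined in a genders file. The copy command also assumes that root on each worker can authenticate to the master over SSH. With those in place, here’s how to sync your updated yarn-site.xml: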
- First, edit the file on the master node using your preferred editor:
  sudo nano /etc/hadoop/conf/yarn-site.xml
- Save your changes, then run this command on the master to copy the updated file to all core and task nodes (replace <MASTER_PRIVATE_IP> with your master node’s private IP, which you can get with hostname -i):
  pdsh -g core,task 'sudo scp <MASTER_PRIVATE_IP>:/etc/hadoop/conf/yarn-site.xml /etc/hadoop/conf/'
- Double-check the sync worked by SSHing into any core node and running this to confirm your changes are present:
  cat /etc/hadoop/conf/yarn-site.xml
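If pdsh groups are not set up, a plain loop run from the master can do the same job in push style. This is only a sketch: the file workers.txt (one worker hostname per line), the function name sync_yarn_site, and the DRY_RUN switch are all hypothetical, and it assumes the current user can SSH to each worker and use sudo there.

```shell
#!/usr/bin/env bash
# Push yarn-site.xml from the master to every worker listed in a hosts file.
# ASSUMPTIONS: hosts file has one hostname per line; passwordless SSH and sudo
# work on the workers. DRY_RUN defaults to 1, which only prints the commands.
CONF=/etc/hadoop/conf/yarn-site.xml

sync_yarn_site() {
  local hosts_file=$1 host
  while read -r host || [ -n "$host" ]; do
    [ -z "$host" ] && continue            # skip blank lines
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "scp $CONF $host:/tmp/yarn-site.xml && ssh $host sudo cp /tmp/yarn-site.xml $CONF"
    else
      # stdin is redirected so scp/ssh do not swallow the rest of the hosts file
      scp "$CONF" "$host:/tmp/yarn-site.xml" < /dev/null &&
        ssh -n "$host" "sudo cp /tmp/yarn-site.xml $CONF"
    fi
  done < "$hosts_file"
}

# Usage (not run here):
#   sync_yarn_site workers.txt              # preview the commands
#   DRY_RUN=0 sync_yarn_site workers.txt    # actually copy
```

One way to build workers.txt is from the Node-Id column of yarn node -list, with the port suffix removed. Comparing md5sum output for the file on the master and on a worker is a quicker check than reading the XML by eye.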
3. Can I just restart YARN using standard steps?
Yes, but you need to restart both the ResourceManager (on the master) and all NodeManagers (on core/task nodes) to apply the changes fully. Here’s the proper workflow:
- On the master node, restart the ResourceManager:
  sudo stop hadoop-yarn-resourcemanager && sudo start hadoop-yarn-resourcemanager
- Use pdsh to restart NodeManagers across all core and task nodes in parallel:
  pdsh -g core,task 'sudo stop hadoop-yarn-nodemanager && sudo start hadoop-yarn-nodemanager'
- Verify services are running:
  - On the master:
    sudo status hadoop-yarn-resourcemanager
  - On all core/task nodes (via parallel check):
    pdsh -g core,task 'sudo status hadoop-yarn-nodemanager'
On EMR 5.x the upstart jobs carry a hadoop- prefix; sudo initctl list | grep yarn shows the exact job names on your nodes.
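With many nodes, scanning the parallel status output by eye gets tedious. The helper below is a rough sketch that lists only the hosts whose NodeManager does not report as running. It assumes pdsh's usual "host: output" line prefix and upstart-style status text containing start/running; the function name not_running_hosts and the <nodemanager-job> placeholder are hypothetical.

```shell
#!/usr/bin/env bash
# Read pdsh status output on stdin and print hosts that are NOT running.
# ASSUMPTION: each line looks like "<host>: <job> start/running, process <pid>"
# for a healthy service, and something else (e.g. stop/waiting) otherwise.
not_running_hosts() {
  awk -F': ' 'NF && !/start\/running/ { print $1 }'
}

# Usage on the master (not run here; substitute your NodeManager job name):
#   pdsh -g core,task 'sudo status <nodemanager-job>' | not_running_hosts
```

An empty result suggests every node reported a running NodeManager; yarn node -list on the master is a useful second check that the nodes have re-registered with the ResourceManager.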
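Pro tip: Restarting the ResourceManager can interrupt running applications unless ResourceManager recovery (yarn.resourcemanager.recovery.enabled) is turned on. Check for active jobs first with yarn application -list, and either let them finish or schedule the restart for a quiet window.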
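Before restarting, it can help to check whether any applications are still active. A small sketch of that check follows; it relies on YARN application IDs starting with application_, and the function name count_active_apps is hypothetical.

```shell
#!/usr/bin/env bash
# Count active applications from `yarn application -list` output on stdin.
# By default that command lists SUBMITTED, ACCEPTED, and RUNNING applications,
# one per line, each line beginning with an ID of the form application_<...>.
count_active_apps() {
  grep -c '^application_' || true   # grep exits 1 on zero matches; still prints 0
}

# Usage on the master (not run here):
#   active=$(yarn application -list 2>/dev/null | count_active_apps)
#   [ "$active" -eq 0 ] && echo "safe to restart" || echo "$active app(s) active"
```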
The question comes from Stack Exchange; it was asked by femibyte.




