Hadoop YARN Explained in Detail (Part 2)

Posted by 数栈君 on 2024-12-04 14:27

V. YARN Commands

yarn top

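Similar to the Linux top command: shows the resource usage of running applications.

yarn queue -status root.default

Shows the usage of the specified queue. Queues are covered later in this article.

yarn application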

-list

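# List YARN applications by state; use -appStates to select the states
# Application states: ALL, NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED
# Examples:
# List all running applications
yarn application -list -appStates RUNNING
# List all failed applications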
yarn application -list -appStates FAILED
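
When this listing is scripted, it can help to validate the state names before calling the CLI. The following is a minimal sketch under that assumption: `build_list_command` is a hypothetical helper (not part of YARN) that only assembles the argument list, using the state names given above and the comma-separated form that `-appStates` accepts.

```python
# Application states accepted by -appStates, as listed above.
VALID_STATES = {
    "ALL", "NEW", "NEW_SAVING", "SUBMITTED", "ACCEPTED",
    "RUNNING", "FINISHED", "FAILED", "KILLED",
}


def build_list_command(*states):
    """Return the argv for `yarn application -list -appStates ...`.

    Hypothetical helper: it only builds arguments and never calls YARN.
    """
    if not states:
        raise ValueError("at least one application state is required")
    normalized = [s.upper() for s in states]
    unknown = [s for s in normalized if s not in VALID_STATES]
    if unknown:
        raise ValueError(f"unknown application state(s): {unknown}")
    # Several states are passed as one comma-separated value.
    return ["yarn", "application", "-list", "-appStates", ",".join(normalized)]


print(" ".join(build_list_command("running", "accepted")))
```

The resulting list can be handed to `subprocess.run` on a machine where the `yarn` client is installed.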

-movetoqueue

# Move an application to the specified queue
yarn application -movetoqueue application_xxxxxx_xxx -queue root.small

-kill

# Kill the specified application
yarn application -kill application_xxxxxx_xxx

yarn container

-list

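# List the containers of a running application attempt
# (the argument is an application attempt ID; `yarn applicationattempt -list <application ID>` prints it)
yarn container -list appattempt_xxxxxxxxxx_xxx_xxxxxx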

-status

# Show the status of the specified container
yarn container -status container_xxxxx
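
YARN identifiers nest: an application attempt ID embeds the cluster timestamp and sequence number of its application ID and appends a six-digit attempt number. The sketch below derives one from the other. `attempt_id` is a hypothetical helper, and the sample ID is made up for illustration.

```python
import re

# application_<cluster timestamp>_<sequence number>
APP_ID = re.compile(r"^application_(\d+)_(\d+)$")


def attempt_id(application_id, attempt=1):
    """Build the application attempt ID for a given application ID.

    Hypothetical helper; the sequence number is kept as text so that its
    zero padding is preserved.
    """
    match = APP_ID.match(application_id)
    if match is None:
        raise ValueError(f"not an application ID: {application_id!r}")
    cluster_ts, seq = match.groups()
    return f"appattempt_{cluster_ts}_{seq}_{attempt:06d}"


# Illustrative ID only; real IDs come from `yarn application -list`.
print(attempt_id("application_1410450250506_0003"))
```

On a live cluster the authoritative values are the ones printed by `yarn applicationattempt -list <application ID>`; the helper only shows how the two formats relate.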

yarn jar

# Submit a job to YARN
yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar wordcount /input /output

yarn logs

# View the logs of an application that ran on YARN
yarn logs -applicationId application_xxxxxxxxxx_xxx

yarn node -all -list

Lists information about all nodes.

VI. The Three YARN Schedulers

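What is the Scheduler?

The Scheduler allocates the cluster's resources to the running applications, subject to constraints such as capacities and queues (for example, each queue is allotted a certain share of resources and may run at most a certain number of jobs).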

The three built-in YARN schedulers

1. FIFO Scheduler

As the figure shows, job2 can start only after job1 has finished completely.

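2. Capacity Scheduler

As the figure shows, part of the cluster is set aside for small jobs, so job2 is not blocked while job1 runs. Because that share is always reserved for other jobs, however, a single job cannot be given the whole cluster even when it is the only one running. The reserved share then sits idle, which wastes some resources.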

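3. Fair Scheduler

The Fair Scheduler aims to share resources fairly among all running applications. With it, no resources need to be reserved in advance, because the scheduler rebalances resources dynamically across all running jobs. When the first (large) job starts, it is the only job running and therefore gets all of the cluster's resources. When the second (small) job starts, it is given half of the cluster, so each job gets a fair share.

As the figure shows, it is as if several jobs were stitched together into one: resources are fully used, and small jobs are not starved by a large job that started earlier.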
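
The difference between FIFO and fair sharing can be illustrated with a toy simulation. This is a deliberately simplified model and not YARN's actual scheduling algorithm: time advances in ticks, the cluster offers a fixed number of resource units per tick, and the job names, arrival times, and sizes are made-up values.

```python
def simulate(jobs, capacity, policy):
    """Return {job: finish_tick} for a toy cluster.

    jobs maps a name to (arrival_tick, work_units); policy is "fifo" or "fair".
    """
    remaining = {name: work for name, (_, work) in jobs.items()}
    finished = {}
    tick = 0
    while len(finished) < len(jobs):
        # Jobs that have arrived and are not done yet, in arrival order.
        active = sorted(
            (n for n in jobs if jobs[n][0] <= tick and n not in finished),
            key=lambda n: jobs[n][0],
        )
        if policy == "fifo":
            # The oldest job takes the whole cluster.
            shares = {active[0]: capacity} if active else {}
        else:
            # Every active job gets an equal slice.
            shares = {n: capacity / len(active) for n in active}
        for name, share in shares.items():
            remaining[name] -= share
            if remaining[name] <= 0:
                finished[name] = tick + 1
        tick += 1
    return finished


# Hypothetical workload: a large job at tick 0, a small one at tick 2.
JOBS = {"job1": (0, 100), "job2": (2, 10)}
print("fifo:", simulate(JOBS, 10, "fifo"))
print("fair:", simulate(JOBS, 10, "fair"))
```

In this model job2 finishes at tick 11 under FIFO but at tick 4 under fair sharing, while job1 slips only from tick 10 to tick 11.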

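VII. YARN Queue Configuration

By default YARN uses the Capacity Scheduler with a single queue. Within a queue the default ordering policy is FIFO: resources are offered to applications in submission order, so one large job at the head of the queue can block the jobs behind it. As the workload grows, making full use of the cluster requires configuring multiple queues by hand.

1. Configuring queues

By default YARN has only the default queue. Here we add a queue named small.

Edit the configuration file $HADOOP_HOME/etc/hadoop/capacity-scheduler.xml: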

<configuration>
    <!-- No change needed -->
    <!-- Maximum number of applications (pending plus running) that the Capacity Scheduler accepts -->
    <property>
        <name>yarn.scheduler.capacity.maximum-applications</name>
        <value>10000</value>
        <description>
            Maximum number of applications that can be pending and running.
        </description>
    </property>

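    <!-- No change needed -->
    <!-- Maximum share of resources that ApplicationMaster processes (for example MRAppMaster) may occupy; lowering it limits how many applications can run concurrently -->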
    <property>
        <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
        <value>0.1</value>
        <description>
            Maximum percent of resources in the cluster which can be used to run
            application masters i.e. controls number of concurrent running
            applications.
        </description>
    </property>

    <!-- No change needed -->
    <!-- Strategy used to compare resources when allocating them to jobs -->
    <property>
        <name>yarn.scheduler.capacity.resource-calculator</name>
        <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
        <description>
            The ResourceCalculator implementation to be used to compare
            Resources in the scheduler.
            The default i.e. DefaultResourceCalculator only uses Memory while
            DominantResourceCalculator uses dominant-resource to compare
            multi-dimensional resources such as Memory, CPU etc.
        </description>
    </property>

    <!-- Change this -->
    <!-- Queues defined under root; here we add a queue named small -->
    <property>
        <name>yarn.scheduler.capacity.root.queues</name>
        <value>default,small</value>
        <description>
            The queues at the this level (root is the root queue).
        </description>
    </property>

    <!-- Change this -->
    <!-- Capacity of the default queue, as a percentage -->
    <property>
        <name>yarn.scheduler.capacity.root.default.capacity</name>
        <value>70</value>
        <description>Default queue target capacity.</description>
    </property>

    <!-- Add this -->
    <!-- Capacity of the new small queue, as a percentage -->
    <!-- The capacities of all queues under the same parent must add up to 100 -->
    <property>
        <name>yarn.scheduler.capacity.root.small.capacity</name>
        <value>30</value>
        <description>small queue target capacity.</description>
    </property>

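    <!-- No change needed -->
    <!-- Multiple of the default queue's capacity that a single user may use; 1 caps a user at the queue's configured capacity -->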
    <property>
        <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
        <value>1</value>
        <description>
            Default queue user limit a percentage from 0.0 to 1.0.
        </description>
    </property>

    <!-- Add this -->
    <!-- Multiple of the small queue's capacity that a single user may use; 1 caps a user at the queue's configured capacity -->
    <property>
        <name>yarn.scheduler.capacity.root.small.user-limit-factor</name>
        <value>1</value>
        <description>
            Default queue user limit a percentage from 0.0 to 1.0.
        </description>
    </property>

    <!-- No change needed -->
    <!-- Maximum capacity the default queue may grow to, as a percentage -->
    <property>
        <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
        <value>100</value>
        <description>
            The maximum capacity of the default queue. 
        </description>
    </property>

    <!-- Add this -->
    <!-- Maximum capacity the small queue may grow to, as a percentage -->
    <property>
        <name>yarn.scheduler.capacity.root.small.maximum-capacity</name>
        <value>100</value>
        <description>
            The maximum capacity of the small queue.
        </description>
    </property>

    <!-- No change needed -->
    <!-- State of the default queue -->
    <property>
        <name>yarn.scheduler.capacity.root.default.state</name>
        <value>RUNNING</value>
        <description>
            The state of the default queue. State can be one of RUNNING or STOPPED.
        </description>
    </property>

    <!-- Add this -->
    <!-- State of the small queue -->
    <property>
        <name>yarn.scheduler.capacity.root.small.state</name>
        <value>RUNNING</value>
        <description>
            The state of the small queue. State can be one of RUNNING or STOPPED.
        </description>
    </property>

    <!-- No change needed -->
    <!-- Users allowed to submit applications to the queue -->
    <property>
        <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
        <value>*</value>
        <description>
            The ACL of who can submit jobs to the default queue.
        </description>
    </property>
    <!-- Add this -->
    <property>
        <name>yarn.scheduler.capacity.root.small.acl_submit_applications</name>
        <value>*</value>
        <description>
            The ACL of who can submit jobs to the small queue.
        </description>
    </property>

    <!-- No change needed -->
    <property>
        <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
        <value>*</value>
        <description>
            The ACL of who can administer jobs on the default queue.
        </description>
    </property>
    <!-- Add this -->
    <property>
        <name>yarn.scheduler.capacity.root.small.acl_administer_queue</name>
        <value>*</value>
        <description>
            The ACL of who can administer jobs on the small queue.
        </description>
    </property>


    <!-- No change needed -->
    <property>
        <name>yarn.scheduler.capacity.node-locality-delay</name>
        <value>40</value>
        <description>
            Number of missed scheduling opportunities after which the CapacityScheduler 
            attempts to schedule rack-local containers. 
            Typically this should be set to number of nodes in the cluster, By default is setting 
            approximately number of nodes in one rack which is 40.
        </description>
    </property>
    <!-- No change needed -->
    <property>
        <name>yarn.scheduler.capacity.queue-mappings</name>
        <value></value>
        <description>
            A list of mappings that will be used to assign jobs to queues
            The syntax for this list is [u|g]:[name]:[queue_name][,next mapping]*
            Typically this list will be used to map users to queues,
            for example, u:%user:%user maps all users to queues with the same name
            as the user.
        </description>
    </property>
    <!-- No change needed -->
    <property>
        <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
        <value>false</value>
        <description>
            If a queue mapping is present, will it override the value specified
            by the user? This can be used by administrators to place jobs in queues
            that are different than the one specified by the user.
            The default is false.
        </description>
    </property>
</configuration>
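
Because the capacities of the queues under root have to add up to 100, a quick check before distributing the file can catch arithmetic slips. The following is a minimal sketch, assuming the file has the layout shown above. `load_properties` and `child_capacity_sum` are hypothetical helpers, and `SAMPLE` is a trimmed stand-in for the real capacity-scheduler.xml.

```python
import xml.etree.ElementTree as ET

PREFIX = "yarn.scheduler.capacity."


def load_properties(xml_text):
    """Parse Hadoop-style configuration XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): (prop.findtext("value") or "")
        for prop in root.iter("property")
    }


def child_capacity_sum(props, parent="root"):
    """Add up the capacities of the queues listed directly under parent."""
    queues = props[f"{PREFIX}{parent}.queues"].split(",")
    return sum(
        float(props[f"{PREFIX}{parent}.{queue.strip()}.capacity"])
        for queue in queues
    )


# Trimmed stand-in for capacity-scheduler.xml with the values used above.
SAMPLE = """
<configuration>
    <property>
        <name>yarn.scheduler.capacity.root.queues</name>
        <value>default,small</value>
    </property>
    <property>
        <name>yarn.scheduler.capacity.root.default.capacity</name>
        <value>70</value>
    </property>
    <property>
        <name>yarn.scheduler.capacity.root.small.capacity</name>
        <value>30</value>
    </property>
</configuration>
"""

total = child_capacity_sum(load_properties(SAMPLE))
print("capacity sum under root:", total)
```

To check a real file, read its contents into a string and pass that to `load_properties` in place of `SAMPLE`.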

Distribute the file to the hadoopnode1 and hadoopnode2 nodes:

scp capacity-scheduler.xml hadoopnode1:$PWD
scp capacity-scheduler.xml hadoopnode2:$PWD

Restart the YARN service, or refresh the queue configuration without restarting:

stop-yarn.sh
start-yarn.sh
# or
yarn rmadmin -refreshQueues

Refresh the YARN web UI and the new queue appears.

Specifying the submission queue

-Dmapreduce.job.queuename=small

For example:

hadoop jar /usr/local/hadoop-3.3.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar wordcount -Dmapreduce.job.queuename=small /input /output22

hadoop jar /usr/local/hadoop-3.3.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar pi -Dmapreduce.job.queuename=small 50 50

If -Dmapreduce.job.queuename is not given, the job goes to the default queue.
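
In both commands the -D option sits after the program name and before the program's own arguments, which is where Hadoop's generic options belong. A small sketch of assembling such a command in that order follows. `build_submit_command` is a hypothetical helper that only builds the argument list and does not submit anything.

```python
def build_submit_command(jar, program, args, queue=None):
    """Return the argv for `hadoop jar <jar> <program> [-D...] <args>`.

    Hypothetical helper: the generic -D option is placed right after the
    program name, ahead of the program's own arguments.
    """
    cmd = ["hadoop", "jar", jar, program]
    if queue is not None:
        cmd.append(f"-Dmapreduce.job.queuename={queue}")
    cmd.extend(str(arg) for arg in args)
    return cmd


EXAMPLES_JAR = (
    "/usr/local/hadoop-3.3.1/share/hadoop/mapreduce/"
    "hadoop-mapreduce-examples-3.3.1.jar"
)
print(" ".join(build_submit_command(EXAMPLES_JAR, "pi", [50, 50], queue="small")))
```

Leaving `queue` unset produces a command without the option, which submits to the default queue.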

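2. Setting the default queue

YARN submits jobs to the default queue unless told otherwise. To use another queue, either pass -Dmapreduce.job.queuename at submission time or change the default submission queue in the configuration.

<!-- Default submission queue -->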
<property>
    <name>mapreduce.job.queuename</name>
    <value>small</value>
</property>

Add this property to mapred-site.xml. No restart is needed: jobs submitted afterwards automatically run in the specified queue.
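
The queue a job ends up in therefore depends on two places: the command line and mapred-site.xml. The sketch below mirrors that lookup order under the assumption that a -D option on the command line takes precedence over the file and that the property is not marked final. `effective_queue` is a hypothetical helper, and `MAPRED_SITE` is a stand-in for the real file.

```python
import xml.etree.ElementTree as ET

KEY = "mapreduce.job.queuename"


def effective_queue(mapred_site_xml, cli_args):
    """Guess the queue a MapReduce job would be submitted to.

    Hypothetical helper. Assumed order: -D on the command line, then
    mapred-site.xml, then the built-in queue named "default".
    """
    for arg in cli_args:
        if arg.startswith(f"-D{KEY}="):
            return arg.split("=", 1)[1]
    root = ET.fromstring(mapred_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name") == KEY:
            return (prop.findtext("value") or "").strip()
    return "default"


# Stand-in for mapred-site.xml containing the property shown above.
MAPRED_SITE = """
<configuration>
    <property>
        <name>mapreduce.job.queuename</name>
        <value>small</value>
    </property>
</configuration>
"""

print(effective_queue(MAPRED_SITE, ["pi", "50", "50"]))
```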

VIII. YARN Node Labels

1. Introduction to Node Labels

The official documentation introduces Node Labels as follows:

Node Label is a way to group nodes with similar characteristics and applications can specify where to run.

2. Enabling node labels

Edit yarn-site.xml and add the following configuration:

<!-- Enable node labels -->
<property>
    <name>yarn.node-labels.enabled</name>
    <value>true</value>
</property>

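<!-- Where node labels are stored; this can be HDFS or the local file system -->
<!-- For the local file system, use a path such as file:///home/yarn/node-label -->
<!-- In either case, make sure the ResourceManager has permission to access the path -->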
<property>
    <name>yarn.node-labels.fs-store.root-dir</name>
    <value>hdfs://hadoopnode1:9820/tmp/yarn/node-labels</value>
</property>

<!-- The default is fine; this option may also be left out -->
<property>
    <name>yarn.node-labels.configuration-type</name>
    <value>centralized</value>
</property>

Distribute the file to every node and restart the YARN service.

3. Managing labels

1. Adding labels

yarn rmadmin -addToClusterNodeLabels "label_1"
yarn rmadmin -addToClusterNodeLabels "label_2,label_3"

2. Listing labels

yarn cluster --list-node-labels

3. Removing labels

yarn rmadmin -removeFromClusterNodeLabels label_1

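4. Labeling nodes

# Bind one NodeManager to a label (replace qianfeng02 with your own NodeManager host name)
yarn rmadmin -replaceLabelsOnNode "qianfeng02=label_2"
# Bind several NodeManagers to labels, separating the pairs with spaces
yarn rmadmin -replaceLabelsOnNode "qianfeng02=label_2 qianfeng03=label_2"

# A label can be bound to several NodeManagers, but a NodeManager can carry only one label.
# In the example above, label_2 is bound to the NodeManagers on qianfeng02 and qianfeng03.

# Once the binding is done, it can be checked in the web UI.
# The Node Labels entry on the left of the web UI shows every label and the nodes that carry it.
# Note that a node without any label belongs to the default partition, <DEFAULT_PARTITION>.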
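
The mapping string passed to -replaceLabelsOnNode is a space-separated list of node=label pairs. The sketch below parses such a string and groups the nodes by label, which can be handy for reviewing a binding before applying it. Both functions are hypothetical helpers, and rejecting comma-separated labels reflects the one-label-per-node rule stated above.

```python
def parse_node_labels(mapping):
    """Parse 'node1=label_a node2=label_b' into {node: label}."""
    result = {}
    for entry in mapping.split():
        node, sep, label = entry.partition("=")
        if not sep or not node:
            raise ValueError(f"expected node=label, got {entry!r}")
        if "," in label:
            raise ValueError(f"{node}: a node can carry only one label")
        result[node] = label
    return result


def nodes_by_label(node_labels):
    """Invert {node: label} into {label: [nodes]}."""
    grouped = {}
    for node, label in node_labels.items():
        grouped.setdefault(label, []).append(node)
    return grouped


bindings = parse_node_labels("qianfeng02=label_2 qianfeng03=label_2")
print(nodes_by_label(bindings))
```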

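5. Binding labels to queues

This is done by editing capacity-scheduler.xml:

<!-- Earlier we added a queue, so there are now two queues: default and small -->

<!-- Labels a queue may use; * is a wildcard meaning every label -->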
<property>
    <name>yarn.scheduler.capacity.root.default.accessible-node-labels</name>
    <value>*</value>
</property>

<!-- Allow the small queue to use the resources of nodes labeled label_3 -->
<property>
    <name>yarn.scheduler.capacity.root.small.accessible-node-labels</name>
    <value>label_3</value>
</property>

<!-- Set the default queue's capacity on label_2 nodes to 60% -->
<property>
    <name>yarn.scheduler.capacity.root.default.accessible-node-labels.label_2.capacity</name>
    <value>60</value>
</property>

<!-- Set the small queue's capacity on label_3 nodes to 80% -->
<property>
    <name>yarn.scheduler.capacity.root.small.accessible-node-labels.label_3.capacity</name>
    <value>80</value>
</property>

<!-- For the default queue, use label_3 when an application does not request a label explicitly -->
<property>
    <name>yarn.scheduler.capacity.root.default.default-node-label-expression</name>
    <value>label_3</value>
</property>

No restart is needed after the change; refreshing the queues is enough: yarn rmadmin -refreshQueues
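
Whether a queue may use a given label follows from its accessible-node-labels value. The check can be sketched as follows. `can_access` is a hypothetical helper, and as a simplification it treats a queue with no explicit setting as having no labels, whereas a real cluster may inherit the setting from the parent queue.

```python
PREFIX = "yarn.scheduler.capacity."


def can_access(props, queue_path, label):
    """Return True if the queue's accessible-node-labels covers the label.

    Hypothetical helper; inheritance from parent queues is not modelled.
    """
    raw = props.get(f"{PREFIX}{queue_path}.accessible-node-labels", "")
    allowed = {item.strip() for item in raw.split(",") if item.strip()}
    return "*" in allowed or label in allowed


# The two accessible-node-labels settings from the configuration above.
PROPS = {
    "yarn.scheduler.capacity.root.default.accessible-node-labels": "*",
    "yarn.scheduler.capacity.root.small.accessible-node-labels": "label_3",
}

for queue in ("root.default", "root.small"):
    print(queue, "can use label_2:", can_access(PROPS, queue, "label_2"))
```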

6. Testing

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar pi -Dmapreduce.job.queuename=small 10 10

————————————————

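This article is a repost and the copyright belongs to the original author. In case of infringement, please contact us to have it removed.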
