This article was last updated 952 days ago; the information in it may have evolved or changed since then.
Download from the official website
Node planning
| | hadoop-master | hadoop-follower1 | hadoop-follower2 |
|---|---|---|---|
| HDFS | NameNode | | Secondary NameNode |
| | DataNode | DataNode | |
| YARN | | ResourceManager | |
| | NodeManager | NodeManager | |
Prerequisites
Install Java
yum install -y java-1.8.0-openjdk.x86_64
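The hadoop-env.sh step later in this article needs the JDK path for JAVA_HOME. A minimal sketch of one way to derive it, assuming the `java` on PATH is the one just installed and that GNU `readlink -f` is available; the function names `java_home_for` and `detect_java_home` are hypothetical helpers, not part of Hadoop or the JDK:

```shell
#!/usr/bin/env bash
# java_home_for JAVA_BIN: strip the trailing /bin/java from a resolved path.
java_home_for() {
  dirname "$(dirname "$1")"
}

# detect_java_home: follow the symlinks behind the java found on PATH
# and print the directory that contains its bin/ folder.
detect_java_home() {
  local java_bin
  java_bin="$(command -v java)" || return 1
  java_home_for "$(readlink -f "$java_bin")"
}
```

Running `detect_java_home` on each node prints a candidate value for JAVA_HOME; it is worth comparing it with the path actually written into hadoop-env.sh, since the exact build directory differs between package versions.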
Set the hostname
hostnamectl set-hostname hadoop-master
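Configure hostname mapping
vi /etc/hosts
# Note: the 127.0.0.1 mapping must be commented out here; otherwise the nodes cannot connect to each other after startup.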
# 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.232.70 hadoop-master
192.168.232.71 hadoop-follower1
192.168.232.72 hadoop-follower2
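The mapping above has to be identical on every node, so a small check can catch a missing or loopback entry before Hadoop is started. The following is a sketch only; `check_hosts` is a hypothetical helper that reads a hosts file and prints every hostname that has no uncommented, non-loopback entry:

```shell
#!/usr/bin/env bash
# check_hosts FILE HOST...: print each HOST without a usable entry in FILE.
check_hosts() {
  local file="$1"; shift
  local host
  for host in "$@"; do
    # A usable entry is an uncommented line whose address is not loopback
    # and whose list of names contains the host.
    if ! awk -v h="$host" '
        /^[[:space:]]*#/ { next }               # skip commented-out lines
        $1 ~ /^127\./ || $1 == "::1" { next }   # skip loopback addresses
        { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
        END { exit found ? 0 : 1 }
      ' "$file"; then
      echo "$host"
    fi
  done
}
```

For example, `check_hosts /etc/hosts hadoop-master hadoop-follower1 hadoop-follower2` prints nothing when all three names are mapped as shown above.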
Disable the firewall
# Stop the firewall
systemctl stop firewalld
# Disable start on boot
systemctl disable firewalld
Distributed installation
- Move the hadoop-3.3.4.tar.gz file to the /home directory on the Linux server and extract it:
tar -zxvf hadoop-3.3.4.tar.gz
- Edit the configuration files under etc/hadoop:
vi core-site.xml (the core Hadoop configuration file)
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- Address of the HDFS NameNode, the HDFS master node, which mainly stores HDFS metadata -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-master:9000</value>
  </property>
</configuration>
vi hdfs-site.xml (the HDFS configuration file)
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- Number of HDFS replicas; the default is 3 -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- HTTP address of the Secondary NameNode, a helper to the NameNode that periodically checkpoints the NameNode's data -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop-follower2:9868</value>
  </property>
</configuration>
vi mapred-site.xml (the MapReduce configuration file)
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- Run MapReduce on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
vi yarn-site.xml (the YARN configuration file)
<?xml version="1.0"?>
<configuration>
  <!-- Auxiliary service that the NodeManagers run for the MapReduce shuffle -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Host of the YARN ResourceManager, the YARN master node, which manages and allocates compute resources -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop-follower1</value>
  </property>
</configuration>
vi hadoop-env.sh (the Hadoop environment variable configuration file)
# Path to the JDK
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.362.b08-1.el7_9.x86_64/jre
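Because the same configuration files have to be present on all three nodes, it can help to read a property back from a file and compare it across machines. The sketch below is a plain-text lookup, not an XML parser; `get_prop` is a hypothetical helper and assumes each `<name>` is directly followed by its `<value>`, as in the files above:

```shell
#!/usr/bin/env bash
# get_prop FILE NAME: print the <value> that follows <name>NAME</name> in FILE.
get_prop() {
  # Join the file into one line so name and value can be matched together,
  # keep only the matching name/value pair, then strip the surrounding tags.
  tr -d '\n' < "$1" \
    | grep -o "<name>$2</name>[[:space:]]*<value>[^<]*</value>" \
    | sed -e 's/.*<value>//' -e 's/<\/value>//'
}
```

For example, `get_prop etc/hadoop/core-site.xml fs.defaultFS` would be expected to print `hdfs://hadoop-master:9000` with the configuration shown here. Once Hadoop is on the node, `bin/hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves.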
Start the services
# Start the daemons: NameNode, Secondary NameNode, DataNode, ResourceManager, NodeManager
sbin/start-all.sh
# Stop the daemons: NameNode, Secondary NameNode, DataNode, ResourceManager, NodeManager
sbin/stop-all.sh
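After starting the daemons, one common way to see which ones are running on a node is `jps`. The sketch below compares its output with a list of expected daemon names; `missing_daemons` is a hypothetical helper, and note that `jps` ships with the JDK development package, so it may need to be installed separately from the runtime package used above:

```shell
#!/usr/bin/env bash
# missing_daemons NAME...: read `jps` output on stdin ("<pid> <name>" lines)
# and print every expected NAME that does not appear in it.
missing_daemons() {
  local jps_output daemon
  jps_output="$(cat)"
  for daemon in "$@"; do
    # Compare against the second column only, as a whole-line fixed string.
    if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qxF -- "$daemon"; then
      echo "$daemon"
    fi
  done
}
```

With the configuration in this article, `jps | missing_daemons NameNode` on hadoop-master, `jps | missing_daemons ResourceManager` on hadoop-follower1, and `jps | missing_daemons SecondaryNameNode` on hadoop-follower2 should each print nothing once startup has succeeded.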
HDFS distributed file system
# Start the NameNode
bin/hdfs --daemon start namenode
# Stop the NameNode
bin/hdfs --daemon stop namenode
# Start the Secondary NameNode
bin/hdfs --daemon start secondarynamenode
# Stop the Secondary NameNode
bin/hdfs --daemon stop secondarynamenode
# Start a DataNode
bin/hdfs --daemon start datanode
# Stop a DataNode
bin/hdfs --daemon stop datanode
# Start the daemons: NameNode, Secondary NameNode, DataNode
sbin/start-dfs.sh
# Stop the daemons: NameNode, Secondary NameNode, DataNode
sbin/stop-dfs.sh
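Once HDFS is up, `bin/hdfs dfsadmin -report` summarizes the cluster as the NameNode sees it. As a rough sketch, the helper below pulls the live DataNode count out of that report; `live_datanodes` is a hypothetical name, and the `Live datanodes (N):` line format is an assumption about the report layout that should be checked against the actual output:

```shell
#!/usr/bin/env bash
# live_datanodes: read `hdfs dfsadmin -report` output on stdin and print
# the number from its "Live datanodes (N):" line, if that line is present.
live_datanodes() {
  sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}
```

For example, `bin/hdfs dfsadmin -report | live_datanodes` prints the count, which can then be compared with the number of DataNodes planned for the cluster.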
YARN resource scheduler
# Start the ResourceManager
bin/yarn --daemon start resourcemanager
# Stop the ResourceManager
bin/yarn --daemon stop resourcemanager
# Start a NodeManager
bin/yarn --daemon start nodemanager
# Stop a NodeManager
bin/yarn --daemon stop nodemanager
# Start the daemons: ResourceManager, NodeManager
sbin/start-yarn.sh
# Stop the daemons: ResourceManager, NodeManager
sbin/stop-yarn.sh
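To confirm that the NodeManagers have registered with the ResourceManager, `bin/yarn node -list` can be used. The sketch below extracts the node count from its output; `total_nodes` is a hypothetical helper, and the `Total Nodes:N` line it looks for is an assumption about the command's output format:

```shell
#!/usr/bin/env bash
# total_nodes: read `yarn node -list` output on stdin and print the number
# from its "Total Nodes:N" line, if that line is present.
total_nodes() {
  sed -n 's/^Total Nodes:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}
```

For example, `bin/yarn node -list | total_nodes` prints how many NodeManagers are currently registered, which should match the number of NodeManagers that were started.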
