您好,登錄后才能下訂單哦!
【1】搭建HA高可用hadoop-2.3(規劃+環境準備)
【2】搭建HA高可用hadoop-2.3(安裝zookeeper)
【3】搭建HA高可用hadoop-2.3(部署配置hadoop--cdh6.1.0)
【4】搭建HA高可用hadoop-2.3(部署配置HBase)
安裝部署hadoop
(1)安裝hadoop
master1、master2、slave1、slave2、slave3
#cd /opt/ #tar xf hadoop-2.3.0-cdh6.1.0.tar.gz #ln -s ln -s hadoop-2.3.0-cdh6.1.0 hadoop
(2)添加hadoop環境變量
master1、master2、slave1、slave2、slave3
#cat >> /etc/profile <<EOF export HADOOP_HOME=/opt/hadoop export PATH=$PATH:$HADOOP_HOME/bin EOF #source /etc/profile
(3)配置hadoop
主要配置文件 (hadoop-2.3.0-cdh6.1.0 /etc/hadoop/) | 格式 | 作用 |
hadoop-env.sh | bash腳本 | hadoop需要的環境變量 |
core-site.xml | xml | hadoop的core的配置項 |
hdfs-site.xml | xml | hdfs的守護進程配置,包括namenode、datanode |
slaves | 純文本 | datanode的節點列表(每行一個) |
mapred-env.sh | bash腳本 | mapreduce需要的環境變量 |
mapre-site.xml | xml | mapreduce的守護進程配置 |
yarn-env.sh | bash腳本 | yarn需要的環境變量 |
yarn-site.xml | xml | yarn的配置項 |
以下1-8的配置,所有機器都相同,可先配置一臺,將配置統一copy到另外幾臺機器。
master1、master2、slave1、slave2、slave3
1:配置hadoop-env.sh
cat >> hadoop-env.sh <<EOF export JAVA_HOME=/usr/java/jdk1.8.0_60 export HADOOP_HOME=/opt/hadoop-2.3.0-cdh6.1.0 EOF
2:配置core-site.xml
#mkdir -p /data/hadoop/tmp #vim core-site.xml <?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <configuration> <property> <!--填寫hdfs集群名,因為是HA,兩個namenode--> <name>fs.defaultFS</name> <value>hdfs://mycluster</value> </property> <property> <!-- hadoop很多路徑都依賴他,namenode節點該目錄不可以刪除,否則要重新格式化--> <name>hadoop.tmp.dir</name> <value>/data/hadoop/tmp</value> </property> <property> <!--zookeeper集群的地址--> <name>ha.zookeeper.quorum</name> <value>master1:2181,master2:2181,slave1:2181,slave2:2181,slave3:2181</value> </property> </configuration>
3:配置hdfs-site.xml
#mkdir -p /data/hadoop/dfs/{namenode,datanode} #mkdir -p /data/hadoop/ha/journal #vim hdfs-site.xml <?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!--hdfs-site.xml--> <configuration> <property> <!--設置為true,否則一些命令無法使用如:webhdfs的LISTSTATUS--> <name>dfs.webhdfs.enabled</name> <value>true</value> </property> <property> <!--數據三副本--> <name>dfs.replication</name> <value>3</value> </property> <property> <!--namenode的數據目錄,存儲集群元數據--> <name>dfs.namenode.name.dir</name> <value>file:/data/hadoop/dfs/namenode</value> </property> <property> <!--datenode的數據目錄--> <name>dfs.datanode.data.dir</name> <value>file:/data/hadoop/dfs/datanode</value> </property> <property> <!--可選,關閉權限帶來一些不必要的麻煩--> <name>dfs.permissions</name> <value>false</value> </property> <property> <!--可選,關閉權限帶來一些不必要的麻煩--> <name>dfs.permissions.enabled</name> <value>false</value> </property> <!--HA配置--> <property> <!--設置集群的邏輯名--> <name>dfs.nameservices</name> <value>mycluster</value> </property> <property> <!--hdfs集群中的namenode節點邏輯名--> <name>dfs.ha.namenodes.mycluster</name> <value>namenode1,namenode2</value> </property> <property> <!--hdfs namenode邏輯名中RPC配置,rpc簡單理解為序列化文件上傳輸出文件要用到--> <name>dfs.namenode.rpc-address.mycluster.namenode1</name> <value>master1:9000</value> </property> <property> <!--hdfs namenode邏輯名中RPC配置,rpc簡單理解為序列化文件上傳輸出文件要用到--> <name>dfs.namenode.rpc-address.mycluster.namenode2</name> <value>master2:9000</value> </property> <property> <!--配置hadoop頁面訪問端口--> <name>dfs.namenode.http-address.mycluster.namenode1</name> <value>master1:50070</value> </property> <property> <name>dfs.namenode.http-address.mycluster.namenode2</name> <value>master2:50070</value> </property> <property> <!--建立與namenode的通信--> <name>dfs.namenode.servicerpc-address.mycluster.namenode1</name> <value>master1:53310</value> </property> <property> <name>dfs.namenode.servicerpc-address.mycluster.namenode2</name> <value>master2:53310</value> </property> <property> <!--journalnode 共享文件集群--> <name>dfs.namenode.shared.edits.dir</name> <value>qjournal://master1:8485;master2:8485;slave1:8485;slave2:8485;slave3:8485/mycluster</value> </property> <property> <!--journalnode對namenode的進行共享設置--> <name>dfs.journalnode.edits.dir</name> <value>/data/hadoop/ha/journal</value> </property> <property> <!--設置故障處理類--> <name>dfs.client.failover.proxy.provider.mycluster</name> <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value> </property> <property> <!--開啟自動切換,namenode1 stanby后nn2或active--> <name>dfs.ha.automatic-failover.enabled</name> <value>true</value> </property> <property> <!--zookeeper集群的地址--> <name>ha.zookeeper.quorum</name> <value>master1:2181,master2:2181,slave1:2181,slave2:2181,slave3:2181</value> </property> <property> <!--使用ssh方式進行故障切換--> <name>dfs.ha.fencing.methods</name> <value>sshfence</value> </property> <property> <!--ssh通信密碼通信位置--> <name>dfs.ha.fencing.ssh.private-key-files</name> <value>/root/.ssh/id_rsa</value> </property> </configuration>
4:配置mapred-env.sh
cat >> mapred-env.sh <<EOF #heqinqin configure export JAVA_HOME=/usr/java/jdk1.8.0_60 EOF
5:配置mapred-site.xml
<?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <configuration> <name>mapreduce.framework.name</name> <value>yarn<value> </configuration>
6:配置yarn-env.sh
cat >> yarn-env.sh <<EOF #heqinqin configure export JAVA_HOME=/usr/java/jdk1.8.0_60 EOF
7:配置yarn-site.xml
#mkdir -p /data/hadoop/yarn/local #mkdir -p /data/hadoop/logs #chown -R hadoop /data/hadoop #vim yarn-site.xml <?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!--####################yarn-site.xml#########################--> <configuration> <property> <!--rm失聯后重新鏈接的時間--> <name>yarn.resourcemanager.connect.retry-interval.ms</name> <value>2000</value> </property> <property> <!--開啟resource manager HA,默認為false--> <name>yarn.resourcemanager.ha.enabled</name> <value>true</value> </property> <property> <!--開啟故障自動切換--> <name>yarn.resourcemanager.ha.automatic-failover.enabled</name> <value>true</value> </property> <property> <!--配置resource manager --> <name>yarn.resourcemanager.ha.rm-ids</name> <value>rm1,rm2</value> </property> <property> <name>yarn.resourcemanager.ha.id</name> <value>rm1</value> <description>If we want to launch more than one RM in single node, we need this configuration</description> </property> <property> <!--開啟自動恢復功能--> <name>yarn.resourcemanager.recovery.enabled</name> <value>true</value> </property> <property> <!--配置與zookeeper的連接地址--> <name>yarn.resourcemanager.zk-state-store.address</name> <value>localhost:2181</value> </property> <property> <name>yarn.resourcemanager.store.class</name> <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value> </property> <property> <name>yarn.resourcemanager.zk-address</name> <value>localhost:2181</value> </property> <property> <name>yarn.resourcemanager.cluster-id</name> <value>yarncluster</value> </property> <property> <!--schelduler失聯等待連接時間--> <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name> <value>5000</value> </property> <!--配置resourcemanager--> <!--配置rm1--> <property> <!--配置應用管理端口--> <name>yarn.resourcemanager.address.rm1</name> <value>master1:8032</value> </property> <property> <!--scheduler調度器組建的ipc端口--> <name>yarn.resourcemanager.scheduler.address.rm1</name> <value>master1:8030</value> </property> <property> <!--http服務端口--> <name>yarn.resourcemanager.webapp.address.rm1</name> <value>master1:8088</value> </property> <property> <!--IPC端口--> <name>yarn.resourcemanager.resource-tracker.address.rm1</name> <value>master1:8031</value> </property> <property> <!--IPC端口--> <name>yarn.resourcemanager.admin.address.rm1</name> <value>master1:8033</value> </property> <property> <name>yarn.resourcemanager.ha.admin.address.rm1</name> <value>master1:8035</value> </property> <!--配置rm2--> <property> <!--application 管理端口--> <name>yarn.resourcemanager.address.rm2</name> <value>master2:8032</value> </property> <property> <!--scheduler調度器端口--> <name>yarn.resourcemanager.scheduler.address.rm2</name> <value>master2:8030</value> </property> <property> <!--http服務端口--> <name>yarn.resourcemanager.webapp.address.rm2</name> <value>master2:8088</value> </property> <property> <!--ipc端口--> <name>yarn.resourcemanager.resource-tracker.address.rm2</name> <value>master2:8031</value> </property> <property> <!--ipc端口--> <name>yarn.resourcemanager.admin.address.rm2</name> <value>master2:8033</value> </property> <property> <name>yarn.resourcemanager.ha.admin.address.rm2</name> <value>master2:8035</value> </property> <!--配置nodemanager--> <property> <!--配置localizer ipc端口--> <description>Address where the localizer IPC is.</description> <name>yarn.nodemanager.localizer.address</name> <value>0.0.0.0:8040</value> </property> <property> <!--nodemanager http訪問端口--> <description>NM Webapp address.</description> <name>yarn.nodemanager.webapp.address</name> <value>0.0.0.0:8042</value> </property> <property> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> </property> <property> <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name> <value>org.apache.hadoop.mapred.ShuffleHandler</value> </property> <property> <name>yarn.nodemanager.local-dirs</name> <value>/data/hadoop/yarn/local</value> </property> <property> <name>yarn.nodemanager.log-dirs</name> <value>/data/hadoop/logs</value> </property> <property> <name>mapreduce.shuffle.port</name> <value>8050</value> </property> <!--故障處理類--> <property> <name>yarn.client.failover-proxy-provider</name> <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value> </property> </configuration>
8:配置slaves
cat >> slaves <<EOF slave1 slave2 slave3 EOF
配置完畢
啟動集群
(1)格式化命名空間
master1
#/opt/hadoop/bin/hdfs zkfc -formatZK
(2)啟動journalnode
master1、master2、slave1、slave2、slave3 (集群內隨意算則奇數臺機器作為journalnode,三臺也可以)
#/opt/hadoop/sbin/hadoop-daemon.sh start journalnode
(3)master1節點格式化,并啟動namenode
master1
格式化namenode的目錄
#/opt/hadoop/bin/hadoop namenode -format mycluster
啟動namenode
#/opt/hadoop/sbin/hadoop-daemon.sh start namenode
(4)master2節點同步master1的格式化目錄,并啟動namenode
master2
從master1將格式化的目錄同步過來
#/opt/hadoop/bin/hdfs namenode -bootstrapStandby
啟動namenode
#/opt/hadoop/sbin/hadoop-daemon.sh start namenode
(5)master節點啟動zkfs
master1、master2
#/opt/hadoop/sbin/hadoop-daemon.sh start zkfc
(6)slave節點啟動datanode
slave1、slave2、slave3
#/opt/hadoop/sbin/hadoop-daemon.sh start datanode
(7)master節點啟動yarn
master1
#/opt/hadoop/sbin/start-yarn.sh
(8)master節點啟動historyserver
master1
./mr-jobhistory-daemon.sh start historyserver
集群已啟動。在各服務器執行jps查看,兩個master上各一個namenode,形成namenode高可用,實現故障自動切換。
【1】搭建HA高可用hadoop-2.3(規劃+環境準備)
【2】搭建HA高可用hadoop-2.3(安裝zookeeper)
【3】搭建HA高可用hadoop-2.3(部署配置hadoop--cdh6.1.0)
【4】搭建HA高可用hadoop-2.3(部署配置HBase)
免責聲明:本站發布的內容(圖片、視頻和文字)以原創、轉載和分享為主,文章觀點不代表本網站立場,如果涉及侵權請聯系站長郵箱:is@yisu.com進行舉報,并提供相關證據,一經查實,將立刻刪除涉嫌侵權內容。