docker-hadoopインストール手順

 

1. pull docker image
sudo docker pull kiwenlau/hadoop:1.0
2. clone github repository
git clone https://github.com/kiwenlau/hadoop-cluster-docker
3. create hadoop network
sudo docker network create --driver=bridge hadoop
4. start container and update hadoop-master,hadoop-slave1,hadoop-slave2's /etc/hosts
vi /etc/hosts
hadoop-master www.xxx.yyy.aaa
hadoop-slave1 www.xxx.yyy.bbb
hadoop-slave2 www.xxx.yyy.ccc

cd hadoop-cluster-docker sudo ./start-container.sh

output:

start hadoop-master container...
start hadoop-slave1 container...
start hadoop-slave2 container...
root@hadoop-master:~# 
  • start 3 containers with 1 master and 2 slaves
  • you will get into the /root directory of hadoop-master container
5. start hadoop
./start-hadoop.sh
6. run wordcount
./run-wordcount.sh

output

input file1.txt:
Hello Hadoop

input file2.txt:
Hello Docker

wordcount output:
Docker    1
Hadoop    1
Hello    2
 rebuild docker image
sudo ./resize-cluster.sh 4
  • specify parameter > 1: 2, 3..
  • this script just rebuild hadoop image with different slaves file, which pecifies the name of all slave nodes
start container
sudo ./start-container.sh 4