
Commit

[#526] fix(CI): Expose the HDFS DataNode data transfer port in the gravitino-ci-hive Docker (#527)

### What changes were proposed in this pull request?
- Map the container hostname to `127.0.0.1` instead of the IP automatically assigned by Docker
- Expose the DataNode data transfer port (`50010`) for external access
- Additionally, fix a bug in how the image tag is applied in `build-docker.sh`

### Why are the changes needed?
The Hive integration tests (IT) need to access HDFS from outside the container.

Fix: #526 

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Tested locally; subsequently verified by the Hive IT.
mchades authored Oct 17, 2023
1 parent f463883 commit 4977c0e
Showing 5 changed files with 26 additions and 5 deletions.
2 changes: 1 addition & 1 deletion dev/docker/hive/Dockerfile
@@ -155,7 +155,7 @@ RUN rm -rf /tmp/packages

################################################################################
# expose port
-EXPOSE 22 3306 8088 9000 9083 10000 10002 50070 50075
+EXPOSE 22 3306 8088 9000 9083 10000 10002 50070 50075 50010

################################################################################
# create startup script and set ENTRYPOINT
10 changes: 8 additions & 2 deletions dev/docker/hive/README.md
@@ -13,7 +13,7 @@ It includes Hadoop-2.x and Hive-2.x, you can use this Docker image to test the G

## Run container
```
-docker run --rm -d -p 8022:22 -p 8088:8088 -p 9000:9000 -p 9083:9083 -p 10000:10000 -p 10002:10002 -p 50070:50070 -p 50075:50075 datastrato/gravitino-ci-hive
+docker run --rm -d -p 8022:22 -p 8088:8088 -p 9000:9000 -p 9083:9083 -p 10000:10000 -p 10002:10002 -p 50070:50070 -p 50075:50075 -p 50010:50010 datastrato/gravitino-ci-hive
```
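Once the container is up, one quick way to confirm the newly published data transfer port is reachable is a TCP probe. This is only a sketch: `check_port` is a hypothetical helper that relies on bash's `/dev/tcp` redirection and the `timeout` utility.

```shell
# check_port: succeed if a TCP connection to host:port can be opened.
# Hypothetical helper; assumes bash's /dev/tcp support and GNU timeout.
check_port() {
  local host="$1" port="$2"
  timeout 2 bash -c ">/dev/tcp/${host}/${port}" 2>/dev/null
}

# With the container from the command above running, the DataNode data
# transfer port can be probed like this:
# check_port localhost 50010 && echo "port 50010 reachable"
```

If the probe fails while the container is running, check that `50010` appears in both the `EXPOSE` line of the Dockerfile and the `-p` mappings of the `docker run` command.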

## Login Docker container
@@ -33,7 +33,8 @@ ssh -p 8022 datastrato@localhost (password: ds123, this is a sudo user)
- `22` SSH
- `9000` HDFS defaultFS
- `50070` HDFS NameNode
-- `50075` HDFS DataNode
+- `50075` HDFS DataNode http server
+- `50010` HDFS DataNode data transfer
- `8088` YARN Service
- `9083` Hive Metastore
- `10000` HiveServer2
@@ -53,3 +54,8 @@ ssh -p 8022 datastrato@localhost (password: ds123, this is a sudo user)
- change mysql bind-address from `127.0.0.1` to `0.0.0.0`
- add `iceberg` to mysql users with password `iceberg`
- export `3306` port for mysql

+### 0.1.4
+- Config HDFS DataNode data transfer address to `0.0.0.0:50010` explicitly
+- Map container hostname to `127.0.0.1` before starting Hadoop
+- Expose `50010` port for the HDFS DataNode
8 changes: 6 additions & 2 deletions dev/docker/hive/build-docker.sh
@@ -83,8 +83,12 @@ if [[ "${platform_type}" == "all" ]]; then
if [[ "${tag_name}" == "" ]]; then
docker buildx build --platform=linux/amd64,linux/arm64 --build-arg HADOOP_PACKAGE_NAME=${HADOOP_PACKAGE_NAME} --build-arg HIVE_PACKAGE_NAME=${HIVE_PACKAGE_NAME} --push --progress plain -f Dockerfile -t ${image_name} .
else
-docker buildx build --platform=linux/amd64,linux/arm64 --build-arg HADOOP_PACKAGE_NAME=${HADOOP_PACKAGE_NAME} --build-arg HIVE_PACKAGE_NAME=${HIVE_PACKAGE_NAME} --push --tag ${tag_name} --progress plain -f Dockerfile -t ${image_name} .
+docker buildx build --platform=linux/amd64,linux/arm64 --build-arg HADOOP_PACKAGE_NAME=${HADOOP_PACKAGE_NAME} --build-arg HIVE_PACKAGE_NAME=${HIVE_PACKAGE_NAME} --push --progress plain -f Dockerfile -t ${image_name}:${tag_name} .
fi
else
-docker buildx build --platform=${platform_type} --build-arg HADOOP_PACKAGE_NAME=${HADOOP_PACKAGE_NAME} --build-arg HIVE_PACKAGE_NAME=${HIVE_PACKAGE_NAME} --output type=docker --progress plain -f Dockerfile -t ${image_name} .
+if [[ "${tag_name}" == "" ]]; then
+docker buildx build --platform=${platform_type} --build-arg HADOOP_PACKAGE_NAME=${HADOOP_PACKAGE_NAME} --build-arg HIVE_PACKAGE_NAME=${HIVE_PACKAGE_NAME} --output type=docker --progress plain -f Dockerfile -t ${image_name} .
+else
+docker buildx build --platform=${platform_type} --build-arg HADOOP_PACKAGE_NAME=${HADOOP_PACKAGE_NAME} --build-arg HIVE_PACKAGE_NAME=${HIVE_PACKAGE_NAME} --output type=docker --progress plain -f Dockerfile -t ${image_name}:${tag_name} .
+fi
fi
5 changes: 5 additions & 0 deletions dev/docker/hive/hdfs-site.xml
@@ -8,4 +8,9 @@
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>

+  <property>
+    <name>dfs.datanode.address</name>
+    <value>0.0.0.0:50010</value>
+  </property>
</configuration>
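The added property pins the DataNode data transfer address to `0.0.0.0:50010`. As a rough sanity check of such a fragment (a sketch using standard text tools, not how Hadoop itself parses its configuration; the temp file is illustrative):

```shell
# Write a minimal hdfs-site.xml fragment mirroring the change above.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
  <property>
    <name>dfs.datanode.address</name>
    <value>0.0.0.0:50010</value>
  </property>
</configuration>
EOF

# Grab the <value> line that follows the dfs.datanode.address <name> entry.
addr=$(grep -A1 'dfs.datanode.address' "$conf" | sed -n 's/.*<value>\(.*\)<\/value>.*/\1/p')
echo "$addr"   # 0.0.0.0:50010
```

Inside a running container, `hdfs getconf -confKey dfs.datanode.address` should report the effective value directly.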
6 changes: 6 additions & 0 deletions dev/docker/hive/start.sh
@@ -9,6 +9,12 @@ service ssh start
ssh-keyscan localhost > /root/.ssh/known_hosts
ssh-keyscan 0.0.0.0 >> /root/.ssh/known_hosts

+# Map the hostname to 127.0.0.1 for external access to the DataNode
+hostname=$(cat /etc/hostname)
+new_content=$(sed "/$hostname/s/^/# /" /etc/hosts)
+new_content="${new_content}\n127.0.0.1 ${hostname}"
+echo -e "$new_content" > /etc/hosts

# start hadoop
${HADOOP_HOME}/sbin/start-all.sh
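The hosts rewrite in the startup script can be exercised on a throwaway copy; the hostname and addresses below are illustrative, not taken from a real container.

```shell
# Simulate the /etc/hosts rewrite from start.sh on a temporary file.
hostname="hive-container"          # illustrative container hostname
hosts_file=$(mktemp)
printf '172.17.0.2\t%s\n127.0.0.1\tlocalhost\n' "$hostname" > "$hosts_file"

# Comment out the Docker-assigned entry, then append a loopback mapping so
# the hostname resolves to 127.0.0.1 and DataNode traffic stays reachable
# through the published port.
new_content=$(sed "/$hostname/s/^/# /" "$hosts_file")
new_content="${new_content}\n127.0.0.1 ${hostname}"
echo -e "$new_content" > "$hosts_file"
cat "$hosts_file"
```

After the rewrite, the Docker-assigned line is commented out and a `127.0.0.1` entry for the hostname is appended.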

