[#526] fix(CI): Expose the HDFS DataNode data transfer port in the gravitino-ci-hive Docker #527

Merged 2 commits on Oct 17, 2023
2 changes: 1 addition & 1 deletion dev/docker/hive/Dockerfile
@@ -155,7 +155,7 @@ RUN rm -rf /tmp/packages

################################################################################
# expose port
-EXPOSE 22 3306 8088 9000 9083 10000 10002 50070 50075
+EXPOSE 22 3306 8088 9000 9083 10000 10002 50070 50075 50010

################################################################################
# create startup script and set ENTRYPOINT
10 changes: 8 additions & 2 deletions dev/docker/hive/README.md
@@ -13,7 +13,7 @@ It includes Hadoop-2.x and Hive-2.x, you can use this Docker image to test the G

## Run container
```
-docker run --rm -d -p 8022:22 -p 8088:8088 -p 9000:9000 -p 9083:9083 -p 10000:10000 -p 10002:10002 -p 50070:50070 -p 50075:50075 datastrato/gravitino-ci-hive
+docker run --rm -d -p 8022:22 -p 8088:8088 -p 9000:9000 -p 9083:9083 -p 10000:10000 -p 10002:10002 -p 50070:50070 -p 50075:50075 -p 50010:50010 datastrato/gravitino-ci-hive
```

## Login Docker container
@@ -33,7 +33,8 @@ ssh -p 8022 datastrato@localhost (password: ds123, this is a sudo user)
- `22` SSH
- `9000` HDFS defaultFS
- `50070` HDFS NameNode
-- `50075` HDFS DataNode
+- `50075` HDFS DataNode http server
+- `50010` HDFS DataNode data transfer
- `8088` YARN Service
- `9083` Hive Metastore
- `10000` HiveServer2
@@ -53,3 +54,8 @@ ssh -p 8022 datastrato@localhost (password: ds123, this is a sudo user)
- change mysql bind-address from `127.0.0.1` to `0.0.0.0`
- add `iceberg` to mysql users with password `iceberg`
- export `3306` port for mysql

### 0.1.4
- Config HDFS DataNode data transfer address to `0.0.0.0:50010` explicitly
- Map container hostname to `127.0.0.1` before starting Hadoop
- Expose `50010` port for the HDFS DataNode
8 changes: 6 additions & 2 deletions dev/docker/hive/build-docker.sh
@@ -83,8 +83,12 @@ if [[ "${platform_type}" == "all" ]]; then
if [[ "${tag_name}" == "" ]]; then
docker buildx build --platform=linux/amd64,linux/arm64 --build-arg HADOOP_PACKAGE_NAME=${HADOOP_PACKAGE_NAME} --build-arg HIVE_PACKAGE_NAME=${HIVE_PACKAGE_NAME} --push --progress plain -f Dockerfile -t ${image_name} .
else
-docker buildx build --platform=linux/amd64,linux/arm64 --build-arg HADOOP_PACKAGE_NAME=${HADOOP_PACKAGE_NAME} --build-arg HIVE_PACKAGE_NAME=${HIVE_PACKAGE_NAME} --push --tag ${tag_name} --progress plain -f Dockerfile -t ${image_name} .
+docker buildx build --platform=linux/amd64,linux/arm64 --build-arg HADOOP_PACKAGE_NAME=${HADOOP_PACKAGE_NAME} --build-arg HIVE_PACKAGE_NAME=${HIVE_PACKAGE_NAME} --push --progress plain -f Dockerfile -t ${image_name}:${tag_name} .
fi
else
-docker buildx build --platform=${platform_type} --build-arg HADOOP_PACKAGE_NAME=${HADOOP_PACKAGE_NAME} --build-arg HIVE_PACKAGE_NAME=${HIVE_PACKAGE_NAME} --output type=docker --progress plain -f Dockerfile -t ${image_name} .
+if [[ "${tag_name}" == "" ]]; then
+docker buildx build --platform=${platform_type} --build-arg HADOOP_PACKAGE_NAME=${HADOOP_PACKAGE_NAME} --build-arg HIVE_PACKAGE_NAME=${HIVE_PACKAGE_NAME} --output type=docker --progress plain -f Dockerfile -t ${image_name} .
+else
+docker buildx build --platform=${platform_type} --build-arg HADOOP_PACKAGE_NAME=${HADOOP_PACKAGE_NAME} --build-arg HIVE_PACKAGE_NAME=${HIVE_PACKAGE_NAME} --output type=docker --progress plain -f Dockerfile -t ${image_name}:${tag_name} .
+fi
fi
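The change in this file fixes how a version tag is applied: the old command passed `--tag ${tag_name}` as a separate bare name alongside `-t ${image_name}`, so the tag was never attached to the image repository; the new command uses the standard `repository:tag` form. A minimal sketch of the reference the corrected command produces, with hypothetical values standing in for the script's arguments:

```shell
# Hypothetical values; the real script takes these from its command-line arguments.
image_name="datastrato/gravitino-ci-hive"
tag_name="0.1.4"

# The corrected -t argument composes repository and tag into one reference.
full_ref="${image_name}:${tag_name}"
echo "$full_ref"   # datastrato/gravitino-ci-hive:0.1.4
```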
5 changes: 5 additions & 0 deletions dev/docker/hive/hdfs-site.xml
Original file line number Diff line number Diff line change
Expand Up @@ -8,4 +8,9 @@
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>

<property>
<name>dfs.datanode.address</name>
<value>0.0.0.0:50010</value>
</property>
</configuration>
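The new `dfs.datanode.address` property binds the DataNode's data transfer service to all interfaces on port 50010, matching the port the Dockerfile now exposes. As a quick sanity check, the configured value can be read back out with `sed`; a minimal sketch against a throwaway copy of the file (the real one lives in the image's Hadoop conf directory):

```shell
# Write a throwaway sample config, then extract the <value> for the property.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<configuration>
  <property>
    <name>dfs.datanode.address</name>
    <value>0.0.0.0:50010</value>
  </property>
</configuration>
EOF

# Pull out the text between <value> and </value>.
addr=$(sed -n 's:.*<value>\(.*\)</value>.*:\1:p' "$sample")
echo "$addr"   # 0.0.0.0:50010
```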
6 changes: 6 additions & 0 deletions dev/docker/hive/start.sh
@@ -9,6 +9,12 @@ service ssh start
ssh-keyscan localhost > /root/.ssh/known_hosts
ssh-keyscan 0.0.0.0 >> /root/.ssh/known_hosts

# Map the container hostname to 127.0.0.1 so the DataNode is reachable from outside the container
hostname=$(cat /etc/hostname)
new_content=$(cat /etc/hosts | sed "/$hostname/s/^/# /")
new_content="${new_content}\n127.0.0.1 ${hostname}"
echo -e "$new_content" > /etc/hosts
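The block above comments out the existing `/etc/hosts` entry for the container's hostname and re-points that hostname at `127.0.0.1` before Hadoop starts. A worked example of the same transformation on a throwaway hosts file, with a hypothetical hostname `abc123` (the real script edits `/etc/hosts` in place):

```shell
# Throwaway stand-in for /etc/hosts with a Docker-assigned entry.
hosts_sample=$(mktemp)
printf '127.0.0.1 localhost\n172.17.0.2 abc123\n' > "$hosts_sample"
hostname="abc123"

# Comment out any line mentioning the hostname, then append a loopback mapping.
new_content=$(sed "/$hostname/s/^/# /" "$hosts_sample")
new_content="${new_content}\n127.0.0.1 ${hostname}"
echo -e "$new_content"
# prints:
#   127.0.0.1 localhost
#   # 172.17.0.2 abc123
#   127.0.0.1 abc123
```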

# start hadoop
${HADOOP_HOME}/sbin/start-all.sh
