[HUDI-3602][DOCS] Update docker README to build multi-arch images using buildx (#5011)
@@ -91,3 +91,97 @@ You can find more information on [Docker Hub Repositories Manual](https://docs.d
## Docker Demo Setup

Please refer to the [Docker Demo Docs page](https://hudi.apache.org/docs/docker_demo).

## Building Multi-Arch Images

NOTE: The steps below require some manual code changes. Fully automated support for multi-arch builds is being
tracked by [HUDI-3601](https://issues.apache.org/jira/browse/HUDI-3601).

By default, the docker images are built for the x86_64 (amd64) architecture. Docker `buildx` allows you to build
multi-arch images, link them together with a manifest file, and push them all to a registry with a single command.
Let's say we want to build for the arm64 architecture. First, ensure that `buildx` is set up locally by following
the steps below (adapted from https://www.docker.com/blog/multi-arch-images):

```
# List builders
~ ❯❯❯ docker buildx ls
NAME/NODE DRIVER/ENDPOINT STATUS  PLATFORMS
default * docker
  default default         running linux/amd64, linux/arm64, linux/arm/v7, linux/arm/v6

# If you are using the default builder, which is basically the old builder, then do the following
~ ❯❯❯ docker buildx create --name mybuilder
mybuilder
~ ❯❯❯ docker buildx use mybuilder
~ ❯❯❯ docker buildx inspect --bootstrap
[+] Building 2.5s (1/1) FINISHED
 => [internal] booting buildkit                            2.5s
 => => pulling image moby/buildkit:master                  1.3s
 => => creating container buildx_buildkit_mybuilder0       1.2s
Name:   mybuilder
Driver: docker-container

Nodes:
Name:      mybuilder0
Endpoint:  unix:///var/run/docker.sock
Status:    running

Platforms: linux/amd64, linux/arm64, linux/arm/v7, linux/arm/v6
```

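Before building, it can help to confirm that the selected builder actually lists the target platform. The sketch below is one way to do that; `supports_platform` is a hypothetical helper (not part of Docker or Hudi) that reads `docker buildx inspect` output from stdin and assumes the `Platforms:` line has the comma-separated form shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the platform given as $1 appears in the
# "Platforms:" line of `docker buildx inspect` output read from stdin.
# Assumes the comma-separated format shown in the transcript above.
supports_platform() {
  sed -n 's/^[[:space:]]*Platforms:[[:space:]]*//p' \
    | tr -d ' ' \
    | tr ',' '\n' \
    | grep -qxF "$1"
}

# Usage (needs a running Docker daemon):
#   docker buildx inspect --bootstrap | supports_platform linux/arm64 \
#     && echo "arm64 is supported by the current builder"
```

Matching whole entries (`grep -x`) avoids treating `linux/arm` as a match for `linux/arm64`.
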
Now go to `<HUDI_REPO_DIR>/docker/hoodie/hadoop` and change each `Dockerfile` to pull the dependent images
corresponding to arm64. For example, in [base/Dockerfile](./hoodie/hadoop/base/Dockerfile) (which pulls the jdk8
image), change the line `FROM openjdk:8u212-jdk-slim-stretch` to `FROM arm64v8/openjdk:8u212-jdk-slim-stretch`.

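The `FROM` change in the base Dockerfile can also be scripted. This is a minimal sketch, assuming the file still contains exactly the `FROM openjdk:8u212-jdk-slim-stretch` line quoted above; `use_arm64_jdk` is a hypothetical function name used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the JDK base-image line of the Dockerfile
# given as $1. Assumes the file contains the exact FROM line quoted in the
# text; any other line is left untouched.
use_arm64_jdk() {
  dockerfile="$1"
  # Write to a temp file and move it back instead of using `sed -i`,
  # whose syntax differs between GNU and BSD/macOS sed.
  sed 's|^FROM openjdk:8u212-jdk-slim-stretch|FROM arm64v8/openjdk:8u212-jdk-slim-stretch|' \
    "$dockerfile" > "$dockerfile.tmp" && mv "$dockerfile.tmp" "$dockerfile"
}

# Usage, from <HUDI_REPO_DIR>/docker/hoodie/hadoop:
#   use_arm64_jdk base/Dockerfile
```

Running it a second time is harmless, because the rewritten line no longer starts with `FROM openjdk`.
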
Then, from the `<HUDI_REPO_DIR>/docker/hoodie/hadoop` directory, execute the following command to build the image
and push it to the dockerhub repo:

```
# Run under hoodie/hadoop, the <tag> is optional, "latest" by default
docker buildx build <image_folder_name> --platform <comma-separated,platforms> -t <hub-user>/<repo-name>[:<tag>] --push

# For example, to build base image
docker buildx build base --platform linux/arm64 -t apachehudi/hudi-hadoop_2.8.4-base:linux-arm64-0.10.1 --push
```

Once the base image is pushed, you can do something similar for the other images.
Change the [hive](./hoodie/hadoop/hive_base/Dockerfile) Dockerfile to pull the base image with the tag corresponding
to the linux/arm64 platform.

```
# Change below line in the Dockerfile
FROM apachehudi/hudi-hadoop_${HADOOP_VERSION}-base:latest
# as shown below
FROM --platform=linux/arm64 apachehudi/hudi-hadoop_${HADOOP_VERSION}-base:linux-arm64-0.10.1

# and then build & push from under hoodie/hadoop dir
docker buildx build hive_base --platform linux/arm64 -t apachehudi/hudi-hadoop_2.8.4-hive_2.3.3:linux-arm64-0.10.1 --push
```

Similarly, for images that depend on hive (e.g. [base spark](./hoodie/hadoop/spark_base/Dockerfile),
[sparkmaster](./hoodie/hadoop/sparkmaster/Dockerfile), [sparkworker](./hoodie/hadoop/sparkworker/Dockerfile)
and [sparkadhoc](./hoodie/hadoop/sparkadhoc/Dockerfile)), change the corresponding Dockerfile to pull the parent
image with the tag corresponding to arm64. Then build and push using the `docker buildx` command.

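Retagging the parent image in several Dockerfiles is repetitive, so it may be convenient to script it. The sketch below rewrites `FROM apachehudi/...:latest` lines to the arm64 tag used in this guide. `retag_parent` and `ARM64_TAG` are hypothetical names, and the script assumes each parent image is referenced with an explicit `:latest` tag; Dockerfiles that reference their parent differently are left unchanged and need a manual edit.

```shell
#!/bin/sh
# Tag assumed throughout this guide for the arm64 images.
ARM64_TAG="linux-arm64-0.10.1"

# Hypothetical helper: in the Dockerfile given as $1, rewrite
#   FROM apachehudi/<image>:latest
# to
#   FROM --platform=linux/arm64 apachehudi/<image>:<ARM64_TAG>
# Lines that do not match this exact shape are left untouched.
retag_parent() {
  dockerfile="$1"
  sed "s|^FROM \(apachehudi/[^ ]*\):latest\$|FROM --platform=linux/arm64 \1:${ARM64_TAG}|" \
    "$dockerfile" > "$dockerfile.tmp" && mv "$dockerfile.tmp" "$dockerfile"
}

# Usage, from <HUDI_REPO_DIR>/docker/hoodie/hadoop:
#   for d in hive_base spark_base sparkmaster sparkworker sparkadhoc; do
#     retag_parent "$d/Dockerfile"
#   done
```

Review the result with `git diff` before building, since the linked patch remains the reference for the exact changes.
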
For completeness, here is a [patch](https://gist.github.com/xushiyan/cec16585e884cf0693250631a1d10ec2) that shows
the changes to make in the Dockerfiles (assuming the tag is named `linux-arm64-0.10.1`). The full list
of `docker buildx` commands follows.

```
docker buildx build base --platform linux/arm64 -t apachehudi/hudi-hadoop_2.8.4-base:linux-arm64-0.10.1 --push
docker buildx build datanode --platform linux/arm64 -t apachehudi/hudi-hadoop_2.8.4-datanode:linux-arm64-0.10.1 --push
docker buildx build historyserver --platform linux/arm64 -t apachehudi/hudi-hadoop_2.8.4-history:linux-arm64-0.10.1 --push
docker buildx build hive_base --platform linux/arm64 -t apachehudi/hudi-hadoop_2.8.4-hive_2.3.3:linux-arm64-0.10.1 --push
docker buildx build namenode --platform linux/arm64 -t apachehudi/hudi-hadoop_2.8.4-namenode:linux-arm64-0.10.1 --push
docker buildx build prestobase --platform linux/arm64 -t apachehudi/hudi-hadoop_2.8.4-prestobase_0.217:linux-arm64-0.10.1 --push
docker buildx build spark_base --platform linux/arm64 -t apachehudi/hudi-hadoop_2.8.4-hive_2.3.3-sparkbase_2.4.4:linux-arm64-0.10.1 --push
docker buildx build sparkadhoc --platform linux/arm64 -t apachehudi/hudi-hadoop_2.8.4-hive_2.3.3-sparkadhoc_2.4.4:linux-arm64-0.10.1 --push
docker buildx build sparkmaster --platform linux/arm64 -t apachehudi/hudi-hadoop_2.8.4-hive_2.3.3-sparkmaster_2.4.4:linux-arm64-0.10.1 --push
docker buildx build sparkworker --platform linux/arm64 -t apachehudi/hudi-hadoop_2.8.4-hive_2.3.3-sparkworker_2.4.4:linux-arm64-0.10.1 --push
```

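Since the ten commands differ only in the folder name and the repository suffix, they can be generated from a small table. The following is a dry-run sketch: it only prints the commands, using the folder-to-repository pairs copied from the list above. The variable names and the `print_build_commands` function are illustrative, not part of the Hudi scripts.

```shell
#!/bin/sh
# Values taken from the command list above.
PLATFORM="linux/arm64"
TAG="linux-arm64-0.10.1"
PREFIX="apachehudi/hudi-hadoop_2.8.4"

# One "<image folder>:<repository suffix>" pair per line.
IMAGES="
base:base
datanode:datanode
historyserver:history
hive_base:hive_2.3.3
namenode:namenode
prestobase:prestobase_0.217
spark_base:hive_2.3.3-sparkbase_2.4.4
sparkadhoc:hive_2.3.3-sparkadhoc_2.4.4
sparkmaster:hive_2.3.3-sparkmaster_2.4.4
sparkworker:hive_2.3.3-sparkworker_2.4.4
"

# Print (do not run) one `docker buildx build` command per pair.
print_build_commands() {
  for pair in $IMAGES; do
    folder=${pair%%:*}   # text before the first colon
    suffix=${pair#*:}    # text after the first colon
    echo "docker buildx build $folder --platform $PLATFORM -t $PREFIX-$suffix:$TAG --push"
  done
}

# Usage, from <HUDI_REPO_DIR>/docker/hoodie/hadoop:
#   print_build_commands
```

Printing first makes it easy to review the tags before anything is pushed. Each image should be built only after the image it depends on has been pushed.
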
Once all the required images are pushed to the dockerhub repos, one additional change is needed in
the [docker compose](./compose/docker-compose_hadoop284_hive233_spark244.yml) file.
Apply [this patch](https://gist.github.com/codope/3dd986de5e54f0650dd74b6032e4456c) to the docker compose file so
that [setup_demo](./setup_demo.sh) pulls images with the correct tag for arm64. You should then be ready to run the
setup script and follow the docker demo.
