
[HUDI-3905] Add S3 related setup in Kafka Connect quick start (#5356)

Y Ethan Guo
2022-04-19 15:08:28 -07:00
committed by GitHub
parent 81bf771e56
commit 6f3fe880d2


@@ -48,8 +48,8 @@ $CONFLUENT_DIR/bin/confluent-hub install confluentinc/kafka-connect-hdfs:10.1.0
cp -r $CONFLUENT_DIR/share/confluent-hub-components/confluentinc-kafka-connect-hdfs/* /usr/local/share/kafka/plugins/
```
Now, build the packaged jar that contains all the Hudi classes, including the Hudi Kafka Connector, and copy it to the
plugin path that contains all the other jars (`/usr/local/share/kafka/plugins/lib`).
```bash
cd $HUDI_DIR
@@ -58,8 +58,20 @@ mkdir -p /usr/local/share/kafka/plugins/lib
cp $HUDI_DIR/packaging/hudi-kafka-connect-bundle/target/hudi-kafka-connect-bundle-0.11.0-SNAPSHOT.jar /usr/local/share/kafka/plugins/lib
```
If the Hudi Sink Connector writes to a target Hudi table on [Amazon S3](https://aws.amazon.com/s3/), you need two
additional jars, `hadoop-aws-2.10.1.jar` and `aws-java-sdk-bundle-1.11.271.jar`, in the `plugins/lib` folder. You may
download them using the following commands. Note that, when you specify the target table path on S3, you need to use
the `s3a://` prefix.
```bash
cd /usr/local/share/kafka/plugins/lib
wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.11.271/aws-java-sdk-bundle-1.11.271.jar
wget https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.10.1/hadoop-aws-2.10.1.jar
```
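For illustration, here is a sketch of how an S3 target path might look in the connector's sink properties. The property
names `target.base.path` and `target.table.name` are taken to follow the Hudi Sink connector's configuration, and the
bucket and table names are hypothetical:
```properties
# Hypothetical bucket/table names; note the s3a:// scheme rather than s3://
target.base.path=s3a://my-hudi-bucket/hudi-test-topic
target.table.name=hudi-test-topic
```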
Set up a Kafka broker locally. Download the latest Apache Kafka from [here](https://kafka.apache.org/downloads). Once
downloaded and built, run the Zookeeper server and Kafka server using the command line tools.
```bash
export KAFKA_HOME=/path/to/kafka_install_dir
cd $KAFKA_HOME
@@ -67,6 +79,7 @@ cd $KAFKA_HOME
./bin/zookeeper-server-start.sh ./config/zookeeper.properties
./bin/kafka-server-start.sh ./config/server.properties
```
Wait until the Kafka cluster is up and running.
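Before moving on, it can help to verify that the broker is actually accepting connections. Below is a minimal sketch of
such a readiness check; the helper name and timeout are illustrative, not part of Kafka, and it assumes bash with
`/dev/tcp` support:
```shell
#!/usr/bin/env bash
# Poll a TCP port until it accepts connections, or give up after a timeout.
# Illustrative helper only -- not part of the Kafka distribution.
wait_for_port() {
  local host=$1 port=$2 timeout=${3:-30} waited=0
  # Probe in a subshell so the file descriptor is closed automatically.
  while ! (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for $host:$port" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "$host:$port is up"
}
```
For example, `wait_for_port localhost 9092 60` returns once the broker's default listener (port 9092) responds.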
### 2 - Set up the schema registry