Replicating Data Between Two Kafka Clusters

Hands-On Lab

 

Chad Crowell

DevOps Training Architect II in Content

Length

01:45:00

Difficulty

Intermediate

Kafka can be deployed across multiple data centers in an "Active-Active", "Active-Passive", or centralized architecture. In this hands-on lab, we simulate an active-passive architecture in which data from one cluster is replicated to another cluster using a tool called Replicator. Like MirrorMaker, Replicator copies messages from one cluster to another; it can also preserve the topic configuration of the source cluster.

Introduction

In this hands-on lab, we create two local Kafka clusters on a single server. Each cluster contains only one Zookeeper instance and one Kafka broker. We differentiate between the two by specifying different port numbers and data directories for each cluster. Once both clusters are up and running, we create a topic in the origin (source) cluster and replicate it to the destination cluster. We then produce messages to the source cluster and confirm that they are mirrored successfully.
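
As a quick reference, the layout of the two clusters can be sketched as a handful of shell variables. The origin ports come from the sed edits in this lab; the destination values are assumed to be the defaults in the stock properties files, so check them against your own copies. The variable names are illustrative only.

```shell
#!/bin/sh
# Layout of the two single-node clusters, both running on localhost.
# Destination cluster: assumed defaults from the stock properties files.
DEST_ZK_PORT=2181
DEST_BROKER_PORT=9092
DEST_ZK_DATA=/tmp/zookeeper
DEST_BROKER_LOGS=/tmp/kafka-logs

# Origin cluster: values produced by the sed edits in this lab.
ORIGIN_ZK_PORT=2171
ORIGIN_BROKER_PORT=9082
ORIGIN_ZK_DATA=/tmp/zookeeper_origin
ORIGIN_BROKER_LOGS=/tmp/kafka-logs-origin

# Print the layout as a small table.
printf '%-12s %-8s %-12s %-24s %s\n' cluster zk_port broker_port zk_data broker_logs
printf '%-12s %-8s %-12s %-24s %s\n' destination "$DEST_ZK_PORT" "$DEST_BROKER_PORT" "$DEST_ZK_DATA" "$DEST_BROKER_LOGS"
printf '%-12s %-8s %-12s %-24s %s\n' origin "$ORIGIN_ZK_PORT" "$ORIGIN_BROKER_PORT" "$ORIGIN_ZK_DATA" "$ORIGIN_BROKER_LOGS"
```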

Solution

  1. Begin by logging in to the lab server using the credentials provided on the hands-on lab page.

    ssh cloud_user@PUBLIC_IP_ADDRESS

Set Up the Environment

  1. Install Java.

    sudo apt install -y default-jdk
  2. Verify the Java installation.

    java -version
  3. Download the Kafka binaries.

    sudo curl -O https://packages.confluent.io/archive/5.2/confluent-5.2.1-2.12.tar.gz
  4. Expand the downloaded file.

    tar -xvf confluent-5.2.1-2.12.tar.gz
  5. Rename the new folder and enter it.

    mv confluent-5.2.1 confluent
    
    cd confluent/
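
Before starting any services, it can help to confirm that the extracted directory contains the scripts the rest of the lab calls. The following is a minimal sketch; `missing_scripts` is a hypothetical helper, and the demonstration runs against a throwaway directory rather than the real installation.

```shell
#!/bin/sh
# missing_scripts DIR: print each lab script that is not executable under DIR/bin.
missing_scripts() {
    dir=$1
    for script in zookeeper-server-start kafka-server-start kafka-topics \
        connect-standalone kafka-console-producer kafka-console-consumer; do
        [ -x "$dir/bin/$script" ] || echo "$script"
    done
}

# Offline demonstration: a temporary directory that contains only one of the scripts.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/bin"
touch "$demo_dir/bin/kafka-topics"
chmod +x "$demo_dir/bin/kafka-topics"
missing=$(missing_scripts "$demo_dir")
echo "$missing"
```

On the lab server, `missing_scripts ~/confluent` should print nothing if the archive was extracted and renamed as described above.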

Start the Destination Cluster

  1. Start Zookeeper.

    bin/zookeeper-server-start etc/kafka/zookeeper.properties
  2. Open a new terminal and connect to the server.

    ssh cloud_user@PUBLIC_IP_ADDRESS
  3. Start up Kafka.

    cd confluent/
    
    bin/kafka-server-start etc/kafka/server.properties
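
Zookeeper needs to be accepting connections before Kafka starts. If you would rather check than guess, a small retry helper can poll until a probe command succeeds. This is a sketch: `wait_for` is a hypothetical helper, and the probe shown in the comment assumes `nc` is installed on the server.

```shell
#!/bin/sh
# wait_for MAX_TRIES CMD...: rerun CMD about once per second until it succeeds.
# Returns 0 on the first success, 1 if every attempt fails.
wait_for() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep 1
    done
    return 1
}

# Example on the lab server (assumes nc is available):
#   wait_for 30 nc -z localhost 2181 && echo "Zookeeper is accepting connections"
#   wait_for 30 nc -z localhost 9092 && echo "Kafka is accepting connections"

# Offline demonstration with a probe that always succeeds.
wait_for 3 true && echo "probe succeeded"
```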

Start the Origin Cluster

  1. Open a new terminal and connect to the server.

    ssh cloud_user@PUBLIC_IP_ADDRESS
  2. Copy the properties files.

    cd confluent/
    
    cp etc/kafka/zookeeper.properties /tmp/zookeeper_origin.properties
    
    cp etc/kafka/server.properties /tmp/server_origin.properties
  3. Update the ports and data directories in the copied files. The lab server installs packages with apt, so its sed is GNU sed, where -i takes no separate empty-string argument (the -i '' form is for BSD/macOS sed).

    sed -i -e "s/2181/2171/g" /tmp/zookeeper_origin.properties
    
    sed -i -e "s/9092/9082/g" /tmp/server_origin.properties
    
    sed -i -e "s/2181/2171/g" /tmp/server_origin.properties
    
    sed -i -e "s/#listen/listen/g" /tmp/server_origin.properties
    
    sed -i -e "s/zookeeper/zookeeper_origin/g" /tmp/zookeeper_origin.properties
    
    sed -i -e "s/kafka-logs/kafka-logs-origin/g" /tmp/server_origin.properties
  4. Verify the changes in the Zookeeper file.

    vim /tmp/zookeeper_origin.properties
  5. Exit the editor.

  6. Verify the changes in the server file.

    vim /tmp/server_origin.properties
  7. Exit the editor.

  8. Start Zookeeper for the origin cluster.

    bin/zookeeper-server-start /tmp/zookeeper_origin.properties
  9. Open a new terminal and connect to the server.

    ssh cloud_user@PUBLIC_IP_ADDRESS
  10. Start up Kafka.

    cd confluent/
    
    bin/kafka-server-start /tmp/server_origin.properties
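
As an alternative to opening each file in vim, the edited settings can be checked with grep. The sketch below applies the same substitutions to trimmed stand-in files (not the full stock files) and prints the lines that distinguish the origin cluster. It writes to new files instead of using sed -i so that it runs the same way on any sed.

```shell
#!/bin/sh
# Sketch: check the origin cluster settings with grep instead of an editor.
workdir=$(mktemp -d)

# Trimmed stand-ins for the stock properties files (assumed default values).
cat > "$workdir/zookeeper.properties" <<'EOF'
dataDir=/tmp/zookeeper
clientPort=2181
EOF

cat > "$workdir/server.properties" <<'EOF'
#listeners=PLAINTEXT://:9092
log.dirs=/tmp/kafka-logs
zookeeper.connect=localhost:2181
EOF

# Same substitutions as the lab, in the same order per file.
sed -e 's/2181/2171/g' -e 's/zookeeper/zookeeper_origin/g' \
    "$workdir/zookeeper.properties" > "$workdir/zookeeper_origin.properties"
sed -e 's/9092/9082/g' -e 's/2181/2171/g' -e 's/#listen/listen/g' \
    -e 's/kafka-logs/kafka-logs-origin/g' \
    "$workdir/server.properties" > "$workdir/server_origin.properties"

# Keep only the active (uncommented) settings that matter for the origin cluster.
zk_settings=$(grep -E '^(dataDir|clientPort)=' "$workdir/zookeeper_origin.properties")
server_settings=$(grep -E '^(listeners|log\.dirs|zookeeper\.connect)=' "$workdir/server_origin.properties")
printf '%s\n%s\n' "$zk_settings" "$server_settings"
```

On the lab server, the two grep commands can be pointed at /tmp/zookeeper_origin.properties and /tmp/server_origin.properties directly.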

Create a Topic in the Source Cluster

  1. Open a new terminal and connect to the server.

    ssh cloud_user@PUBLIC_IP_ADDRESS
  2. Change to the confluent directory.

    cd confluent/
  3. Create the topic.

    bin/kafka-topics --create --topic test-topic --replication-factor 1 --partitions 1 --zookeeper localhost:2171

Run the Replicator Tool

  1. Run Replicator as a standalone Kafka Connect worker.

    bin/connect-standalone etc/kafka/connect-standalone.properties etc/kafka-connect-replicator/quickstart-replicator.properties
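
The destination topic name follows from Replicator's rename format, in which `${topic}` stands for the source topic name. The sketch below derives that name from a properties file. The sample file is a hypothetical stand-in holding only the keys this lab relies on; the values are assumed to match the quickstart file, so compare them with your own copy. `get_prop` is an illustrative helper.

```shell
#!/bin/sh
# Sketch: derive the destination topic name from a Replicator rename format.
props=$(mktemp)
cat > "$props" <<'EOF'
src.kafka.bootstrap.servers=localhost:9082
dest.kafka.bootstrap.servers=localhost:9092
topic.whitelist=test-topic
topic.rename.format=${topic}.replica
EOF

# get_prop KEY FILE: print the value of the first line that starts with KEY=.
get_prop() {
    sed -n "s/^$1=//p" "$2" | head -n 1
}

source_topic=$(get_prop 'topic.whitelist' "$props")
rename_format=$(get_prop 'topic.rename.format' "$props")

# Replace the literal ${topic} placeholder with the source topic name.
dest_topic=$(printf '%s\n' "$rename_format" | sed 's/[$]{topic}/'"$source_topic"'/')
echo "$dest_topic"
```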

Verify Replication

  1. Open a new terminal and connect to the server.

    ssh cloud_user@PUBLIC_IP_ADDRESS
  2. Change to the confluent directory.

    cd confluent/
  3. Examine the replicator properties.

    vim etc/kafka-connect-replicator/quickstart-replicator.properties
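  4. Verify the renaming convention. The topic.rename.format setting determines the name of the replicated topic, which in this lab is test-topic.replica.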

  5. Exit the editor.

  6. Describe the replicated topic on the destination cluster to verify that it exists.

    bin/kafka-topics --describe --topic test-topic.replica --zookeeper localhost:2181
  7. Write to the source topic.

    seq 10000 | bin/kafka-console-producer --topic test-topic --broker-list localhost:9082
  8. Confirm messages were written to the destination cluster.

    bin/kafka-console-consumer --from-beginning --topic test-topic.replica --bootstrap-server localhost:9092
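
Scrolling through 10,000 lines of consumer output is a weak check. One option is to count the records instead. The sketch below defines a hypothetical `check_count` helper and demonstrates it offline with the same `seq` generator the lab uses to produce messages.

```shell
#!/bin/sh
# check_count EXPECTED: count the lines on stdin and compare with EXPECTED.
check_count() {
    expected=$1
    actual=$(wc -l | tr -d ' ')
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $actual records"
    else
        echo "MISMATCH: expected $expected, got $actual"
        return 1
    fi
}

# Offline demonstration: seq stands in for the consumer output.
seq 10000 | check_count 10000
```

On the lab server, the consumer command from the last step can be piped into `check_count 10000`. Adding `--timeout-ms 10000` to the consumer makes it exit once no new messages arrive for ten seconds, so the pipeline terminates on its own. The count matches only if the producer step has been run exactly once.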

Conclusion

Congratulations, you've completed this hands-on lab!