Practice setting up a Kafka cluster on AWS EC2 - Part 2

Tram Ho

In the previous article, we finished setting up a Zookeeper cluster of 3 servers running in 3 different AZs. This article continues by creating 3 Kafka brokers on the remaining 3 AZs to form the bootstrap server cluster.

Kafka Architecture

Briefly, Kafka brokers use Zookeeper for cluster coordination, while producers send messages to the brokers through topics. Each topic has a replication factor, and its partitions are distributed across the brokers. In this lab, our model will look like this:

[architecture diagram]

It can be seen that we have 3 brokers. With this setup, if 1 broker is shut down, our data is still preserved thanks to Kafka's replication mechanism (N - 1 brokers can be down, where N is your topic's replication factor). Data is also distributed evenly across the cluster, which means less disk is used per broker while preserving data availability.
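To make the availability claim concrete, here is a small illustrative sketch of round-robin replica placement (this is a simplification, not Kafka's actual assignment algorithm). It shows that with replication factor 3 across 3 brokers, every partition still has a live replica after any 2 brokers fail:

```python
# Illustrative sketch: round-robin placement of partition replicas
# across brokers (simplified, not Kafka's real assignment algorithm).

def assign_replicas(num_partitions, brokers, replication_factor):
    """Assign each partition's replicas to distinct brokers, round-robin."""
    assignment = {}
    for p in range(num_partitions):
        # Start each partition's replica list at a rotating broker so the
        # load (and leadership) spreads evenly across the cluster.
        assignment[p] = [brokers[(p + r) % len(brokers)]
                         for r in range(replication_factor)]
    return assignment

assignment = assign_replicas(num_partitions=3, brokers=[1, 2, 3],
                             replication_factor=3)
for p, replicas in assignment.items():
    # Simulate brokers 1 and 2 going down: each partition still has a replica
    surviving = set(replicas) - {1, 2}
    print(f"partition {p}: replicas={replicas}, surviving={surviving}")
```

With replication factor 3, each partition's replica list contains all 3 brokers, so even two simultaneous broker failures leave one copy of every partition alive.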

Configure Kafka

Kafka has many configuration settings, and the appropriate values depend on your infrastructure. This article mostly uses the defaults provided by Kafka; you can read the Kafka configuration reference here. Note that we specify the data directory and point Kafka at the Zookeeper cluster installed in the previous post.

Setup Kafka

First, I will set up only 1 server running Kafka. Create an instance from the AMI that was created in the previous post and name it My Kafka 1. In the previous post I created 3 Zookeeper servers on 3 different subnets, so for this instance choose one of the remaining subnets. In my example, I set its private IP to 192.168.1.119 (note that this is the IP address you added to /etc/hosts in the previous post).

[screenshot]

SSH into the instance; the /kafka directory with everything set up is already there. Before running Kafka, look at its settings file here. There are many configuration parameters, but I want to focus on the following important ones:

Each Kafka server requires its own unique id: broker.id=1

log.dirs=/data/kafka specifies where Kafka's log data is saved

min.insync.replicas=2: since we only have 1 Kafka server for now, change it to 1 (we will set it back to 2 at the end of the article)

zookeeper.connect=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181/kafka is the connection string to the Zookeeper cluster we created; notice the /kafka suffix (a chroot) at the end, which makes Zookeeper store all the Kafka cluster's data under a separate znode named kafka for easier management
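Putting the four settings together, the edited part of config/server.properties on broker 1 might look like this (a sketch; your paths and hostnames follow the previous post):

```properties
# Unique id for this broker (use 2 and 3 on the other two instances)
broker.id=1
# Where Kafka stores its partition data
log.dirs=/data/kafka
# Set to 1 while the cluster has only one broker; raise to 2 later
min.insync.replicas=1
# Zookeeper ensemble, with a /kafka chroot so all cluster data
# lives under one znode
zookeeper.connect=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181/kafka
```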

The next step is to copy this settings file over the default settings file.

Try running bin/kafka-server-start.sh config/server.properties.

[screenshot]

As in the previous article, I will turn this into a service so Kafka runs in the background; the script file content is here.
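Since the linked script is not reproduced here, a minimal systemd unit sketch for this purpose could look like the following (an assumption based on the /kafka install directory used above; adjust paths and add a User= line as needed):

```ini
# /etc/systemd/system/kafka.service -- minimal sketch
[Unit]
Description=Apache Kafka broker
Requires=network.target
After=network.target

[Service]
Type=simple
ExecStart=/kafka/bin/kafka-server-start.sh /kafka/config/server.properties
ExecStop=/kafka/bin/kafka-server-stop.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

After saving the file, run sudo systemctl daemon-reload, then sudo systemctl enable kafka and sudo systemctl start kafka.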

Then we can stop and start Kafka as a service.

To check whether the service is running, use the command nc -vz localhost 9092.

[screenshot]

We have finished setting up a 1-node Kafka cluster; let's run a few commands to see whether it works well.

We create a topic on the Kafka server
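The exact command is not preserved in the text; for this single-broker setup, a typical topic-creation command would look like the following (the topic name first_topic is an assumption; with only 1 broker, the replication factor must be 1):

```shell
# Create a test topic; replication-factor cannot exceed the broker count (1 here)
bin/kafka-topics.sh --create \
  --zookeeper zookeeper1:2181,zookeeper2:2181,zookeeper3:2181/kafka \
  --topic first_topic \
  --partitions 3 \
  --replication-factor 1
```

Note the /kafka chroot in the connection string, matching the zookeeper.connect setting above. (Newer Kafka versions replace --zookeeper with --bootstrap-server, but this article's commands use the Zookeeper flag.)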

On one of the Zookeeper servers, run bin/kafka-topics.sh --list --zookeeper localhost:2181/kafka (note the /kafka chroot we configured); the output shows that the topic created above has been saved by Zookeeper.

[screenshot]

So we have completed the setup of a 1-node Kafka cluster. Next, simply create an AMI from this instance and create 2 more instances on the remaining 2 subnets as in the previous post (note the config file: edit broker.id to 2 and 3 respectively on the two instances). In the next article, I will perform sending and receiving data on this cluster!


Source : Viblo