Practice setting up a Kafka cluster on AWS EC2 - Part 2

Tram Ho

In the previous article, we finished setting up a Zookeeper cluster of 3 servers running in 3 different AZs. This article continues by creating 3 Kafka brokers on the remaining 3 AZs to form the bootstrap server cluster.

Kafka Architecture

Briefly, Kafka brokers use Zookeeper for cluster coordination, while producers send messages to the brokers through topics. Each topic has a replication factor, and its partitions are distributed across the brokers. In this lab, our model will look like this:

[architecture diagram]

It can be seen that we have 3 brokers. With this setup, if 1 broker is shut down, our data is still preserved thanks to Kafka's replication mechanism (N - 1 brokers can be down, where N is your topic's replication factor). Data is also distributed evenly across the cluster, which means less disk is used per broker while preserving data availability.
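To make the availability claim concrete, here is a small illustrative sketch of round-robin replica placement (this is a simplification, not Kafka's actual assignment algorithm). It shows that with replication factor 3 across 3 brokers, every partition still has a live replica after any 2 brokers fail:

```python
# Illustrative sketch: round-robin placement of partition replicas
# across brokers (simplified, not Kafka's real assignment algorithm).

def assign_replicas(num_partitions, brokers, replication_factor):
    """Assign each partition's replicas to distinct brokers, round-robin."""
    assignment = {}
    for p in range(num_partitions):
        # Start each partition's replica list at a rotating broker so the
        # load (and leadership) spreads evenly across the cluster.
        assignment[p] = [brokers[(p + r) % len(brokers)]
                         for r in range(replication_factor)]
    return assignment

assignment = assign_replicas(num_partitions=3, brokers=[1, 2, 3],
                             replication_factor=3)
for p, replicas in assignment.items():
    # Simulate brokers 1 and 2 going down: each partition still has a replica
    surviving = set(replicas) - {1, 2}
    print(f"partition {p}: replicas={replicas}, surviving={surviving}")
```

With replication factor 3, each partition's replica list contains all 3 brokers, so even two simultaneous broker failures leave one copy of every partition alive.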

Configure Kafka

Kafka has many configuration settings, and the appropriate values depend on your infrastructure. This article mostly uses the defaults provided by Kafka; you can read the Kafka configuration reference here. Note that we specify the data directory and point Kafka at the Zookeeper cluster installed in the previous post.

Setup Kafka

First, I will set up only 1 server running Kafka. Create an instance from the AMI that was created in the previous post and name it My Kafka 1. In the previous post I created 3 Zookeeper servers on 3 different subnets, so for this instance choose one of the remaining subnets. In my example, I set its private IP to 192.168.1.119 (note that this is the IP address you added to /etc/hosts in the previous post).

[screenshot]

SSH into the instance; the /kafka directory with everything set up is already there. Before running Kafka, look at its settings file here. There are many configuration parameters, but I want to focus on the following important ones:

Each Kafka server requires its own unique id: broker.id=1

log.dirs=/data/kafka specifies where Kafka's log data is saved

min.insync.replicas=2: since we only have 1 Kafka server for now, change it to 1 (we will set it back to 2 at the end of the article)

zookeeper.connect=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181/kafka is the connection string to the Zookeeper cluster we created; notice the /kafka suffix (a chroot) at the end, which makes Zookeeper store all the Kafka cluster's data under a separate znode named kafka for easier management
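Putting the four settings together, the edited part of config/server.properties on broker 1 might look like this (a sketch; your paths and hostnames follow the previous post):

```properties
# Unique id for this broker (use 2 and 3 on the other two instances)
broker.id=1
# Where Kafka stores its partition data
log.dirs=/data/kafka
# Set to 1 while the cluster has only one broker; raise to 2 later
min.insync.replicas=1
# Zookeeper ensemble, with a /kafka chroot so all cluster data
# lives under one znode
zookeeper.connect=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181/kafka
```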

The next step is to copy this settings file over the default settings file.

Try running bin/kafka-server-start.sh config/server.properties.

[screenshot]

As in the previous article, I will turn this into a service so Kafka runs in the background; the script file content is here.
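Since the linked script is not reproduced here, a minimal systemd unit sketch for this purpose could look like the following (an assumption based on the /kafka install directory used above; adjust paths and add a User= line as needed):

```ini
# /etc/systemd/system/kafka.service -- minimal sketch
[Unit]
Description=Apache Kafka broker
Requires=network.target
After=network.target

[Service]
Type=simple
ExecStart=/kafka/bin/kafka-server-start.sh /kafka/config/server.properties
ExecStop=/kafka/bin/kafka-server-stop.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

After saving the file, run sudo systemctl daemon-reload, then sudo systemctl enable kafka and sudo systemctl start kafka.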

Then we can stop and start Kafka as a service.

To check whether the service is running, use the command nc -vz localhost 9092.

[screenshot]

We have finished setting up a 1-node Kafka cluster; let's run a few commands to see whether it works well.

We create a topic on the Kafka server
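The exact command is not preserved in the text; for this single-broker setup, a typical topic-creation command would look like the following (the topic name first_topic is an assumption; with only 1 broker, the replication factor must be 1):

```shell
# Create a test topic; replication-factor cannot exceed the broker count (1 here)
bin/kafka-topics.sh --create \
  --zookeeper zookeeper1:2181,zookeeper2:2181,zookeeper3:2181/kafka \
  --topic first_topic \
  --partitions 3 \
  --replication-factor 1
```

Note the /kafka chroot in the connection string, matching the zookeeper.connect setting above. (Newer Kafka versions replace --zookeeper with --bootstrap-server, but this article's commands use the Zookeeper flag.)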

On one of the Zookeeper servers, run bin/kafka-topics.sh --list --zookeeper localhost:2181/kafka (note the /kafka chroot we configured); the output shows that the topic created above has been saved by Zookeeper.

[screenshot]

So we have completed the setup of a 1-node Kafka cluster. Next, simply create an AMI from this instance and create 2 more instances on the remaining 2 subnets as in the previous post (note the config file: edit broker.id to 2 and 3 respectively on the two instances). In the next article, I will perform sending and receiving data on this cluster!


Source : Viblo