Apache Kafka & ZooKeeper

  • Apache ZooKeeper is an open source volunteer project under the Apache Software Foundation.
  • It is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and group services. These services are used in some form or another by distributed applications.
  • In Kafka, it likewise acts as a centralized service: it tracks cluster membership (which brokers are alive), stores the relevant configuration and cluster registry data, and helps with leader election for the partitions of a Kafka topic.
  • The ZooKeeper connection is defined in the broker configuration file, server.properties:
zookeeper.connect=localhost:2181
  • When a Kafka broker starts, it reads this configuration, connects to ZooKeeper, and creates an ephemeral znode there, which represents the active broker session.
  • An ephemeral znode exists only as long as the session that created it is active; when that session ends, ZooKeeper deletes the node.
  • ZooKeeper thus maintains the list of all active brokers. We can check this with the ZooKeeper shell:
zookeeper-shell localhost:2181
  • I have a 3-node Kafka cluster running, and listing the broker ids returns 1, 2, and 3:
ls /brokers/ids
 [1, 2, 3]
  • If I now shut down one of the brokers and run the command again, I get:
ls /brokers/ids
 [1, 2]
  • This is how ZooKeeper maintains the list of active brokers in a cluster.
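The membership behaviour above can be illustrated with a toy in-memory model. This is only a sketch: `ToyZooKeeper`, `open_session`, `create_ephemeral`, `close_session`, and `ls` are hypothetical names made up for illustration and are not the ZooKeeper client API. The point is simply that a node tied to a session vanishes when that session ends.

```python
class ToyZooKeeper:
    """Toy stand-in for ZooKeeper's ephemeral-node behaviour (not the real API)."""

    def __init__(self):
        self._owner_by_path = {}  # znode path -> id of the session that owns it
        self._next_session = 0

    def open_session(self):
        self._next_session += 1
        return self._next_session

    def create_ephemeral(self, session_id, path):
        self._owner_by_path[path] = session_id

    def close_session(self, session_id):
        # Ending a session deletes every ephemeral node that session owns.
        self._owner_by_path = {
            path: owner
            for path, owner in self._owner_by_path.items()
            if owner != session_id
        }

    def ls(self, parent):
        prefix = parent.rstrip("/") + "/"
        return sorted(
            path[len(prefix):] for path in self._owner_by_path if path.startswith(prefix)
        )


zk = ToyZooKeeper()
sessions = {}
for broker_id in (1, 2, 3):
    # Each broker opens its own session and registers under /brokers/ids.
    sessions[broker_id] = zk.open_session()
    zk.create_ephemeral(sessions[broker_id], f"/brokers/ids/{broker_id}")

all_up = zk.ls("/brokers/ids")
zk.close_session(sessions[3])  # broker 3 shuts down, so its session ends
after_shutdown = zk.ls("/brokers/ids")
print(all_up, after_shutdown)
```

In this model, closing the session of broker 3 is the only thing needed to remove it from the listing, which mirrors what the shell output shows.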
  • In a Kafka cluster, one of the brokers serves as the controller, which is responsible for managing the states of partitions and replicas and for performing administrative tasks like reassigning partitions and electing new leaders on broker failure.
  • The controller performs these additional duties while also acting as a regular broker. There is only one controller in a Kafka cluster at any given time.
  • When the Kafka brokers start, the first broker to start becomes the controller by creating an ephemeral controller node in ZooKeeper. The other brokers also try to create this node when they start, but they get an exception because a controller has already been elected.
  • When the controller dies, its ephemeral node disappears and all the other brokers try to create the controller node themselves; only one succeeds, and the others get an exception.
  • To check which broker is the controller, use the ZooKeeper shell. In my 3-node cluster, broker id 1 has been elected as the controller:
get /controller
 {"version":1,"brokerid":1,"timestamp":"1606077046008"}
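If you need the controller id in a script rather than by eye, the payload shown above is plain JSON and can be parsed directly. The sketch below assumes the `timestamp` field is a string holding milliseconds since the Unix epoch; the variable names are only illustrative.

```python
import json
from datetime import datetime, timedelta, timezone

# The payload returned by `get /controller` in the example above.
raw = '{"version":1,"brokerid":1,"timestamp":"1606077046008"}'

info = json.loads(raw)
controller_id = info["brokerid"]

# Assumption: the timestamp is a string of milliseconds since the Unix epoch.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
elected_at = epoch + timedelta(milliseconds=int(info["timestamp"]))

print(f"controller is broker {controller_id}, elected at {elected_at.isoformat()}")
```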
  • To summarize, ZooKeeper maintains the Kafka cluster information, and one of the brokers acts as the controller in addition to its regular broker duties.
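The controller election described above boils down to "whoever creates the controller node first wins". Here is a minimal sketch of that idea using a toy in-memory store. `ToyZooKeeper`, `NodeExists`, and `elect_controller` are hypothetical names invented for this example, not Kafka or ZooKeeper APIs, and the brokers are tried in a fixed order here, whereas in a real cluster the winner depends on timing.

```python
class NodeExists(Exception):
    """Raised when a znode path is already taken (hypothetical name for this toy model)."""


class ToyZooKeeper:
    """Toy in-memory store; only models 'create fails if the node exists'."""

    def __init__(self):
        self._nodes = {}  # znode path -> data

    def create_ephemeral(self, path, data):
        if path in self._nodes:
            raise NodeExists(path)
        self._nodes[path] = data

    def delete(self, path):
        # Stands in for the owner's session ending.
        self._nodes.pop(path, None)

    def get(self, path):
        return self._nodes[path]


def elect_controller(zk, broker_ids):
    """Each broker tries to create /controller; return the id stored there afterwards."""
    for broker_id in broker_ids:
        try:
            zk.create_ephemeral("/controller", broker_id)
        except NodeExists:
            pass  # another broker is already the controller
    return zk.get("/controller")


zk = ToyZooKeeper()
first = elect_controller(zk, [1, 2, 3])   # all three brokers start up
zk.delete("/controller")                  # the controller dies
second = elect_controller(zk, [2, 3])     # the survivors race for the node
print(first, second)
```

Only the first successful create matters; every later attempt fails with the exception, which is the behaviour the remaining brokers see.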

Thanks.
