Key Concepts of Stream Processing

Stream processing is similar to any other data processing: you read the data, apply some transformation, and then push it somewhere. However, there are 3 key concepts that are unique to stream processing. Time: time is the most important concept in stream processing, and therefore it is important to have a common understanding of time …

Kafka Streams

Kafka Streams is a client library (API) for building applications that analyze and process data stored in Kafka in real time. A Streams application takes its input from a Kafka topic and also stores its output in a Kafka topic. A stream is an ordered, replayable, and fault-tolerant sequence of immutable data records, where a data record is defined …

Kafka & Reliability

Apache Kafka guarantees the following with respect to reliability. Kafka does not guarantee ordering of messages between partitions, but it does provide ordering within a partition. If message B was written after message A, using the same producer and to the same partition, then Kafka guarantees that …
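
The per-partition ordering described in this excerpt can be illustrated with a small in-memory model. This is only a sketch and does not use a Kafka client: `ToyTopic`, `partition_for`, the partition count, and the CRC32-based key hashing are all hypothetical stand-ins chosen for illustration, not Kafka's actual partitioner.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for this toy topic


def partition_for(key: str) -> int:
    # Illustrative stand-in for a partitioner: a stable hash of the key.
    # (Assumption for the sketch; not the hashing a real Kafka client uses.)
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


class ToyTopic:
    """In-memory model of a topic: one append-only log per partition."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def send(self, key: str, value: str):
        # Append the record to the log of the partition chosen by the key
        # and return (partition, offset), where offset is its position.
        p = partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1


topic = ToyTopic()
p_a, off_a = topic.send("customer-1", "A")  # message A
topic.send("customer-2", "X")               # unrelated key, may land elsewhere
p_b, off_b = topic.send("customer-1", "B")  # message B, written after A

# Values for customer-1, read back in log order from its partition.
customer_1_values = [v for k, v in topic.partitions[p_a] if k == "customer-1"]
```

In this model, A and B share a key, so they land in the same partition and B receives a higher offset than A. Nothing in the model relates the position of X to A or B unless it happens to hash to the same partition, which mirrors the absence of an ordering guarantee across partitions.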

Kafka Producer API Internals

The Kafka Producer API allows applications to send messages to the Kafka cluster. The producer APIs are simple to use, but this post talks about what goes on under the hood of the producer when we send data. Refer to this code for a working example of sending messages to a Kafka topic named "time-series". Example …

How to consume custom objects from a Kafka topic?

In the last post I shared an example of how to send a custom Java object to Kafka. This post shares details on how to consume that data from a Kafka topic. We will consume the customer object that we sent to the Kafka topic in the last post. Below is a depiction of the use case to …

How to send custom objects to Kafka

This post shares an example of how to send a custom Java object or data to Kafka. Producer use case: read a data file with customer details, map the data to a customer Java object, and send the data to a Kafka topic. To build this use case, we will do the following steps: create a Kafka topic, build a JSON schema …

Kubernetes Vs Docker

One of the common questions that I encounter in many discussions is: should I use Docker or Kubernetes? Both are leading container technologies, but comparing Docker and Kubernetes is wrong, as they are used for different reasons. This post tries to explain what they are used for and why it is …

Kafka Storage Architecture

In this article we will look at how Kafka stores and organizes data. Kafka records are organized and stored in a topic. Producer applications write data to a topic and consumer applications read data from a topic. Topics are similar to folders in a file system, and the messages that are sent by producers …

Apache Kafka & ZooKeeper

Apache ZooKeeper is an open source volunteer project under the Apache Software Foundation. It is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. These services are used in some form or another by distributed applications. In Kafka, too, it acts as a centralized service that manages cluster memberships of …

How to create a consumer group and consume messages from a Multi Node Kafka Cluster in Local?

This example requires running a multi-node Kafka cluster locally. For instructions on how to do this, please refer to my previous article. In this article we will send an insurance file with 36635 rows to Kafka with 3 clusters and have a group of 2 consumers consume this data. You can download the file …