[Tech Simplified] Kafka

Introduction to Kafka

LiveRunGrow
3 min read · Sep 15



Kafka is a distributed system consisting of servers and clients that communicate over a TCP network protocol. It is used for:

  • Publishing (writing) and subscribing to (reading) streams of events from sources like databases, sensors, mobile devices, cloud services, and software applications
  • Storing these event streams durably for a defined period
  • Processing event streams in real time as well as retrospectively
  • Routing the event streams to different destination technologies as needed

Terminology

Event/Message:

  • A single record transmitted through Kafka.

Producers:

  • Sources that produce events or messages to Kafka.

Consumers:

  • Applications that subscribe to and react to the messages.

Consumer groups:

  • Several consumers can be grouped to consume a given topic. Consumers in the same consumer group share the same group ID (the group.id setting). Within a group, each partition of the topic is assigned to only one consumer, which ensures that each message is read by a single consumer in the group.
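
To make the consumer group idea concrete, here is a minimal sketch in plain Python. It does not talk to a real Kafka cluster. The function name assign_partitions and the consumer names are made up for illustration, and the round-robin rule is a simplification of what Kafka's own assignment strategies do. The point it shows: every partition ends up with exactly one owner inside the group, so no message is handled twice by that group.

```python
from collections import defaultdict


def assign_partitions(partitions, consumers):
    """Toy round-robin assignment (hypothetical helper, not a Kafka API).

    Each partition is handed to exactly one consumer in the group.
    """
    assignment = defaultdict(list)
    members = sorted(consumers)
    for i, partition in enumerate(sorted(partitions)):
        # Cycle through the group members so the load is spread evenly.
        assignment[members[i % len(members)]].append(partition)
    return dict(assignment)


# Two consumers sharing the same (assumed) group ID split four partitions.
group = assign_partitions(partitions=[0, 1, 2, 3],
                          consumers=["consumer-a", "consumer-b"])
print(group)
# {'consumer-a': [0, 2], 'consumer-b': [1, 3]}
```

A second consumer group reading the same topic would get its own, independent assignment, which is why different groups can each see every message.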

Topics:

  • Messages are categorised and stored in topics.
  • Think of each topic as a category.
  • A topic can contain messages produced by different producers and can be consumed by different consumer groups.
  • Topics are partitioned: a topic's messages are split into partitions, and those partitions can be located on different Kafka brokers (servers).

Visualisation

Imagine Kafka as a distributed queue.

Kafka is made up of multiple queues, and each queue can be located on a different server. Each queue represents a partition and contains a subset of the messages belonging to a single topic.
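
The queue picture can be sketched in a few lines of plain Python. This is only a toy model under stated assumptions: the ToyTopic class, the topic name, and the keys are invented for illustration, and the CRC32 hash is a stand-in for the hashing that Kafka's real partitioner performs. What it demonstrates is that a topic is a set of queues, and that messages with the same key land in the same queue, in order.

```python
import zlib
from collections import deque


class ToyTopic:
    """In-memory stand-in for a topic: one queue per partition."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [deque() for _ in range(num_partitions)]

    def partition_for(self, key):
        # Stable hash of the key, reduced to a partition index.
        # (CRC32 is an assumption for this sketch, not Kafka's algorithm.)
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def publish(self, key, value):
        index = self.partition_for(key)
        self.partitions[index].append((key, value))
        return index


topic = ToyTopic("events", num_partitions=3)
first = topic.publish("sensor-1", "21.5C")
second = topic.publish("sensor-1", "21.7C")
topic.publish("sensor-2", "19.0C")

# Same key, same partition: the two sensor-1 readings stay in order.
print(first == second)
for index, queue in enumerate(topic.partitions):
    print(index, list(queue))
```

In a real cluster each of those queues could live on a different broker, which is what lets a single topic spread its load across servers.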

A scenario

Imagine many basketball matches are being played at the same time, and we want to publish their results to a dashboard so that fans can view the scores in real time.
