What is Kafka?
This asynchronous API is implemented using the Apache Kafka protocol. Kafka is a fault-tolerant publish-subscribe event platform that provides high-throughput, low-latency handling of real-time data feeds.
Kafka runs as a cluster of one or more servers, called brokers. Load is balanced by spreading the partitions of each topic (see Key concepts) across the brokers in the cluster.
Key concepts
Topic
Messages are stored in named categories called topics. Topics are represented as channels in an AsyncAPI document. Each topic comprises one or more partitions. Each partition is an ordered, append-only sequence of messages. Each message in a partition is given a monotonically increasing number called the offset, which identifies its position within that partition.
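The relationship between a partition and its offsets can be sketched with a small in-memory model. This is only an illustration: the class PartitionLogSketch and its methods are hypothetical and are not part of any Kafka library, and the sketch assumes offsets start at 0 and are contiguous, which real partitions do not always guarantee.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of a single partition (not a Kafka API).
final class PartitionLogSketch {
    // Messages are kept in the order they were appended.
    private final List<String> messages = new ArrayList<>();

    // Appends a message to the end of the partition and returns its offset.
    long append(String message) {
        messages.add(message);
        return messages.size() - 1L;
    }

    // Returns the message stored at the given offset.
    String read(long offset) {
        return messages.get((int) offset);
    }

    // Returns the offset that the next appended message would receive.
    long nextOffset() {
        return messages.size();
    }
}
```

A consumer that remembers the last offset it processed can resume reading from the next one, which is why offsets are tracked per partition rather than per topic.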
Message/Record
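The unit of data in Kafka. A record comprises an optional key, a value, and optional headers. Headers are commonly used for data about the message, the value is the body of the message, and the key, when present, is typically used to decide which partition stores the message.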
Producer
A process that publishes streams of messages to Kafka topics. A producer can publish to one or more topics and can optionally choose the partition that stores the data.
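How a partition might be chosen for a message can be sketched as follows. This is a simplified illustration, not the Kafka client's algorithm: the class PartitionChoiceSketch is hypothetical, it assumes each message has an optional key, and it uses a plain array hash where the real Java client uses its own hashing and handles messages without a key differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper showing two ways a partition can be selected (not a Kafka API).
final class PartitionChoiceSketch {
    // explicitPartition and key may each be null.
    static int choosePartition(Integer explicitPartition, String key, int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        // Case 1: the producer names the partition itself.
        if (explicitPartition != null) {
            if (explicitPartition < 0 || explicitPartition >= partitionCount) {
                throw new IllegalArgumentException("no such partition: " + explicitPartition);
            }
            return explicitPartition;
        }
        // Simplification: messages without a key all go to partition 0 in this sketch.
        if (key == null) {
            return 0;
        }
        // Case 2: derive the partition from the key, so equal keys share a partition.
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return Math.floorMod(hash, partitionCount);
    }
}
```

Deriving the partition from the key keeps all messages with the same key in one partition, so they are read back in the order they were written.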
Consumer
A process that reads messages from Kafka topics and processes them. A consumer can consume from one or more topics or partitions.
Consumer group
A named group of one or more consumers that together consume the messages from a set of topics. Each consumer in the group reads messages only from the partitions assigned to it, and each partition is assigned to exactly one consumer in the group. Consumers join a group by setting the same 'group.id' consumer property; the 'client.id' property is a label that identifies an individual client.
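The rule that each partition belongs to exactly one consumer in a group can be sketched with a simple round-robin assignment. This is an assumption-laden illustration: the class GroupAssignmentSketch and the consumer names are hypothetical, consumer names are assumed to be unique, and real Kafka assignment is performed by the group's configured assignment strategy, which may distribute partitions differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical round-robin assignment of partitions to group members (not a Kafka API).
final class GroupAssignmentSketch {
    static Map<String, List<Integer>> assign(List<String> consumers, int partitionCount) {
        if (consumers.isEmpty()) {
            throw new IllegalArgumentException("a group needs at least one consumer");
        }
        // Start every consumer with an empty list of partitions.
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        // Hand out partitions in turn; each partition gets exactly one owner.
        for (int partition = 0; partition < partitionCount; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }
}
```

With more consumers than partitions, some consumers in this sketch receive an empty list, which mirrors the fact that a group cannot usefully have more active consumers than there are partitions.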
Consuming from this API
You need a Kafka client to connect and subscribe. Kafka client libraries are available for a variety of programming languages. The sample code for each channel is a starting point for consuming events with a Kafka Consumer from the Java Kafka client.
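As a rough sketch of the configuration a Java consumer typically starts from, the following builds the consumer properties using only the standard library. The class ConsumerConfigSketch and its parameters are hypothetical; the property names are standard Kafka consumer settings, but the use of string deserializers is an assumption, and any security settings this API requires are not shown.

```java
import java.util.Properties;

// Hypothetical helper that collects common Kafka consumer settings.
final class ConsumerConfigSketch {
    static Properties consumerProperties(String bootstrapServers, String groupId, String clientId) {
        Properties props = new Properties();
        // Address of the Kafka endpoint to connect to.
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Consumers sharing this value form one consumer group.
        props.setProperty("group.id", groupId);
        // Label identifying this particular client.
        props.setProperty("client.id", clientId);
        // Assumption: keys and values are read as strings.
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }
}
```

With the Apache Kafka Java client on the classpath, properties like these would typically be passed to a KafkaConsumer, which is then subscribed to the topic for the channel and polled in a loop for new messages.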