Kafka
Kafka 101 3 questions
- What is Kafka?
Answer
kafka.apache.org: "Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications."
In other words, Kafka is a kind of distributed log: you can store events in it, read them back, and distribute them to different services, at high scale and in real time.
- What is Kafka used for?
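The "log" idea can be sketched in a few lines of Python. This is only a single-process toy model, not Kafka's API: the class `EventLog`, its methods, and the event names are hypothetical and chosen for illustration. It shows the two properties the answer relies on: events are appended in order, and each reader keeps its own position (offset), so several services can read the same events independently.

```python
class EventLog:
    """Toy append-only log (hypothetical, in-memory, single process)."""

    def __init__(self):
        self._events = []

    def append(self, event):
        """Store an event at the end of the log and return its offset."""
        self._events.append(event)
        return len(self._events) - 1

    def read(self, offset, max_events=10):
        """Return up to max_events events starting at the given offset."""
        return self._events[offset:offset + max_events]


log = EventLog()
log.append("order-created")
log.append("order-paid")

# Two independent readers, each tracking its own offset into the same log.
billing_offset = 0
audit_offset = 0

billing_batch = log.read(billing_offset)
billing_offset += len(billing_batch)

audit_batch = log.read(audit_offset, max_events=1)
audit_offset += len(audit_batch)
```

Reading does not remove anything from the log, which is why the second reader still sees the first event after the first reader has consumed both.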
Answer
- Real-time e-commerce
- Banking
- Health Care
- Automotive (traffic alerts, hazard alerts, ...)
- Real-time Fraud Detection
- What is a "Producer" in Kafka?
Answer
An application that publishes data to the Kafka cluster.
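As a rough illustration, a producer can be as small as the sketch below. It assumes the third-party kafka-python package; the broker address `localhost:9092`, the topic name `orders`, and the helper names `publish_event` and `run_against_local_broker` are hypothetical. The helper only needs an object with a `send(topic, value=...)` method, which `KafkaProducer` provides.

```python
import json


def publish_event(producer, topic, event):
    """Serialize an event as JSON bytes and hand it to the producer."""
    payload = json.dumps(event).encode("utf-8")
    return producer.send(topic, value=payload)


def run_against_local_broker():
    """Example wiring; assumes kafka-python and a broker on localhost:9092."""
    from kafka import KafkaProducer  # imported lazily: optional dependency

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    publish_event(producer, "orders", {"id": 1, "status": "created"})
    producer.flush()  # block until buffered messages have been sent
    producer.close()
```

Calling `run_against_local_broker()` requires a reachable broker; `publish_event` on its own can be exercised with any stand-in object that records calls to `send`.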
Kafka Architecture 2 questions
- What's in a Kafka cluster?
Answer
- Broker: a server running the Kafka process, with its own local storage. A single Kafka cluster usually has multiple brokers.
- What is the role of ZooKeeper in Kafka?
Answer
In ZooKeeper-based Kafka deployments, ZooKeeper is a centralized coordination service that manages metadata for producers, brokers, and consumers. ZooKeeper also:
- Tracks which brokers are part of the Kafka cluster
- Determines which broker is the leader of a given partition and topic
- Performs leader elections
- Manages cluster membership of brokers
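The responsibilities listed above (membership tracking, knowing the leader of each partition, and re-electing a leader when a broker disappears) can be sketched as a toy model. This is neither ZooKeeper's API nor Kafka's actual election logic: the class `ToyCoordinator`, its methods, and the rule "the first live replica becomes leader" are assumptions made for illustration only.

```python
class ToyCoordinator:
    """Toy stand-in for the coordination role (hypothetical, in-memory)."""

    def __init__(self):
        self.live_brokers = set()  # cluster membership
        self.replicas = {}         # (topic, partition) -> ordered broker ids
        self.leaders = {}          # (topic, partition) -> leader id or None

    def register_broker(self, broker_id):
        """Record that a broker has joined the cluster."""
        self.live_brokers.add(broker_id)

    def assign_replicas(self, topic, partition, broker_ids):
        """Record which brokers hold a partition, then pick its leader."""
        key = (topic, partition)
        self.replicas[key] = list(broker_ids)
        self._elect(key)

    def broker_failed(self, broker_id):
        """Drop a broker and re-elect leaders for the partitions it led."""
        self.live_brokers.discard(broker_id)
        for key, leader in list(self.leaders.items()):
            if leader == broker_id:
                self._elect(key)

    def leader_of(self, topic, partition):
        return self.leaders.get((topic, partition))

    def _elect(self, key):
        # Assumed rule: the first replica in the list that is still alive.
        self.leaders[key] = next(
            (b for b in self.replicas[key] if b in self.live_brokers), None
        )


coordinator = ToyCoordinator()
for broker_id in (1, 2, 3):
    coordinator.register_broker(broker_id)
coordinator.assign_replicas("orders", 0, [1, 2, 3])

leader_before = coordinator.leader_of("orders", 0)
coordinator.broker_failed(1)
leader_after = coordinator.leader_of("orders", 0)
```

In this sketch, leadership of the `orders` partition moves from broker 1 to broker 2 once broker 1 is marked as failed, while the membership set shrinks to the remaining brokers.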