Kafka Brief Intro
Fundamental Kafka Theory
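- data sources => topics => partitions => offsets
  - without a key, messages are spread across the topic's partitions; with a key, the hash of the key picks the partition, so messages with the same key always go to the same partition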
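The key-to-partition rule can be sketched roughly as follows. This is only an illustration, not Kafka's actual partitioner: Kafka's default partitioner hashes the serialized key with murmur2, while this sketch substitutes `zlib.crc32` as a stand-in, and `choose_partition` and `NUM_PARTITIONS` are hypothetical names. The point is that the same key always maps to the same partition, while keyless messages are spread out.

```python
import itertools
import zlib

NUM_PARTITIONS = 3  # assumed partition count for the illustration

# Round-robin counter used only for messages that have no key.
_round_robin = itertools.count()


def choose_partition(key, num_partitions=NUM_PARTITIONS):
    """Pick a partition for a message (hypothetical helper, not a Kafka API)."""
    if key is None:
        # No key: spread messages evenly across the partitions.
        return next(_round_robin) % num_partitions
    # Key present: a stable hash means the same key always lands on the
    # same partition. crc32 is a stand-in for Kafka's murmur2 hash.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


if __name__ == "__main__":
    for key in ["user-1", "user-2", "user-1", None, None, None]:
        print(key, "->", choose_partition(key))
```

Because ordering is only guaranteed within a partition, choosing a key (for example a user id) is how a producer keeps related messages in order.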
- brokers and topics
  - a Kafka cluster is composed of multiple brokers (servers)
  - each broker holds some of the topics' partitions
  - connecting to any one broker (a bootstrap broker) gives the client access to the whole cluster
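- topic replication
  - if one broker goes down, another broker holding a replica can serve the data
  - at any time, exactly one broker is the leader for a given partition; the other brokers holding that partition are followers that replicate from the leader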
- producers and message keys
  - producers can choose whether to wait for acknowledgement of data writes (acks=0: don't wait; acks=1: wait for the leader; acks=all: wait for the leader and all in-sync replicas)
  - producers can choose to send a key (an id such as a string or number) with each message
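The three acknowledgement levels can be modelled with a small decision function. This is a simplified sketch, not producer client code: `write_acknowledged` and its parameters are hypothetical names, and real brokers also apply settings such as `min.insync.replicas` that are ignored here.

```python
def write_acknowledged(acks, leader_has_record, in_sync_replicas_have_record):
    """Return True once the producer would treat the write as acknowledged.

    acks is "0", "1" or "all"; in_sync_replicas_have_record is a list of
    booleans, one per in-sync follower replica.
    """
    if acks == "0":
        # Fire and forget: the producer never waits for the broker.
        return True
    if acks == "1":
        # Only the partition leader has to have written the record.
        return leader_has_record
    if acks == "all":
        # The leader and every in-sync replica must have the record.
        return leader_has_record and all(in_sync_replicas_have_record)
    raise ValueError(f"unsupported acks value: {acks!r}")


if __name__ == "__main__":
    # The leader has the record, one follower is still catching up.
    for level in ["0", "1", "all"]:
        print(level, write_acknowledged(level, True, [True, False]))
```

The trade-off is latency against durability: acks=0 is fastest but can lose data silently, while acks=all is slowest but survives the loss of the leader.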
- consumers and consumer groups
  - consumers read data from topics
  - within each partition, data is read in order, from the lowest offset to the highest
  - each consumer within a group reads from its own exclusive set of partitions
  - if a group has more consumers than partitions, the extra consumers are inactive
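A minimal sketch of how partitions might be divided among the consumers of one group. It uses a plain round-robin split for illustration; Kafka's real assignors (range, round-robin, sticky) are more involved, and `assign_partitions` is a hypothetical helper, not a Kafka API.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer in the group."""
    if not consumers:
        return {}
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        # Deal partitions out one by one, like cards.
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    partitions = [0, 1, 2]
    # Two consumers share three partitions.
    print(assign_partitions(partitions, ["c1", "c2"]))
    # Four consumers: one of them gets nothing and sits idle.
    print(assign_partitions(partitions, ["c1", "c2", "c3", "c4"]))
```

With three partitions and four consumers, `c4` ends up with an empty list, which is the "inactive consumer" case described above.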
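- consumer offsets and delivery semantics
  - Kafka stores the offsets up to which each consumer group has read (the committed offsets, kept in the internal topic __consumer_offsets)
  - so if a consumer dies, it can resume reading from where it left off (the committed offset)
  - delivery semantics for consumers depend on when offsets are committed: at most once (commit before processing), at least once (commit after processing), or exactly once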
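The effect of commit timing can be shown with a toy simulation of one partition. This is a sketch under simplifying assumptions: the log is a Python list, `consume` and `crash_at` are hypothetical names, and the crash is placed exactly between the commit and processing steps. Committing before processing gives at-most-once behaviour (a record can be lost), while committing after processing gives at-least-once behaviour (a record can be seen twice).

```python
def consume(log, committed, commit_before_processing, crash_at=None):
    """Read log from the committed offset; return (processed, committed).

    crash_at is the offset at which the consumer dies between its
    commit step and its processing step (whichever order is in use).
    """
    processed = []
    offset = committed
    while offset < len(log):
        if commit_before_processing:
            committed = offset + 1            # commit first ...
            if offset == crash_at:
                return processed, committed   # ... crash: record never processed
            processed.append(log[offset])
        else:
            processed.append(log[offset])     # process first ...
            if offset == crash_at:
                return processed, committed   # ... crash: offset never committed
            committed = offset + 1
        offset += 1
    return processed, committed


if __name__ == "__main__":
    log = ["a", "b", "c", "d"]
    for commit_first in (True, False):
        first_run, committed = consume(log, 0, commit_first, crash_at=2)
        second_run, _ = consume(log, committed, commit_first)  # restart
        print("commit first" if commit_first else "process first",
              first_run + second_run)
```

In the commit-first run the record at offset 2 is skipped after the restart; in the process-first run it is processed twice, so at-least-once consumers usually need idempotent processing.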
- ZooKeeper manages the brokers and helps perform leader election for partitions
- a ZooKeeper ensemble has one leader (handles writes) and the rest of the servers are followers (handle reads); Kafka's metadata is stored in ZooKeeper
kafka-topics --zookeeper 0.0.0.0:2181 --topic first_topic --create --partitions 3 --replication-factor 1
kafka-topics --zookeeper 0.0.0.0:2181 --list
kafka-topics --zookeeper 0.0.0.0:2181 --topic first_topic --describe
kafka-topics --zookeeper 0.0.0.0:2181 --topic second_topic --delete
kafka-console-producer --broker-list 0.0.0.0:9092 --topic first_topic
kafka-console-producer --broker-list 0.0.0.0:9092 --topic first_topic --producer-property acks=all
kafka-console-consumer --bootstrap-server 0.0.0.0:9092 --topic first_topic
kafka-console-consumer --bootstrap-server 0.0.0.0:9092 --topic first_topic --from-beginning
kafka-console-consumer --bootstrap-server 0.0.0.0:9092 --topic first_topic --group my-first-application
kafka-consumer-groups --bootstrap-server localhost:9092 --list
kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group my-first-application
kafka-consumer-groups --bootstrap-server localhost:9092 --group my-first-application --reset-offsets --to-earliest --execute --topic first_topic
kafka-consumer-groups --bootstrap-server localhost:9092 --group my-first-application --reset-offsets --shift-by -2 --execute --topic first_topic
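To make the relative reset concrete, here is a small sketch of what a shift of -2 does to a group's committed offsets, one partition at a time. It assumes the shifted value is clamped to the range of offsets that still exist in the partition; `shift_offsets` and the sample numbers are hypothetical, not output from a real cluster.

```python
def shift_offsets(current, shift_by, earliest, latest):
    """Return the new committed offset per partition after a relative shift.

    current, earliest and latest are dicts keyed by partition number.
    """
    new_offsets = {}
    for partition, offset in current.items():
        target = offset + shift_by
        # Keep the result inside the offsets that still exist.
        new_offsets[partition] = max(earliest[partition],
                                     min(latest[partition], target))
    return new_offsets


if __name__ == "__main__":
    current = {0: 10, 1: 1, 2: 5}
    earliest = {0: 0, 1: 0, 2: 0}
    latest = {0: 10, 1: 4, 2: 5}
    print(shift_offsets(current, -2, earliest, latest))
```

With these sample values the result is `{0: 8, 1: 0, 2: 3}`: partition 1 cannot go below its earliest offset, so it stops at 0. After the reset, the group re-reads the last two messages of each partition where they exist.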
The CLI has many options, but here are the others that are most commonly used:
- Producer with keys
kafka-console-producer --broker-list 127.0.0.1:9092 --topic first_topic --property parse.key=true --property key.separator=,
> key,value
> another key,another value
- Consumer with keys
kafka-console-consumer --bootstrap-server 127.0.0.1:9092 --topic first_topic --from-beginning --property print.key=true --property key.separator=,
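The key handling in the two commands above can be sketched as a pair of small functions. This is an approximation of the console tools' behaviour, assuming the producer splits each input line at the first occurrence of the separator and rejects lines without one; `parse_line` and `format_record` are hypothetical names.

```python
def parse_line(line, separator=","):
    """Split a producer input line into (key, value) at the first separator."""
    key, found, value = line.partition(separator)
    if not found:
        # Assumed behaviour: a line without a separator has no key.
        raise ValueError(f"no key separator found in line: {line!r}")
    return key, value


def format_record(key, value, separator=","):
    """Render a record the way a key-printing consumer would show it."""
    return f"{key}{separator}{value}"


if __name__ == "__main__":
    for line in ["key,value", "another key,another value"]:
        key, value = parse_line(line)
        print(format_record(key, value))
```

Because only the first separator is significant in this sketch, a value may itself contain commas, but a key may not.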