Java Apache Kafka Study

Posted by huweiyi on Monday, July 26, 2021

Kafka Brief Intro

Fundamental Kafka Theory

Notation: "=>" means "is sent to".

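  1. data sources => topics => partitions => offsets
    • without a key, the producer chooses the partition itself (round-robin or sticky batching, depending on the client version); with a key, all messages with the same key go to the same partition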
  2. brokers and topics
    • a Kafka cluster is composed of multiple brokers (servers)
    • each broker holds some of the topics' partitions
    • connecting to any one broker (a bootstrap broker) connects the client to the whole cluster
  3. topic replication
    • if one broker is down, another broker holding a replica can serve the data.
    • at any time, only one broker is the leader for a given partition; the other replicas are followers that sync from it.
  4. producers and message keys
    • producers can choose to receive acknowledgement of data writes (acks=0: don't wait; acks=1: wait for the leader; acks=all: wait for the leader and the in-sync replicas)
    • producers can choose to send a key (string, number, etc.) with each message
  5. consumers and consumer groups
    • consumers read data from topics
    • data is read in offset order within each partition, from the lowest offset to the latest.
    • each consumer within a group reads from an exclusive set of partitions.
    • if a group has more consumers than partitions, some consumers will be inactive.
  6. consumer offsets & delivery semantics
    • Kafka stores the offsets up to which each consumer group has read (the committed offsets, kept in the internal topic __consumer_offsets).
    • so if a consumer dies, it can resume reading from where it left off (the committed offset).
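    • delivery semantics for consumers depend on when offsets are committed: at most once (committed on receipt, before processing), at least once (committed after processing, so messages may be reprocessed), or exactly once (achievable for Kafka-to-Kafka workflows using transactions).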
  7. ZooKeeper
    • ZooKeeper manages the brokers and helps perform leader election for partitions.
    • a ZooKeeper ensemble has one leader (handles writes) and the rest of the servers are followers (handle reads); Kafka keeps its metadata in ZooKeeper.
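
Two of the ideas above (same key, same partition; exclusive partitions per consumer in a group) can be sketched in plain Java without a running cluster. This is only an illustration under stated assumptions: the class and method names are hypothetical, String.hashCode() stands in for the murmur2 hash that the real default partitioner applies to the serialized key, and the round-robin assignment is a simplification of Kafka's actual assignors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, stdlib-only model of two Kafka concepts; not the Kafka client API.
public class KafkaTheorySketch {

    // Same key always maps to the same partition.
    // Assumption: String.hashCode() replaces Kafka's murmur2 hash for simplicity.
    public static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Hand out partitions to the consumers of one group, round-robin.
    // Each partition gets exactly one owner; extra consumers get nothing.
    public static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> assignment = new TreeMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return assignment; // nobody to assign to
        }
        for (int p = 0; p < numPartitions; p++) {
            String owner = consumers.get(p % consumers.size());
            assignment.get(owner).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // The same key lands on the same partition every time.
        System.out.println(partitionFor("user-42", 3) == partitionFor("user-42", 3)); // true

        // Two consumers share three partitions.
        System.out.println(assign(Arrays.asList("c1", "c2"), 3)); // {c1=[0, 2], c2=[1]}

        // Four consumers, three partitions: c4 is inactive.
        System.out.println(assign(Arrays.asList("c1", "c2", "c3", "c4"), 3)); // {c1=[0], c2=[1], c3=[2], c4=[]}
    }
}
```

The last line of output shows why adding consumers beyond the partition count does not add throughput: the fourth consumer owns no partition.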

CLI

  • zookeeper-server-start config/zookeeper.properties

  • kafka-server-start config/server.properties

  • kafka-topics --zookeeper 0.0.0.0:2181 --topic first_topic --create --partitions 3 --replication-factor 1

  • kafka-topics --zookeeper 0.0.0.0:2181 --list

  • kafka-topics --zookeeper 0.0.0.0:2181 --topic first_topic --describe

  • kafka-topics --zookeeper 0.0.0.0:2181 --topic second_topic --delete

  • kafka-console-producer --broker-list 0.0.0.0:9092 --topic first_topic

  • kafka-console-producer --broker-list 0.0.0.0:9092 --topic first_topic --producer-property acks=all

  • kafka-console-consumer --bootstrap-server 0.0.0.0:9092 --topic first_topic

  • kafka-console-consumer --bootstrap-server 0.0.0.0:9092 --topic first_topic --from-beginning

  • kafka-console-consumer --bootstrap-server 0.0.0.0:9092 --topic first_topic --group my-first-application

  • kafka-consumer-groups --bootstrap-server localhost:9092 --list

  • kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group my-first-application

  • kafka-consumer-groups --bootstrap-server localhost:9092 --group my-first-application --reset-offsets --to-earliest --execute --topic first_topic

  • kafka-consumer-groups --bootstrap-server localhost:9092 --group my-first-application --reset-offsets --shift-by -2 --execute --topic first_topic

  • The CLI has many options; here are some other commonly used ones:

    • Producer with keys
        kafka-console-producer --broker-list 127.0.0.1:9092 --topic first_topic --property parse.key=true --property key.separator=,
        "> key,value"
        "> another key,another value"
    
    • Consumer with keys
        kafka-console-consumer --bootstrap-server 127.0.0.1:9092 --topic first_topic --from-beginning --property print.key=true --property key.separator=,
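
To make two of the commands above more concrete, here is a small stdlib-only Java sketch of what they do to the data: how a "key,value" line is split when parse.key=true and key.separator=, are set, and what --shift-by -2 does to a committed offset. The class and method names are hypothetical, and the behaviour modelled here (split on the first separator; clamp the shifted offset to the partition's earliest and latest offsets) reflects my understanding of the tools rather than their actual source.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical model of two CLI behaviours; not part of Kafka itself.
public class CliSemanticsSketch {

    // Split a console-producer input line into key and value at the FIRST separator.
    // Assumption: a line without the separator is rejected.
    public static Map.Entry<String, String> parseKeyed(String line, String separator) {
        int idx = line.indexOf(separator);
        if (idx < 0) {
            throw new IllegalArgumentException("No key found in line: " + line);
        }
        String key = line.substring(0, idx);
        String value = line.substring(idx + separator.length());
        return new SimpleEntry<>(key, value);
    }

    // Model of --reset-offsets --shift-by N for one partition.
    // Assumption: the result is kept within [earliest, latest].
    public static long shiftOffset(long committed, long shiftBy, long earliest, long latest) {
        long target = committed + shiftBy;
        return Math.max(earliest, Math.min(latest, target));
    }

    public static void main(String[] args) {
        Map.Entry<String, String> record = parseKeyed("another key,another value", ",");
        System.out.println(record.getKey() + " -> " + record.getValue()); // another key -> another value

        System.out.println(shiftOffset(10, -2, 0, 10)); // 8: the group re-reads the last two messages
        System.out.println(shiftOffset(1, -2, 0, 10));  // 0: cannot go below the earliest offset
    }
}
```

With three partitions, --shift-by -2 moves each partition's committed offset back by two, so the group would re-read up to six messages in total.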
    

Java Kafka 101

Projects

Elastic-Search with Twitter source