Install Kafka:
curl https://archive.apache.org/dist/kafka/2.8.0/kafka_2.12-2.8.0.tgz -o kafka_2.12-2.8.0.tgz
Extract:
tar xzf kafka_2.12-2.8.0.tgz
Go to Kafka folder:
cd kafka_2.12-2.8.0
Start ZooKeeper:
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
Start Kafka:
bin/kafka-server-start.sh -daemon config/server.properties
Create the input topic (input-topic):
bin/kafka-topics.sh --create --topic input-topic --partitions 1 --replication-factor 1 --bootstrap-server localhost:9092
Create the output topic (output-topic):
bin/kafka-topics.sh --create --topic output-topic --partitions 1 --replication-factor 1 --bootstrap-server localhost:9092
Check that everything is OK by listing the topics:
bin/kafka-topics.sh --bootstrap-server localhost:9092 --list
Create a folder for your Java project:
cd ..
mkdir javakafkaproject && cd javakafkaproject
Download project files:
mkdir -p src/main/java
curl https://joelpintomata.com/tutorials/a-to-z-kafka-example-java-code.java -o src/main/java/KafkaStreamsDemo.java
curl https://joelpintomata.com/tutorials/a-to-z-kafka-example-pom.xml -o ./pom.xml
Build the project:
mvn clean package
Run the application:
java -jar target/kafka-streams-demo-1.0-SNAPSHOT-jar-with-dependencies.jar
In a second terminal, start a Kafka producer:
cd kafka_2.12-2.8.0
bin/kafka-console-producer.sh --bootstrap-server localhost:9092 \
--topic input-topic \
--property key.separator=":" \
--property parse.key=true
In the above command we set the key separator to ":". Without it, Kafka defaults to a tab (\t).
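To make the key/value split concrete, here is a minimal plain-Java sketch of how a line such as bananas:5 can be divided into a key and a value. The class KeySeparatorSketch and its parse method are hypothetical names used only for illustration, not Kafka's own code, and splitting at the first occurrence of the separator is an assumption about how the console producer behaves.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical helper, not part of Kafka: mimics splitting a console line into key and value.
class KeySeparatorSketch {

    // Splits at the first occurrence of the separator (an assumption for this sketch);
    // everything before it is the key, everything after it is the value.
    static Map.Entry<String, String> parse(String line, String separator) {
        int idx = line.indexOf(separator);
        if (idx < 0) {
            // This sketch simply rejects lines that contain no separator.
            throw new IllegalArgumentException("No key separator found in: " + line);
        }
        String key = line.substring(0, idx);
        String value = line.substring(idx + separator.length());
        return new SimpleEntry<>(key, value);
    }

    public static void main(String[] args) {
        Map.Entry<String, String> record = parse("bananas:5", ":");
        System.out.println("key=" + record.getKey() + " value=" + record.getValue());
    }
}
```

With the ":" separator, the line bananas:5 yields the key bananas and the value 5, which is the shape of record the application reads from input-topic.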
In a third terminal, from the Kafka folder, start a Kafka consumer:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
--topic output-topic \
--property print.key=true \
--property key.separator=" total is " \
--property key.deserializer=org.apache.kafka.common.serialization.StringDeserializer \
--property value.deserializer=org.apache.kafka.common.serialization.LongDeserializer
How does this all work together? In the Producer terminal send the following:
bananas:5
oranges:10
bananas:1
In the Consumer terminal you will get:
bananas total is 6
oranges total is 10
In the Producer terminal send the following:
pears:20
oranges:10
In the Consumer terminal you will get:
pears total is 20
oranges total is 20
We can see that the Kafka Streams application keeps a running total per key: each new value is added to the sum already stored for that key (here, the quantity per fruit).
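The behaviour seen in the two terminals can be sketched in plain Java, with no Kafka classes involved. This is not the downloaded KafkaStreamsDemo.java, whose contents are not shown here. RunningTotalSketch is a hypothetical class that only imitates the per-key sum, and emitting one line per key touched in each batch is a simplification chosen to match the output above. How often the real application emits intermediate totals depends on its configuration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java simulation of a per-key running sum (hypothetical, for illustration only).
class RunningTotalSketch {
    // Running total per key; plays the role of the application's state.
    private final Map<String, Long> totals = new HashMap<>();

    // Applies a batch of "key:value" lines and returns one output line per key
    // touched in this batch, carrying that key's latest total.
    List<String> process(List<String> lines) {
        Set<String> touched = new LinkedHashSet<>();
        for (String line : lines) {
            int idx = line.indexOf(':');
            String key = line.substring(0, idx);
            long amount = Long.parseLong(line.substring(idx + 1).trim());
            totals.merge(key, amount, Long::sum); // add to the existing sum, or start a new one
            touched.add(key);
        }
        List<String> out = new ArrayList<>();
        for (String key : touched) {
            out.add(key + " total is " + totals.get(key));
        }
        return out;
    }

    public static void main(String[] args) {
        RunningTotalSketch app = new RunningTotalSketch();
        app.process(List.of("bananas:5", "oranges:10", "bananas:1")).forEach(System.out::println);
        app.process(List.of("pears:20", "oranges:10")).forEach(System.out::println);
    }
}
```

Running it prints the same four lines as the consumer terminal: bananas total is 6, oranges total is 10, pears total is 20, oranges total is 20. The second oranges line is 20 because the total of 10 from the first batch is kept between batches.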
Kafka Streams architecture example
Credits to Introduction to Kafka Streams - Akash
Key points:
State is backed by a compacted change log topic to speed up recoveries.
Change log topics are compacted so that there is a unique value per key, meaning each key holds only its last value.
Change logs back state that one can query via API.
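The "one value per key" idea can be illustrated with a small plain-Java sketch. CompactionSketch is a hypothetical class that shows only the end result of compaction on a list of records. The real broker compacts log segments in the background and this sketch does not model that process.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the end result of log compaction; not Kafka's implementation.
class CompactionSketch {

    // Walks the log from oldest to newest and keeps only the most recent value for each key.
    static Map<String, Long> compact(List<Map.Entry<String, Long>> log) {
        Map<String, Long> latest = new LinkedHashMap<>();
        for (Map.Entry<String, Long> record : log) {
            latest.put(record.getKey(), record.getValue()); // newer value overwrites the older one
        }
        return latest;
    }

    public static void main(String[] args) {
        // Successive totals as they might be written to a change log for the fruit example.
        List<Map.Entry<String, Long>> changelog = List.of(
                Map.entry("bananas", 5L),
                Map.entry("oranges", 10L),
                Map.entry("bananas", 6L),
                Map.entry("pears", 20L),
                Map.entry("oranges", 20L));
        System.out.println(compact(changelog));
    }
}
```

Five records shrink to three entries, one per fruit, each holding its last total. Replaying three records instead of five is why a compacted change log makes recovery faster.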