
Kafka Streams and How They Work

Posted on October 29, 2019 by Craig Parker

Kafka Streams seems like a daunting subject to many learners, but it doesn't have to be. Just think of a stream as a sequence of events. In fact, when I put together information for this blog post, I joked that getting all this data would be like drinking from a waterfall. Chad (the Training Architect who created our new Kafka course) was able to take it a step further, and we went off on a tangent. This will help to explain it:

Understanding Kafka Streams

When envisioning a Kafka cluster, start with the data being a river and waterfall. The water comes down and falls into a lake (Kafka itself), and now you can do different things with it. Dig irrigation ditches to send some data off in different directions. Treat the water in streams, or don’t, depending on what its end use is. Make it potable in one (for drinking), filter the particles out of it in another (for bathing), or leave it alone if people just want to water their gardens with it.

Kafka streams and topics are like irrigation ditches and water treatment processes. Does that make sense?

In the real world, Kafka streams can be:

  • Credit card transactions
  • Stock trades
  • Package delivery information
  • Network events

There are plenty more examples, because Kafka streams can carry just about anything. They don't rely on any kind of external framework, which means developers can consume, process, and produce events in their own apps without having to worry about a strict set of framework rules for accessing them. Let's dig into a real-world example, one you're probably familiar with, and see how things work in practice.
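To make that consume → process → produce loop concrete, here's a minimal sketch in plain Python. No Kafka libraries are involved; the "topic" is just a list of dictionaries, and the event shapes and field names are invented for illustration. It mirrors the water-treatment analogy: events flow in, get "treated," and flow out.

```python
# A tiny stand-in for a Kafka topic: an ordered list of events.
# In real Kafka you'd use consumer/producer clients; here we simulate
# the consume -> process -> produce flow with plain Python.

raw_events = [
    {"type": "card_txn", "amount": 42.50},
    {"type": "stock_trade", "symbol": "ABC", "qty": 100},
    {"type": "card_txn", "amount": 9.99},
]

def process(event):
    """'Treat' an event, like treating water in the analogy."""
    if event["type"] == "card_txn":
        # Enrich card transactions with a simple (made-up) flag.
        return {**event, "suspicious": event["amount"] > 40}
    return event  # pass other event types through untouched

# Consume from the input "topic", produce to an output "topic".
processed_topic = [process(e) for e in raw_events]

for e in processed_topic:
    print(e)
```

The point isn't the fraud logic (which is a toy); it's that the processor is ordinary application code sitting between an input stream and an output stream.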

Real-World Example

Guess what? Netflix uses Kafka. They get raw data (the river and the waterfall) from you (what you’ve watched) and send it into Kafka. Then that data is queried or manipulated somehow (treated, like water in our little analogy), and what comes out is what you see as “watch this next.” Each time you press a button on your Roku remote, that’s a piece of data (an event, in Kafka terminology) getting dumped into the river.
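As a rough sketch of what that pipeline might look like, here's some plain Python: each button press becomes an event, and a downstream processor keeps per-user state to decide what comes next. The event shapes and the "watch next" rule are made up for illustration; Netflix's real system is far more involved.

```python
from collections import defaultdict

# Each remote-button press becomes an event dumped into the "river".
# These event shapes are invented for illustration.
watch_events = [
    {"user": "craig", "show": "The Walking Dead", "episode": 3},
    {"user": "craig", "show": "The Walking Dead", "episode": 4},
    {"user": "dana",  "show": "Some Docuseries", "episode": 1},
]

# A downstream processor maintains state: the latest episode each
# user has watched of each show.
last_watched = defaultdict(dict)
for event in watch_events:
    last_watched[event["user"]][event["show"]] = event["episode"]

# "Watch this next" = resume at the episode after the last one seen.
def watch_next(user, show):
    return last_watched[user].get(show, 0) + 1

print(watch_next("craig", "The Walking Dead"))  # episode 5
```

Notice that the recommendation falls out of replaying the event stream into a small piece of state, which is exactly the kind of job stream processing is built for.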

A little side note here: I've got a genuine Kafka Streams question if I ever run into someone at Netflix. For some reason, the setup in my bedroom doesn't remember which episode of The Walking Dead I last watched out in the living room…


Want to Learn Kafka Streams?

Kafka is increasingly taking on jobs that used to belong to databases in a lot of big data situations. If you're currently involved in your organization's backend data infrastructure, you need to check out Apache Kafka Deep Dive. If you're thinking of getting into big data, then you need to check out Chad's course.

Be sure to subscribe to be notified when we release even more information on the latest technologies and courses available.
