Working with KSQL Streams

Hands-On Lab

 

Will Boyd

DevOps Team Lead in Content

Length

00:30:00

Difficulty

Intermediate

Introduction

KSQL provides a SQL-like interface to Kafka's stream processing features. With KSQL, you can build data processing pipelines without writing your own Kafka Streams applications. In this lab, we will solve a simple data processing use case using KSQL. We will create a stream from an existing topic, then use a persistent streaming query to write the data in processed form to an output topic.

Solution

Log in to the lab server using the credentials provided on the hands-on lab page:

ssh cloud_user@PUBLIC_IP_ADDRESS

Create a Stream to Pull Data in from the Topic

  1. Start a KSQL session:

    sudo ksql
  2. Set auto.offset.reset to earliest so that queries read the topic from the beginning rather than only new records:

    SET 'auto.offset.reset' = 'earliest';
  3. Look at the data in the member_signups topic (press Ctrl+C to stop printing):

    PRINT 'member_signups' FROM BEGINNING;
  4. Create a stream from the topic:

    CREATE STREAM member_signups
      (firstname VARCHAR,
        lastname VARCHAR,
        email_notifications BOOLEAN)
      WITH (KAFKA_TOPIC='member_signups',
        VALUE_FORMAT='DELIMITED');
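
Before moving on, it can help to confirm that the stream was registered with the expected columns. The statements below are a minimal sketch to run in the same KSQL session. SHOW STREAMS and DESCRIBE are standard KSQL statements. The SELECT assumes the older KSQL CLI used in this lab and that the topic already contains records; on newer ksqlDB releases, a continuous query like this needs EMIT CHANGES before LIMIT.

```sql
-- List registered streams; MEMBER_SIGNUPS should appear in the output.
SHOW STREAMS;

-- Show the column names and types declared for the new stream.
DESCRIBE member_signups;

-- Sample a few rows and then stop (assumes the topic already holds data).
SELECT firstname, lastname, email_notifications
  FROM member_signups
  LIMIT 5;
```

If the sampled columns come back as null, the declared column order or types probably do not match the comma-delimited values in the topic.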

Create a Persistent Streaming Query to Write Data to the Output Topic in Real Time

  1. Create the persistent streaming query:

    CREATE STREAM member_signups_email AS
      SELECT * FROM member_signups WHERE email_notifications=true;
  2. View the data in the output topic to verify that everything is working. By default, KSQL names the output topic after the new stream, in uppercase:

    PRINT 'MEMBER_SIGNUPS_EMAIL' FROM BEGINNING;
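
To check that the persistent query is still running in the background, the following sketch may be useful. SHOW QUERIES is a standard KSQL statement. The DESCRIBE EXTENDED form shown here assumes the older KSQL syntax used in this lab; newer ksqlDB releases place the EXTENDED keyword after the stream name.

```sql
-- List running persistent queries; the CREATE STREAM ... AS SELECT
-- statement from step 1 should be listed with its query ID.
SHOW QUERIES;

-- Show details for the derived stream, including its backing Kafka topic
-- and the query that writes to it.
DESCRIBE EXTENDED member_signups_email;
```

Unlike the PRINT command, the persistent query keeps running after you exit the KSQL session, so new records with email_notifications set to true continue to flow into the output topic.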

Conclusion

Congratulations on successfully completing this hands-on lab!