Apache Kafka®, Kafka Streams, and ksqlDB to demonstrate real use cases -2 (Primitive keys and values)

How to use the console consumer to read non-string primitive keys and values

Question:

How do I specify key and value deserializers when running the Kafka console consumer?

Example use case:

You want to inspect/debug records written to a topic. Each record key and value is a long and double, respectively. In this tutorial you’ll learn how to specify key and value deserializers with the console consumer.

1
Initialize the project

To get started, make a new directory anywhere you’d like for this project:

mkdir console-consumer-primitive-keys-values && cd console-consumer-primitive-keys-values
2
Get Confluent Platform

Next, create the following docker-compose.yml file to obtain Confluent Platform.

---
version: '2'

services:
  zookeeper:
    image: confluentinc/cp-zookeeper:6.1.0
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000

  broker:
    image: confluentinc/cp-kafka:6.1.0
    hostname: broker
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - "29092:29092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0

  schema-registry:
    image: confluentinc/cp-schema-registry:6.1.0
    hostname: schema-registry
    container_name: schema-registry
    depends_on:
      - broker
    ports:
      - "8081:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: 'broker:9092'

  ksqldb-server:
    image: confluentinc/ksqldb-server:0.8.1
    hostname: ksqldb
    container_name: ksqldb
    depends_on:
      - broker
    ports:
      - "8088:8088"
    environment:
      KSQL_LISTENERS: http://0.0.0.0:8088
      KSQL_BOOTSTRAP_SERVERS: broker:9092
      KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true"
      KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true"
      KSQL_KSQL_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      KSQL_KSQL_HIDDEN_TOPICS: '^_.*'
      # Setting KSQL_KSQL_CONNECT_WORKER_CONFIG enables embedded Kafka Connect
      KSQL_KSQL_CONNECT_WORKER_CONFIG: "/connect/connect.properties"
      # Kafka Connect config below
      KSQL_CONNECT_BOOTSTRAP_SERVERS: "broker:9092"
      KSQL_CONNECT_REST_ADVERTISED_HOST_NAME: 'ksqldb'
      KSQL_CONNECT_REST_PORT: 8083
      KSQL_CONNECT_GROUP_ID: ksqldb-kafka-connect-group-01
      KSQL_CONNECT_CONFIG_STORAGE_TOPIC: _ksqldb-kafka-connect-group-01-configs
      KSQL_CONNECT_OFFSET_STORAGE_TOPIC: _ksqldb-kafka-connect-group-01-offsets
      KSQL_CONNECT_STATUS_STORAGE_TOPIC: _ksqldb-kafka-connect-group-01-status
      KSQL_CONNECT_KEY_CONVERTER: org.apache.kafka.connect.converters.LongConverter
      KSQL_CONNECT_VALUE_CONVERTER: org.apache.kafka.connect.converters.DoubleConverter
      KSQL_CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: '1'
      KSQL_CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: '1'
      KSQL_CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: '1'
      KSQL_CONNECT_LOG4J_APPENDER_STDOUT_LAYOUT_CONVERSIONPATTERN: "[%d] %p %X{connector.context}%m (%c:%L)%n"
      KSQL_CONNECT_PLUGIN_PATH: '/usr/share/java,/usr/share/confluent-hub-components/,/data/connect-jars'
    command:
      # In the command section, $ are replaced with $$ to avoid the error 'Invalid interpolation format for "command" option'
      - bash
      - -c
      - |
        echo "Installing connector plugins"
        mkdir -p /usr/share/confluent-hub-components/
        confluent-hub install --no-prompt --component-dir /usr/share/confluent-hub-components/ mdrogalis/voluble:0.3.0
        #
        echo "Launching ksqlDB"
        /usr/bin/docker/run &

        echo "Waiting for Kafka Connect to start listening on localhost ⏳"
        while : ; do
          curl_status=$$(curl -s -o /dev/null -w %{http_code} http://localhost:8083/connectors)
          echo -e $$(date) " Kafka Connect listener HTTP state: " $$curl_status " (waiting for 200)"
          if [ $$curl_status -eq 200 ] ; then
            break
          fi
          sleep 5
        done

        echo -e "\n--\n+> Creating Data Generator source"
        curl -X PUT http://localhost:8083/connectors/example/config \
         -i -H "Content-Type: application/json" -d'{
            "connector.class": "io.mdrogalis.voluble.VolubleSourceConnector",
            "genkp.example.with" : "#{Number.randomNumber}",
            "genvp.example.with" : "#{Address.latitude}",
            "topic.example.records.exactly" : 10,
            "transforms": "CastLong,CastDouble",
            "transforms.CastLong.type": "org.apache.kafka.connect.transforms.Cast$$Key",
            "transforms.CastLong.spec": "int64",
            "transforms.CastDouble.type": "org.apache.kafka.connect.transforms.Cast$$Value",
            "transforms.CastDouble.spec": "float64",
            "key.converter": "org.apache.kafka.connect.converters.LongConverter",
            "key.converter.schemas.enable" : "false",
            "value.converter": "org.apache.kafka.connect.converters.DoubleConverter",
            "value.converter.schemas.enable" : "false",
            "tasks.max": 1
        }'

        sleep infinity

Currently, the console producer only writes strings into Kafka, but we want to work with non-string primitives and the console consumer. So in this tutorial, your docker-compose.yml file will also create a source connector embedded in ksqldb-server to populate a topic with keys of type long and values of type double.

And launch it by running:

docker-compose up -d

After you’ve run the docker-compose up -d command, wait 30 seconds to 1 minute before executing the next step.
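If you would rather not guess at the wait time, one option is to poll until the example topic shows up. The following is only a sketch: list_topics and wait_for_topic are hypothetical helper names, and it assumes you run it on the host from the project directory. Note that the topic existing does not guarantee that all 10 records have been written yet.

```shell
# Print one topic name per line by asking the broker container.
# -T disables pseudo-TTY allocation so the output can be piped.
list_topics() {
  docker-compose exec -T broker kafka-topics --bootstrap-server broker:9092 --list
}

# Poll until the topic named in $1 exists.
# $2 = maximum number of attempts (default 12), $3 = seconds between attempts (default 5).
wait_for_topic() {
  topic=$1
  attempts=${2:-12}
  delay=${3:-5}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if list_topics 2>/dev/null | grep -qx "$topic"; then
      echo "topic '$topic' is ready"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up waiting for topic '$topic'" >&2
  return 1
}

# Usage (not run here): wait_for_topic example
```

With the defaults this checks every 5 seconds for up to a minute, which matches the wait suggested above.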

3
Start an initial console consumer

Now you’ll use a topic created in the previous step. Your focus here is reading values on the command line with the console consumer. The records have the format of key = Long and value = Double.

First let’s open a new terminal window and start a shell in the broker container:

docker-compose exec broker bash

Now let’s start up a console consumer to read some records. Run this command in the container shell:

kafka-console-consumer --topic example --bootstrap-server broker:9092 \
 --from-beginning \
 --property print.key=true \
 --property key.separator=" : "

After the consumer starts up, you’ll get some output, but nothing readable is on the screen. You should see something similar to this:

!? : @'?u_?mYJ? : ?(?,????c : @T??????? : @S{??ދ?? : @F!?u??? : ??{??%??#f : @S???A : ?T5Ni?^? : ?κ?e : @>ֈ&??? 

The output looks like this because you are consuming records with a Long key and a Double value, but you haven’t provided the correct deserializer for longs or doubles.
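To see why the bytes are unreadable, it can help to look at what a long looks like on the wire. Kafka's LongSerializer writes a long as 8 big-endian bytes. The sketch below prints those bytes in hex for a non-negative number; long_to_hex is a hypothetical helper for illustration, not a Kafka tool.

```shell
# Print the 8 big-endian bytes of a non-negative long as 16 hex digits.
# long_to_hex is a hypothetical helper, not part of Kafka.
long_to_hex() {
  printf '%016x\n' "$1"
}

long_to_hex 8666   # prints 00000000000021da
```

The last two bytes are 0x21, which is the ASCII character "!", and 0xda, which is not valid UTF-8 on its own and is typically rendered as "?". If the first record's key happens to be 8666, that would be consistent with the "!?" at the start of the unreadable output above.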

Close the consumer with a Ctrl+C command, but keep the container shell open.

4
Specify key and value deserializers

Now let’s update your console consumer command to specify the deserializers for keys and values.

In the same window as your previous console consumer, run this updated command in the container shell:

kafka-console-consumer --topic example --bootstrap-server broker:9092 \
 --from-beginning \
 --property print.key=true \
 --property key.separator=" : " \
 --key-deserializer "org.apache.kafka.common.serialization.LongDeserializer" \
 --value-deserializer "org.apache.kafka.common.serialization.DoubleDeserializer"

After the consumer starts you should see readable numbers similar to this:

8666 : 11.91495819146 : -12.03479934659 : 83.75128310944 : 76.023163302796 : 44.264754374486 : 1.030215169428755 : 79.2962064 : -80.8329114 : -2.22594187 : 30.838015
Processed a total of 10 messages

Now you know how to configure a console consumer to handle primitive types: Double, Long, Float, Integer, and Short.

The console consumer treats keys and values as strings by default, so you don’t have to specify a deserializer for those.
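The same pattern applies to the other primitive types: each has a matching deserializer class in the org.apache.kafka.common.serialization package. As a small sketch, the hypothetical helper below (deserializer_for is not part of Kafka) maps a type name to the class name you would pass to --key-deserializer or --value-deserializer.

```shell
# Map a primitive type name to the fully qualified name of Kafka's built-in deserializer.
# deserializer_for is a hypothetical helper for illustration only.
deserializer_for() {
  pkg="org.apache.kafka.common.serialization"
  case "$1" in
    double)  echo "$pkg.DoubleDeserializer" ;;
    long)    echo "$pkg.LongDeserializer" ;;
    float)   echo "$pkg.FloatDeserializer" ;;
    integer) echo "$pkg.IntegerDeserializer" ;;
    short)   echo "$pkg.ShortDeserializer" ;;
    string)  echo "$pkg.StringDeserializer" ;;
    *)       echo "unknown type: $1" >&2; return 1 ;;
  esac
}

deserializer_for integer   # prints org.apache.kafka.common.serialization.IntegerDeserializer
```

For example, a topic with Integer keys and Float values could be read by passing --key-deserializer "$(deserializer_for integer)" and --value-deserializer "$(deserializer_for float)" to the console consumer command used earlier.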

5
Clean up

You’re all done now!

Go back to your open windows and stop any console consumers with Ctrl+C, then close the container shells with Ctrl+D. Then you can shut down the Docker containers by running:

docker-compose down
