Kafka Streams Testing

A Deep Dive

Ivan Ponomarev, John Roesler

Who Are We

Ivan Ponomarev:

Software Engineer at KURS, tutor at MIPT
Apache Kafka Contributor

John Roesler:

Software Engineer at Confluent
Apache Kafka Committer and PMC member

Kafka Streams Testing: A Deep Dive

Purpose: cover testing methodologies for Kafka Streams
- "Unit" Testing: TopologyTestDriver
- Integration Testing: KafkaStreams
Start with a motivating example (from Ivan’s production system)
A flawed testing approach: unit testing doesn’t work for this example
Deep dive into the testing framework
Correctly testing the example with integration tests

The task

Save different source IDs in the database

The problem

Too many writes to the database

The solution

Let’s deduplicate using Kafka Streams!

`TopologyTestDriver`

`TopologyTestDriver` capabilities

What is being sent/received | TestInputTopic methods                           | TestOutputTopic methods
A single value              | pipeInput(V)                                     | V readValue()
A key/value pair            | pipeInput(K, V)                                  | KeyValue<K,V> readKeyValue()
A list of values            | pipeValueList(List<V>)                           | List<V> readValuesToList()
A list of key/value pairs   | pipeKeyValueList(List<KeyValue<K,V>>)            | List<KeyValue<K,V>> readKeyValuesToList(), Map<K,V> readKeyValuesToMap()
A list of Records           | pipeRecordList(List<? extends TestRecord<K, V>>) | List<TestRecord<K, V>> readRecordsToList()

Demo

Spring Boot app
Let’s do some test-driven development and first write a test
Writing a test with TopologyTestDriver

A "Simple Solution"

Demo

Writing the topology
TopologyTestDriver test is green

Tests are green

Should we run this in production?

What we saw in production:

Why it’s not working

Kafka Streams is a big data streaming framework:
- designed for high throughput
- throughput demands batching, buffering, caching, etc.
- caching is the culprit in this example

TopologyTestDriver is a fast, deterministic testing framework:
- designed for synchronous, immediate results
- flushes the cache after every update

Why it’s not working

Caching in Kafka Streams

don’t immediately emit every aggregation result
"soak up" repeated updates to the same key’s aggregation
configure cache size: cache.max.bytes.buffering (default 10 MB)
configure cache flush interval: commit.interval.ms (default 30 s)
emit latest result on flush or eviction
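
The effect of the points above can be sketched with a toy model. This is not Kafka Streams code: `CacheModel`, `countWithCache`, and the `flushEvery` parameter are hypothetical names introduced for illustration, and the real cache flushes on commit or on eviction by size rather than after a fixed number of records. The sketch only shows how the same aggregation emits different output depending on when the cache is flushed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of an aggregation with a record cache in front of its output. */
class CacheModel {

    /**
     * Counts occurrences per key and returns the emitted "key=count" updates.
     * The cache is flushed after every flushEvery inputs and once at the end.
     */
    static List<String> countWithCache(List<String> keys, int flushEvery) {
        Map<String, Integer> counts = new LinkedHashMap<>(); // stands in for the state store
        Map<String, Integer> dirty = new LinkedHashMap<>();  // cached results, not yet emitted
        List<String> emitted = new ArrayList<>();
        int seen = 0;
        for (String key : keys) {
            int updated = counts.merge(key, 1, Integer::sum);
            dirty.put(key, updated); // replaces an older cached result for the same key
            if (++seen % flushEvery == 0) {
                flush(dirty, emitted);
            }
        }
        flush(dirty, emitted);
        return emitted;
    }

    /** Emits the latest cached result per key, then empties the cache. */
    private static void flush(Map<String, Integer> dirty, List<String> emitted) {
        dirty.forEach((k, v) -> emitted.add(k + "=" + v));
        dirty.clear();
    }

    public static void main(String[] args) {
        List<String> input = List.of("A", "A", "B", "A");
        // Flush after every update, as TopologyTestDriver does.
        System.out.println(countWithCache(input, 1));                 // [A=1, A=2, B=1, A=3]
        // Flush only once at the end, roughly what a large cache does.
        System.out.println(countWithCache(input, Integer.MAX_VALUE)); // [A=3, B=1]
    }
}
```

Any downstream logic that relies on seeing the intermediate results (`A=1`, `A=2`) passes under the first mode and can break under the second.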

Demo

TopologyTestDriver vs. Kafka Streams execution loop

Kafka Streams execution loop

TopologyTestDriver execution loop

What else?

What are other problems that can’t be surfaced with TopologyTestDriver?

`TopologyTestDriver`: single partition

Kafka Streams: co-partitioning problems
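
A rough illustration of why a single-partition test cannot surface co-partitioning problems. The sketch assumes a simplified partitioner (`String.hashCode()` modulo the partition count); Kafka's default partitioner hashes the serialized key with murmur2 instead, and `CoPartitioningModel` and its methods are hypothetical names.

```java
/** Toy partitioner used to illustrate co-partitioning; not Kafka's real one. */
class CoPartitioningModel {

    /** Maps a key to a partition number in a topic with numPartitions partitions. */
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** True if both topics place this key on the same partition number. */
    static boolean coLocated(String key, int leftPartitions, int rightPartitions) {
        return partitionFor(key, leftPartitions) == partitionFor(key, rightPartitions);
    }

    public static void main(String[] args) {
        // Topics with 4 and 3 partitions: key "A" lands on partitions 1 and 2.
        System.out.println(coLocated("A", 4, 3)); // false
        // With a single partition everything is trivially co-located.
        System.out.println(coLocated("A", 1, 1)); // true
    }
}
```

A join processes partition N of one topic together with partition N of the other, so a key that lands on different partition numbers never meets its counterpart. With one partition on each side that mismatch cannot occur.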

`TopologyTestDriver`: "Fused" subtopologies

TopologyTestDriver

Kafka Streams

Timing

stream-stream joins can behave differently (pipeInput order vs. timestamp order)
logic that depends on stream time (such as suppress) can behave differently
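
The notion of stream time can be sketched as follows. This is a simplified model, not the library's implementation: `StreamTimeModel` and its methods are hypothetical, and the window-close rule is reduced to "stream time has reached window end plus grace period".

```java
/** Simplified model of stream time: the highest record timestamp seen so far. */
class StreamTimeModel {
    private long streamTime = Long.MIN_VALUE;

    /** Records a timestamp; stream time only moves forward, never back. */
    long observe(long recordTimestamp) {
        streamTime = Math.max(streamTime, recordTimestamp);
        return streamTime;
    }

    /** A window counts as closed once stream time reaches its end plus the grace period. */
    boolean isWindowClosed(long windowEnd, long gracePeriod) {
        return streamTime >= windowEnd + gracePeriod;
    }
}
```

In this model, a late record with timestamp 40 arriving after one with timestamp 100 leaves stream time at 100, and a window ending at 100 with a grace period of 10 stays open until some record with a timestamp of at least 110 arrives. Wall-clock time passing in the test changes nothing; only new input advances stream time.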

Should we trust StackOverflow?

Using Transformer

Let’s run tests on real Kafka!

EmbeddedKafka
TestContainers

EmbeddedKafka vs TestContainers

EmbeddedKafka

Pro:
- Just pull in a dependency
Contra:
- Pulls in Scala
- Runs in the same JVM

TestContainers

Pro:
- Runs Kafka isolated in Docker
- Not only for Kafka testing
Contra:
- Needs Docker
- Requires some time for the first start

Demo

Writing TestContainers test
- An easy part: pushing messages to Kafka
- A not so easy part: how do we check the output?

Demo

Deduplication: the correct implementation
Now the test is green, but takes 5 seconds!

Does it have to be so slow?

List<String> actual = new ArrayList<>();

// Keep polling until a poll comes back empty. The final poll always
// blocks for the full timeout, so the test takes at least 5 seconds.
while (true) {
  ConsumerRecords<String, String> records =
    KafkaTestUtils.getRecords(consumer, 5000 /* timeout in ms */);
  if (records.isEmpty()) break;
  for (ConsumerRecord<String, String> rec : records) {
    actual.add(rec.value());
  }
}

assertEquals(List.of("A", "B"), actual);

Awaitility

Awaitility.await().atMost(10, SECONDS).until(
    () -> List.of("A", "B").equals(actual));

Things we must keep in mind

Cooperative termination
Thread-safe data structure
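
Both points can be sketched with the JDK alone. In this sketch a `BlockingQueue` stands in for the Kafka consumer and a hand-rolled `await` helper stands in for Awaitility; `BackgroundCollector` and all of its members are hypothetical names, not part of any library.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Collects values on a background thread so the test thread can poll for a condition. */
class BackgroundCollector implements AutoCloseable {
    // Thread-safe: written by the worker, read by the test thread.
    private final List<String> actual = new CopyOnWriteArrayList<>();
    // Cooperative termination: the worker checks this flag on every iteration.
    private volatile boolean running = true;
    private final Thread worker;

    BackgroundCollector(BlockingQueue<String> source) {
        worker = new Thread(() -> {
            try {
                while (running) {
                    // Short poll so the loop re-checks the flag regularly.
                    String value = source.poll(50, TimeUnit.MILLISECONDS);
                    if (value != null) {
                        actual.add(value);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
    }

    List<String> actual() {
        return actual;
    }

    /** Re-checks the condition until it holds or the timeout expires. */
    static boolean await(BooleanSupplier condition, long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    /** Asks the worker to stop and waits for it to finish. */
    @Override
    public void close() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The test returns as soon as the expected list is observed instead of waiting for a full poll timeout, and `close()` lets the worker leave its loop on its own rather than being killed mid-poll.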

Demo

Green test runs faster

Will any extra messages appear?

We can wait for an extra 5 seconds (bad choice)
We can put a 'marker record' at the end of the input and wait for it to appear in the output (not always possible)
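
The marker-record idea might look roughly like this. The sketch assumes the output arrives in order on a single queue (a `BlockingQueue` stands in for the output topic consumer); `MarkerRecord`, `MARKER`, and `readUntilMarker` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

class MarkerRecord {
    static final String MARKER = "__END__"; // hypothetical sentinel value

    /** Collects values until MARKER arrives; fails if it does not show up in time. */
    static List<String> readUntilMarker(BlockingQueue<String> output, long timeoutMs) {
        List<String> values = new ArrayList<>();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        try {
            while (true) {
                long remaining = Math.max(deadline - System.nanoTime(), 0);
                String value = output.poll(remaining, TimeUnit.NANOSECONDS);
                if (value == null) {
                    throw new IllegalStateException("marker not seen within timeout");
                }
                if (MARKER.equals(value)) {
                    return values; // everything before the marker is the complete output
                }
                values.add(value);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for marker", e);
        }
    }
}
```

This only works when the marker is guaranteed to come out after all other records, which is why it is not always possible: the topology may filter or transform the marker, and ordering is not guaranteed across partitions.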

Summary

Both TopologyTestDriver and integration tests are needed
Write unit tests with TopologyTestDriver. When it fails to surface the problem, use integration tests.
Know the limitations of TopologyTestDriver.
Understand the difficulties and limitations of asynchronous testing.

KIP-655 (a built-in deduplication operation for Kafka Streams) is under discussion

Useful links

Confluent blog: Testing Kafka Streams – A Deep Dive
pro.kafka: Russian Kafka chat in Telegram: https://t.me/proKafka
Confluent community Slack: https://cnfl.io/slack

Thank you!

Ivan Ponomarev

John Roesler
