Streams, flows and storms – how not to drown in your data?

The landscape of stream processing frameworks keeps getting more complicated. Some time ago Apache Storm was the only game in town. Then Spark Streaming emerged, followed by Apache Flink, a European answer to Storm and Spark. What's more, Apache Beam, backed by Google and aiming at API unification, has recently entered the incubation stage.

There is also Kafka Streams, which appeared just a few days ago.

Full abstract

In this talk I'd like to clear things up a bit to help listeners (and myself) decide which of the above fit particular needs and which are stable enough to be used without fear. We'll focus not on particular frameworks, but on the problems that arise in streaming applications and how to deal with them. These include raw throughput, latency, window semantics, handling out-of-order events, storing state, processing guarantees and so on.
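To make "window semantics" and "out-of-order events" a bit more concrete, here is a minimal, framework-free sketch in Python. It is not the Flink or Kafka Streams API. The function name `tumbling_window_counts` and its parameters are made up for illustration, and the watermark rule (largest event time seen minus a fixed bound) is a simplifying assumption. Real frameworks offer richer options, such as allowed lateness or side outputs for late data.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_size, max_out_of_orderness):
    """Count events per event-time tumbling window (illustrative only).

    events: iterable of (event_time, payload) pairs in ARRIVAL order,
            which may differ from event-time order.
    Returns (emitted, dropped), where emitted is a list of
    (window_start, count) in the order windows were closed and
    dropped holds events that arrived after their window had closed.
    """
    open_windows = defaultdict(int)   # window_start -> running count
    emitted, dropped = [], []
    watermark = float("-inf")         # "no events older than this are expected"

    for event_time, payload in events:
        window_start = (event_time // window_size) * window_size
        if window_start + window_size <= watermark:
            # The window was already closed: the event is too late.
            dropped.append((event_time, payload))
        else:
            open_windows[window_start] += 1

        # Assumed watermark rule: trail the largest event time by a fixed bound.
        watermark = max(watermark, event_time - max_out_of_orderness)

        # Close every window that ends at or before the watermark.
        for start in sorted(open_windows):
            if start + window_size <= watermark:
                emitted.append((start, open_windows.pop(start)))

    # End of input: flush whatever is still open.
    for start in sorted(open_windows):
        emitted.append((start, open_windows.pop(start)))
    return emitted, dropped


# Hypothetical input: event times 3 and 2 arrive out of order but in time,
# while 8 arrives after its window [0, 10) has been closed.
sample = [(1, "a"), (4, "b"), (3, "c"), (12, "d"), (2, "e"), (25, "f"), (8, "g")]
emitted, dropped = tumbling_window_counts(sample, window_size=10, max_out_of_orderness=5)
print(emitted)  # [(0, 4), (10, 1), (20, 1)]
print(dropped)  # [(8, 'g')]
```

The trade-off shows up in `max_out_of_orderness`. A larger bound tolerates more disorder but delays results, which is exactly the latency versus completeness tension mentioned above.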

As for solutions, the talk focuses on Apache Flink and Kafka Streams. Each will be discussed briefly, together with a short demo. The talk ends with a few general tips on how to choose a technology for different types of projects.