Getting Started with Apache Flink: Essential Patterns and Best Practices

Breakout Session

This session provides a comprehensive introduction to Apache Flink for developers and architects who seek to build streaming solutions that are resilient, efficient, and maintainable. I will move through three critical layers of Flink development:

1. Establish a solid foundation based on well-engineered data products

You will learn best practices for:

Managing formats and schemas for the long term

Ensuring data integrity and implementing error handling

Working with streams of immutable records vs. streams with updates

Handling the nuances of watermarking and late-data strategies
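
To make the watermarking and late-data item concrete, here is a minimal sketch in plain Python rather than Flink's own API. It assumes a simplified watermark (the largest event time seen so far minus a fixed out-of-orderness bound) and tumbling event-time windows. The names `tumbling_counts`, `window_size`, and `max_out_of_orderness` are hypothetical, chosen only for illustration. An event whose window has already closed is treated as late and set aside instead of being counted.

```python
from collections import defaultdict


def tumbling_counts(events, window_size, max_out_of_orderness):
    """Count events per tumbling event-time window.

    events: iterable of (event_time, payload) pairs in arrival order.
    Returns (results, late), where results maps window start -> count and
    late lists the events that arrived after their window had closed.
    """
    open_windows = defaultdict(int)  # window start -> count so far
    results = {}                     # window start -> final count
    late = []                        # events set aside as late data
    max_seen = float("-inf")         # largest event time observed so far

    for event_time, payload in events:
        window_start = event_time - (event_time % window_size)

        # Simplified watermark (an assumption of this sketch): we expect no
        # more events older than max_seen - max_out_of_orderness.
        watermark = max_seen - max_out_of_orderness
        if window_start + window_size <= watermark:
            # The window already fired, so this event is late.
            late.append((event_time, payload))
            continue

        open_windows[window_start] += 1
        max_seen = max(max_seen, event_time)
        watermark = max_seen - max_out_of_orderness

        # Close every window whose end is at or before the watermark.
        for start in sorted(open_windows):
            if start + window_size <= watermark:
                results[start] = open_windows.pop(start)

    # End of input: flush whatever is still open.
    for start in sorted(open_windows):
        results[start] = open_windows.pop(start)

    return results, late


if __name__ == "__main__":
    stream = [(1, "a"), (4, "b"), (12, "c"), (3, "d"), (17, "e"), (2, "f")]
    print(tumbling_counts(stream, window_size=10, max_out_of_orderness=5))
```

In this sketch a larger out-of-orderness bound admits more stragglers but delays every result, which is the trade-off a late-data strategy has to settle.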

2. Compose solutions from event streaming patterns

Rather than writing monolithic scripts, I will show you how to decompose complex problems using reusable components based on these design patterns:

Deduplication: removing duplicate events

Correlation: linking related events across streams (e.g., orders and their shipments)

Aggregation: computing real-time analytics

Enrichment: adding context to events from reference data

Pattern matching: detecting sequences or anomalies in event streams
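
As one illustration of treating a pattern as a reusable component, here is a minimal sketch of deduplication in plain Python, not Flink code. It assumes each event carries an identifier and an event time, and that an identifier can be forgotten once it is older than a retention period. The names `deduplicate` and `ttl` are hypothetical. In Flink itself, this role is typically played by keyed state with a time-to-live.

```python
def deduplicate(events, ttl):
    """Yield the first event seen for each id.

    events: iterable of (event_id, event_time, payload) in arrival order.
    ttl: how long (in event-time units) an id is remembered after its first
         sighting; once forgotten, the same id is emitted again.
    """
    first_seen = {}  # event id -> event time of the first sighting

    for event_id, event_time, payload in events:
        # Drop ids older than the retention period so state stays bounded.
        # (A linear scan keeps the sketch short; it is not efficient.)
        expired = [k for k, t in first_seen.items() if event_time - t > ttl]
        for k in expired:
            del first_seen[k]

        if event_id in first_seen:
            continue  # duplicate within the retention period: suppress it

        first_seen[event_id] = event_time
        yield (event_id, event_time, payload)


if __name__ == "__main__":
    stream = [("a", 1, "x"), ("b", 2, "y"), ("a", 3, "x"), ("a", 20, "x")]
    for event in deduplicate(stream, ttl=10):
        print(event)
```

Because the function only consumes and produces event streams, it can be placed in front of a correlation or aggregation step without either side knowing about the other.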

3. Insist on operational excellence