Breakout Session
Kafka is the central nervous system of organizational data, connecting heterogeneous data sources and sinks. KIP-98 introduced exactly-once semantics for data processing within Kafka. KIP-939 takes exactly-once semantics to the next level by extending them across Kafka and a database.
Say we want to use Kafka as a log of events or changes, but also want a relational database and/or a search index on the same data. How can we keep these in sync? We can try to write data to both Kafka and a database, but if a failure happens between the two writes, Kafka and the database will diverge: we either miss events in Kafka for changes that are committed in the database, or we have extra events in Kafka for changes that failed to commit to the database.
KIP-939 solves the problem by supporting two-phase commit in Kafka and providing an atomic dual-write recipe that can be used to atomically write data to Kafka and a database. With the dual-write recipe, we can make sure that events are committed to Kafka if and only if the corresponding changes are committed to the database, even in the presence of failures. For detailed info on KIP-939 please refer to https://cwiki.apache.org/confluence/display/KAFKA/KIP-939:+Support+Participation+in+2PC.
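The shape of such a dual-write recipe can be sketched roughly as follows. This is a minimal in-memory illustration, not the Kafka client API: `FakeKafka`, `FakeDatabase`, `dualWrite`, `recover`, and the string transaction token are all hypothetical names introduced here. The sketch assumes the database acts as the source of truth for the commit decision by storing a token that identifies the prepared Kafka transaction in the same database transaction as the data change.

```java
import java.util.ArrayList;
import java.util.List;

class DualWriteSketch {

    /** Hypothetical in-memory stand-in for a Kafka producer that can take part in 2PC. */
    static class FakeKafka {
        final List<String> committed = new ArrayList<>();      // events visible to consumers
        private final List<String> pending = new ArrayList<>(); // events in the open transaction
        private int nextTxnId = 1;
        String preparedToken = null; // non-null while a transaction is prepared but undecided

        void begin() { pending.clear(); preparedToken = null; }
        void send(String event) { pending.add(event); }

        /** Phase 1: the transaction becomes durable but undecided; returns a token naming it. */
        String prepare() {
            preparedToken = "txn-" + nextTxnId++;
            return preparedToken;
        }

        /** Phase 2 outcomes. */
        void commit() { committed.addAll(pending); pending.clear(); preparedToken = null; }
        void abort() { pending.clear(); preparedToken = null; }
    }

    /** Hypothetical stand-in for a database with atomic local transactions. */
    static class FakeDatabase {
        final List<String> rows = new ArrayList<>();
        String kafkaToken = null;       // token of the last Kafka transaction the database committed
        boolean failNextCommit = false; // test hook to simulate a failed commit

        /** Atomically stores the row and the Kafka token, or stores nothing. */
        void commit(String row, String token) {
            if (failNextCommit) {
                failNextCommit = false;
                throw new IllegalStateException("simulated database commit failure");
            }
            rows.add(row);
            kafkaToken = token;
        }
    }

    /** Writes one change to both systems; the database commit is the decision point. */
    static void dualWrite(FakeKafka kafka, FakeDatabase db, String change) {
        kafka.begin();
        kafka.send(change);
        String token = kafka.prepare();
        try {
            db.commit(change, token);
        } catch (IllegalStateException e) {
            kafka.abort(); // database rejected the change, so discard the Kafka events
            return;
        }
        kafka.commit();    // database accepted the change, so publish the Kafka events
    }

    /** After a crash, resolves a prepared Kafka transaction from the database's record. */
    static void recover(FakeKafka kafka, FakeDatabase db) {
        String prepared = kafka.preparedToken;
        if (prepared == null) {
            return; // nothing was left undecided
        }
        if (prepared.equals(db.kafkaToken)) {
            kafka.commit(); // the database committed this transaction before the crash
        } else {
            kafka.abort();  // the database never committed it
        }
    }

    public static void main(String[] args) {
        FakeKafka kafka = new FakeKafka();
        FakeDatabase db = new FakeDatabase();
        dualWrite(kafka, db, "order-created");
        db.failNextCommit = true;
        dualWrite(kafka, db, "order-cancelled");
        System.out.println("kafka=" + kafka.committed + " db=" + db.rows);
    }
}
```

In this sketch a crash between the database commit and the Kafka commit is resolved by `recover`, which compares the token of the still-prepared Kafka transaction with the token recorded in the database and commits or aborts accordingly.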
In this talk, I'll cover the support for two-phase commit in Kafka and show how this functionality can be used to implement atomic dual writes to Kafka and a SQL or NoSQL database.