Talk

Keep Your Cache Always Fresh with Debezium!
Conference (INTERMEDIATE level)
Room 9

The saying goes that there are only two hard things in Computer Science: cache invalidation, and naming things. Well, turns out the first one is solved actually 😉

Join us for this session to learn how to keep read views of your data in distributed caches close to your users, always kept in sync with your primary data stores using change data capture. You will learn how to:

- Implement a low-latency data pipeline for cache updates based on Debezium, Apache Kafka, and Infinispan

- Create denormalized views of your data using Kafka Streams and make them accessible via plain key look-ups from a cache cluster close by

- Propagate updates between cache clusters using cross-site replication

We'll also touch on some advanced concepts, such as detecting and rejecting writes to the system of record which are derived from outdated cached state, and show in a demo how all the pieces come together, of course connected via Apache Kafka.
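
To give a rough idea of what the first item above could look like in practice, here is a minimal sketch of a consumer that reads Debezium change events from Kafka and applies them to an Infinispan remote cache. It is not taken from the talk's demo: the topic name, cache name, and connection settings are placeholder assumptions.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;

public class CacheUpdater {

    public static void main(String[] args) {
        // Connect to a nearby Infinispan cluster via the Hot Rod client
        ConfigurationBuilder cacheConfig = new ConfigurationBuilder();
        cacheConfig.addServer().host("127.0.0.1").port(11222);
        RemoteCacheManager cacheManager = new RemoteCacheManager(cacheConfig.build());
        RemoteCache<String, String> cache = cacheManager.getCache("orders"); // assumed cache name

        // Subscribe to the Debezium change event topic for the captured table
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "cache-updater");
        consumerProps.put("key.deserializer", StringDeserializer.class.getName());
        consumerProps.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(List.of("dbserver1.public.orders")); // assumed topic name

            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                    if (record.value() == null) {
                        // Tombstone after a delete: evict the cached entry
                        cache.remove(record.key());
                    } else {
                        // Insert or update: store the latest state of the row
                        cache.put(record.key(), record.value());
                    }
                }
            }
        }
    }
}
```

In a production pipeline the same logic would more likely run as a Kafka Connect sink and handle the full Debezium event envelope rather than raw strings; the sketch only shows the overall flow.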

Gunnar Morling
Red Hat

Gunnar Morling is a software engineer and open-source enthusiast at heart. He is working on Debezium, a distributed platform for change data capture. He is a Java Champion and has founded multiple open source projects such as JfrUnit, kcctl, and MapStruct. Gunnar is an avid blogger (morling.dev) and has spoken at a wide range of conferences such as QCon, JavaOne, and Devoxx. He lives in Hamburg, Germany.

Generated Summary
WARNING: This summary was generated using GPT based on the transcript; as a result, spelling mistakes and, more importantly, hallucinations may be present.

Creating an E-commerce Application with Caches
System of Record Database and Deployments in Different Geographies
The goal of this talk is to create denormalized and localized views of the data close to users in order to improve performance, since the majority of requests are reads. This is known as Command Query Responsibility Segregation (CQRS). The architecture uses caches to reduce latency and improve efficiency: a canonical version of the data is stored in a Postgres database and replicated to caches in different geographies.
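
To illustrate the denormalization step that the talk attributes to Kafka Streams, here is a minimal sketch that joins order change events with a table of customers into a single read-optimized topic. The topic names, and the assumption that orders have already been re-keyed by customer id upstream, are illustrative only.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class DenormalizationTopology {

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Order change events, assumed to be keyed by customer id (placeholder topic names)
        KStream<String, String> orders =
                builder.stream("orders-by-customer", Consumed.with(Serdes.String(), Serdes.String()));
        // Latest state of each customer, keyed by customer id
        KTable<String, String> customers =
                builder.table("dbserver1.public.customers", Consumed.with(Serdes.String(), Serdes.String()));

        // Join each order with the matching customer record and publish the denormalized view
        orders.join(customers, (order, customer) -> order + " | " + customer)
              .to("orders-with-customers", Produced.with(Serdes.String(), Serdes.String()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "denormalized-view-builder");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        // The resulting topic can then be sunk into a cache for plain key look-ups
        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
    }
}
```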
Infinispan
The talk then focused on Infinispan, a Red Hat-sponsored project. Infinispan is a highly flexible and versatile cache that can serve in-memory requests, be deployed in clusters, and be used from Java, .NET, Node.js, and more. It is fault-tolerant with replicated data and can be queried like a database. It can also be embedded in applications, allowing for scalability and stateless applications.
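
As a small illustration of the embedded mode mentioned above, the following sketch starts a clustered, replicated Infinispan cache inside a plain Java application. The cache name and configuration are arbitrary choices for the example, not details from the talk.

```java
import org.infinispan.Cache;
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;

public class EmbeddedCacheExample {

    public static void main(String[] args) {
        // Start a cache manager that clusters with other nodes of the application
        DefaultCacheManager cacheManager =
                new DefaultCacheManager(GlobalConfigurationBuilder.defaultClusteredBuilder().build());

        // Define a replicated cache so every node holds a copy of the read view
        ConfigurationBuilder config = new ConfigurationBuilder();
        config.clustering().cacheMode(CacheMode.REPL_SYNC);
        cacheManager.defineConfiguration("orders", config.build()); // assumed cache name

        Cache<String, String> cache = cacheManager.getCache("orders");
        cache.put("order-1", "{\"id\":\"order-1\",\"total\":42}");
        System.out.println(cache.get("order-1"));

        cacheManager.stop();
    }
}
```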
Change Data Capture (CDC)
The challenge of keeping the data in the caches in sync with the system of record was addressed. Dual writes can be used to update multiple resources, such as a database and a cache, but they are not always the best idea due to availability concerns. CDC was proposed as an alternative to distributed transactions between multiple resources. It works by tapping into the transaction log of a database and extracting changes such as inserts, updates, and deletes. Debezium is an open source project for implementing CDC and is typically used with Apache Kafka. Infinispan additionally supports near caching to keep data in memory close to the application, and cross-site replication to keep multiple Infinispan clusters in different locations in sync.
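
For a concrete feel of how a Debezium connector is configured, here is a sketch using Debezium's embedded engine in a plain Java process; in the setup described in the talk the same connector properties would typically be given to a Kafka Connect cluster instead. Hostnames, credentials, and table names are placeholders, and property names can differ between Debezium versions.

```java
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import io.debezium.engine.ChangeEvent;
import io.debezium.engine.DebeziumEngine;
import io.debezium.engine.format.Json;

public class CdcEngineExample {

    public static void main(String[] args) {
        // Connector properties describing the Postgres database to capture changes from
        Properties props = new Properties();
        props.setProperty("name", "orders-connector");
        props.setProperty("connector.class", "io.debezium.connector.postgresql.PostgresConnector");
        props.setProperty("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore");
        props.setProperty("offset.storage.file.filename", "/tmp/offsets.dat");
        props.setProperty("database.hostname", "localhost");   // placeholder connection details
        props.setProperty("database.port", "5432");
        props.setProperty("database.user", "postgres");
        props.setProperty("database.password", "postgres");
        props.setProperty("database.dbname", "inventory");
        props.setProperty("topic.prefix", "dbserver1");
        props.setProperty("table.include.list", "public.orders");

        // Every insert, update, or delete read from the transaction log arrives as a change event
        DebeziumEngine<ChangeEvent<String, String>> engine = DebeziumEngine.create(Json.class)
                .using(props)
                .notifying(event -> System.out.println(event.key() + " -> " + event.value()))
                .build();

        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.execute(engine);
    }
}
```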
Conclusion
This presentation discussed ways to reduce latency for read requests by using local caches and denormalization with Kafka Streams. The advantages of this approach include lower latencies, reduced load on the primary database, and increased availability of the system. However, this approach adds complexity and requires an understanding of eventual consistency. Resources for further exploration are provided.
You can also ask questions about the complete talk using Devoxx Insights