August 19, 2021 | Episode 172

Placing Apache Kafka at the Heart of a Data Revolution at Saxo Bank

  • Transcript
  • Notes

Monolithic applications present challenges for organizations like Saxo Bank, including difficulties when it comes to transitioning to cloud, data efficiency, and performing data management in a regulated environment. Graham Stirling, the head of data platforms at Saxo Bank and also a self-proclaimed recovering architect on the pathway to delivery, shares his experience over the last 2.5 years as Saxo Bank placed Apache Kafka® at the heart of their company—something they call a data revolution. 

Before adopting Kafka, Saxo Bank encountered scalability problems. They previously relied on a centralized data engineering team, using the database as an integration point and looking to their data warehouse as the center of the analytical universe. However, this needed to evolve. For a better data strategy, Graham turned his attention towards embracing a data mesh architecture: 

  1. Create a self-serve platform that enables domain teams to publish and consume data assets
  2. Federate ownership of domain data models and centralize oversights to allow a standard language to emerge while ensuring information efficiency 
  3. Believe in the principle of data as a product to improve business decisions and processes 

Data mesh was first defined by Zhamak Dehghani in 2019, as a type of data platform architecture paradigm and has now become an integral part of Saxo Bank’s approach to data in motion. 

Using a combination of Kafka GitOps, pipelines, and metadata, Graham intended to free domain teams from having to think about the mechanics, such as connector deployment, language binding, style guide adherence, and data handling of personally identifiable information (PII). 

To reduce operational complexity, Graham recognized the importance of using Confluent Schema Registry as a serving layer for metadata. Saxo Bank authored schemes with Avro IDL for composability and standardization and later made a switch over to Uber’s Buf for strongly typed metadata. A further layer of metadata allows Saxo Bank to define FpML-like coding schemes to specify information classification, reference external standards, and link semantically related concepts. 

By embarking on the data mesh operating model, Saxo Bank scales data processing in a way that was previously unimaginable, allowing them to generate value sustainably and to be more efficient with data usage. 

Tune in to this episode to learn more about the following:

  • Data mesh
  • Topic/schema as an API
  • Data as a product
  • Kafka as a fundamental building block of data strategy

Continue Listening

Episode 173August 26, 2021 | 29 min

Using Apache Kafka and ksqlDB for Data Replication at Bolt

What does a ride-hailing app that offers micromobility and food delivery services have to do with data in motion? In this episode, Ruslan Gibaiev (Data Architect, Bolt) shares about Bolt’s road to adopting Apache Kafka and ksqlDB for stream processing to replicate data from transactional databases to analytical warehouses.

Episode 174August 31, 2021 | 31 min

Multi-Cluster Apache Kafka with Cluster Linking ft. Nikhil Bhatia

Infrastructure needs to react in real time to support globally distributed events, such as cloud migration, IoT, edge data collection, and disaster recovery. To provide a seamless yet cloud-native, cross-cluster topic replication experience, Nikhil Bhatia (Principal Engineer I, Product Infrastructure, Confluent) and the team engineered a solution called Cluster Linking. Available on Confluent Cloud, Cluster Linking is an API that enables Apache Kafka to work across multi-datacenters, making it possible to design globally available distributed systems.

Episode 175September 9, 2021 | 34 min

What Is Data Mesh, and How Does it Work? ft. Zhamak Dehghani

The data mesh architectural paradigm shift is all about moving analytical data away from a monolithic data warehouse or data lake into a distributed architecture—allowing data to be shared for analytical purposes in real time, right at the point of origin. The idea of data mesh was introduced by Zhamak Dehghani (Director of Emerging Technologies, Thoughtworks) in 2019. Here, she provides an introduction to data mesh and the fundamental problems that it’s trying to solve.

Got questions?

If there's something you want to know about Apache Kafka, Confluent or event streaming, please send us an email with your question and we'll hope to answer it on the next episode of Ask Confluent.

Email Us

Never miss an episode!

Be the first to get updates and new content

We will only share developer content and updates, including notifications when new content is added. We will never send you sales emails. 🙂 By subscribing, you understand we will process your personal information in accordance with our Privacy Statement.