August 19, 2021 | Episode 172

Placing Apache Kafka at the Heart of a Data Revolution at Saxo Bank

Tim Berglund:

Everybody's talking about data mesh these days, but is anybody doing it yet? Well, the answer is yes. I talked to Graham Stirling of Saxo Bank about the work he's done there in the last few years to build out not just infrastructure, but really cultural change that puts data mesh into effect. Listen in on today's episode of Streaming Audio, a podcast about Kafka, Confluent, and the Cloud.

Tim Berglund:

Hello and welcome to another episode of Streaming Audio. I am your host, Tim Berglund and I am joined today by Graham Stirling. Graham is an architect at Saxo Bank, and Graham, you had a quote that I'm just looking in my notes here. Head of data platforms at Saxo Bank, a recovering architect now on the path to delivery. Welcome to the show.

Graham Stirling:

Thank you. Yes, yes. Thanks, Tim, just glad to be on the show.

Tim Berglund:

You bet, so we're going to talk about data mesh today, which is an increasingly hot topic, and you as a recovering architect, and Saxo as an organization, are some people who are doing this, but I always like to start by asking about you. Particularly, recovering architect on the way to delivery. I think we have some common sensibilities there, so what do you mean by that? Really, what do you, what do you do at Saxo?

Graham Stirling:

Well, I suppose I started out with a computer science degree a long time ago. I think largely motivated by the desire not to get a proper job. I continued in academia for some time, but in human-computer interaction of all things. And, then I eventually moved into financial services, initially leaning towards integration and then data management.

Graham Stirling:

So I've worked at a number of banks, both big and small. I joined Saxo Bank almost three years ago. Moving to Copenhagen was never part of the life plan, but it's been a great adventure thus far. I initially joined as an architect, but the lure of delivery was too great, I guess. So it's been quite refreshing, I suppose. Not only to create the vision but also to take a key role in actually the execution of it.

Tim Berglund:

Actually building stuff.

Graham Stirling:

Absolutely, there's nothing quite like building stuff, is there?

Tim Berglund:

There really isn't, and I think to some degree that helps you escape what I call the non-coding architect anti-pattern, which is sometimes a dangerous thing in large companies, right?

Graham Stirling:

Indeed.

Tim Berglund:

You can get yourself detached from how to build things, and that's in an applied discipline, that's potentially dangerous. So-

Graham Stirling:

Indeed, things are very easy at PowerPoint level, really, aren't they? I guess, yeah.

Tim Berglund:

Yeah, I mean, that's a lot of what I do, so I know. At some point, you need to write some code and make sure that things work in the real world.

Graham Stirling:

Indeed.

Tim Berglund:

Now modernization, we talk about this catchphrase, application modernization like it's a new thing. I mean, everybody is always in the process of modernizing some things in their stack, in their organization. Even constrained, when I say everybody in an IT sense, in a community of software developers who write programs to make businesses run. There's always old stuff, and we're always trying to make the old stuff better. So there's always modernization happening, and it seems like there are in the last, I'd say five years, modernization has meant refactoring a monolith to microservices. And that's just been a sentiment. And that's in fact when I hear Confluent salespeople say application modernization, that's just what they mean. And that's cool, not a thing in the world wrong with that, that needs to happen.

Tim Berglund:

But there's also... And I think this touches on what we're going to talk about today. There's analytics modernization. I think to some degree... And, I don't have a background as a DBA. I have a background as a developer, but to some degree these days, just to say data warehouse and to refer to that legacy thing, is kind of like saying monolith. It's big, it's brutal, it's slow in terms of latency. It's got this batch kind of way of running, and so there's also a program of modernizing that. And you've got a really neat story to tell there, so let me just kick you off. I mean, I want to frame it with that, but tell us your story. Let's talk about it.

Graham Stirling:

Yeah, so I suppose one of the things that I learned from my tenure with a few of the big banks was the... I mean, really what they needed more than anything else was a... I hate to use this term, here's my architect coming through, but a data fabric that ultimately everyone can plug into in some shape or form. And, this was particularly felt by those business processes, which sat right at the end of the data supply chain. So, liquidity risk is a very good example. Global banks have a regulatory requirement these days to determine whether they have enough liquid assets that can easily be converted into cash to fulfill their liabilities. So-

Tim Berglund:

Yes, yes some percentage of their assets have to be, quote-unquote, liquid.

Graham Stirling:

That's right, obviously, we don't want to be too heavily reliant on mortgage-backed securities, for example. That wouldn't be a good thing though, would it? But the counts-

Tim Berglund:

[crosstalk 00:05:38] might go wrong. [crosstalk 00:05:38]-

Graham Stirling:

No, it'd go wrong. Absolutely, but the actual calculation that those in that domain, for example, have to perform is actually pretty simple. It's a ratio at the end of the day. The real challenge for them is getting a hold of the data in a timely manner. Now of course, the cost of trying to retrofit this within an existing organization, some of which have the GDP of a small country, is prohibitive. The organizational challenge is, perhaps, more than anything else, getting everyone to agree on a way forward, which kind of makes this change really, really hard.

Graham Stirling:

And to be honest, that was one of the real appeals about joining Saxo, where the management team had the appetite to put data at the center of the transformation agenda. But, it was small enough that it felt as though there was a chance to do something revolutionary. So, that was one of the big appeals.

Graham Stirling:

Now you mentioned, of course, the industry trend away from monoliths (of course, we know that monoliths are not necessarily a bad thing) towards microservices, and this whole idea of data mesh really follows those same parallels. And the real premise, I guess, is that the centralized enterprise data lake or data warehouse programs have failed to deliver the desired return on investment.

Graham Stirling:

Sure, they've allowed us to scale data processing in a way that was previously unimaginable, but they haven't been accompanied by the cultural change, the operating model that allows us to generate value sustainably. And sustainable is probably the keyword there. I'm sure many of us have seen that happen over the years. And, this is really where this concept of a data mesh, which of course, Zhamak talked about during Kafka Summit, really comes into play.

Tim Berglund:

And, yeah I should say, as we record this, we're just almost a week after Kafka Summit Europe with Zhamak Dehghani who gave a keynote on data mesh. She's kind of a leading thinker. As far as I know, she's the one who's really created the term and is driving the conversation forward, so we'll definitely link to her keynote in the show notes. After you're done listening to me and Graham, you should absolutely watch her talk for a good primer on what data mesh means.

Tim Berglund:

And, it's an early enough concept that pithy little explanations like the one I'm about to give are not helpful, but I'm going to do it anyway. It's doing to data what microservices did to applications. And I mean, then you're like, "Well, I know what a data warehouse is, I still don't know what a data mesh is." It's okay, listen to Graham, we're going to get there but that really is the analogy.

Tim Berglund:

And Graham, you said another thing that I want to have the courage to repeat with respect to friends who work in this business, but you said that the data lake has failed. You had some qualifications in there, you didn't just say it that directly, but has failed to deliver the value that we wanted it to deliver.

Graham Stirling:

Yeah, I think that's fair, and if we think about how... And again, it's not the technology, perhaps, it's the way we've actually gone about delivering change. How we've tackled the build of an enterprise data warehouse in the past. Typically, it would be staffed by a relatively large team, which is responsible for ingesting, cleansing, transforming the data into a form that's useful for the business.

Graham Stirling:

And typically, the team would sit outside the actual data domains, I guess, to use one of those important tenets from data mesh. So, the team would typically be charged with going out to the far corners of the organization, speaking to the domain teams, trying to figure out what data was on offer and how to go about getting ahold of it. And obviously, to complicate matters, each system probably has grown up with its own unique way of serving up data: file extracts, services, message queues, data models. Design decisions that would typically have been a result of the expertise in the team or a consequence of the products it supported.

Tim Berglund:

Spreadsheets in SharePoint, any sort of horror that one might imagine. And, I mean look, that worked. It was better than it not existing, but there was a centralized team that was specialists in data warehousing and you said, ingesting, cleansing, and transforming. And, it's really the ingesting and the cleansing that were the bulk of the investment. The transforming is a crank that you turn; by comparison, it's less labor-intensive, because ingesting and cleansing every darn thing is so bespoke, and the whole process is just kind of soul-crushing.

Graham Stirling:

Yeah, and the transformation is always going to be use case-specific typically, isn't it? Whether you need something very flattened, or whether you want a star schema. That piece of the puzzle, as you say, always sits with the consumer, typically.

Tim Berglund:

Yeah and I think what I heard you say, as I'm processing it, maybe the problem with data lake approaches was really saying to the organization, "Hey, look, you're doing business in some way. There's this emergent culture of this organization, or really federated group of organizations when it comes to a bank like Saxo. There are acquisitions and mergers, and there's a bunch of parts. Just keep doing that, and dump things in this HDFS cluster or S3 buckets, or whatever it is, cloud Blobstore and some data scientists will do things later." That's probably a ruthlessly uncharitable account of the data lake, but it's not entirely wrong. And that lack of cultural change, I think I heard you say, was part of the problem and why that approach we're saying now didn't deliver.

Graham Stirling:

Indeed, yeah and this whole idea of data mesh kind of, I suppose, turns that paradigm or that approach on its head by going back to that fundamental concept of domain-driven design. Which is not new, I suppose, but requiring each one of those data domains to serve their data sets in an easily consumable way. And, then that leads on to-

Tim Berglund:

[crosstalk 00:12:16]-

Graham Stirling:

Yeah, so I suppose that then leads on to the next tenet, I guess, of data mesh which is about product thinking or the idea of convergence of data and product thinking. We've got this idea of data itself as a product. It's not like a necessary evil, essentially, that we have to move around just to fulfill a business requirement. It is something that we want to nurture and grow in its own right. And, we ultimately believe that the product's usability comes down to the ease with which it can be discovered, understood, and of course, consumed. What's the time to market for your consumer spinning up a new use case, for example.

Tim Berglund:

The very opposite of emitting data exhaust into a Blobstore.

Graham Stirling:

Indeed. Yes, yeah.

Tim Berglund:

So, how do you get there? I guess, do you consider yourself to have gotten there to some degree, and what's that process like? What's it been like over the last three years? How did you start?

Graham Stirling:

Yeah, I mean, I suppose Zhamak's first paper, I think, came out about two years ago or thereabouts. And we were already starting to think about, well, how do we create a modern data architecture? One that we're going to be happy with in 10 years' time, that's going to meet the aspirations of the business in terms of the ability to easily tap into data and do more quicker, faster. But also very much with a view as to, well, how do we also scale the business?

Graham Stirling:

I mean, you mentioned shrinking batch windows earlier on, Tim, and Saxo is going through a huge growth curve at the moment. So, we're planning for a five times increase in transactional volumes this year, 20 times within the course of the next two years. So, we really needed to think about not only how do we do more quicker, faster with data, but how do we also scale? How do we reach organizational scale across the board?

Graham Stirling:

So that was the original premise, and the more that we started to think about this using the battle scars, the learnings of the past, the more we'd really started to resonate with this idea of a data mesh, so to speak. So I mean, all pretty obvious, really, in many different ways. Just thinking about data in a product-centric way, but that's of course easier said than done realistically, isn't it?

Tim Berglund:

Yeah, for a number of reasons, one of which is that product thinking is now going to get driven to a smaller unit of the organization. So, it's one thing when in the days of let's stand up a centralized data warehouse team, well, then you just need one leader who has a vision for that. You need a Graham Stirling 30 years ago who... Just read this book by... I'm blanking on the name of the two big data warehouse gurus in the '90s. Anyway, you know?

Graham Stirling:

Kimball and Inmon.

Tim Berglund:

Yeah, right, right. So, I just read that book and I have a vision for this, and I want to go drive this, and you build a prototype, and you get some executive buy-ins, but you build a team. You build a team and if you're good at it, becomes a big team and it has a big engine that it runs.

Tim Berglund:

But, now you as that leader can sort of drive that in your little world, but that's inverted in data mesh. It's not one team that's doing all the building and needs all the leadership, it's every team that does every little thing in the business, now needs this sort of data as product paradigm shift and worldview that they need to drink down.

Graham Stirling:

Absolutely-

Tim Berglund:

So, that sounds hard.

Graham Stirling:

... you're right. It is, because obviously, with a central team it's a lot easier really, isn't it? To maintain standards, technical expertise. From that perspective, it's much easier. Although, as soon as you have to devolve the responsibility to your teams, taking on the ownership for these data products, and the responsibility of publishing those data assets to your data fabric, for want of a better word, you either need to onboard a lot of technical expertise to the domain teams, or you've really got to focus on how do you lower that barrier of entry to the platform, and I think the latter really is key to sustainable growth here.

Graham Stirling:

So, that your domain teams... Because their focus should really be about delivering business change, I guess, that's why they're in the bank, that's why they're paid. Rather than worrying about how to ensure your schema is compatible, how to generate code bindings, what's the approach for dealing with personally identifiable information, so on and so forth. There's a big long list of your non-functional capabilities. And in this context, a self-service platform has to move beyond pure-play infrastructure to one that's really focused on enabling those domain teams to publish and, of course, consume data assets the right way.

Tim Berglund:

Right, right so it's not just we have a Confluent Cloud account and you can put things in topics. But you're really talking about, I would say, API-level tools that you're delivering to teams on the very tactical level.

Graham Stirling:

Indeed, yes.

Tim Berglund:

And then this product thinking, which is much more strategic, much more higher-level kind of thinking. So-

Graham Stirling:

Yes, and in our case here, Confluent sits at the core of this foundational level just now. I mean, at this point in time, we run our own self-managed cluster on-premises and in the cloud. But, if we get to the point where we switch over to the fully managed service, that should be largely transparent to the actual domain teams, theoretically.

Tim Berglund:

Love it, love it. Yeah, you notice I just worked Confluent Cloud in there, even though you're not the user. Well, I'm obviously assuming you'd be a Confluent Cloud... Don't let this guy get away with that. And so in the work that you've done, it's not as though the need for technology leadership goes away. I think honestly it gets a little more demanding because in the build a central team, well, you just have to be able to build a team and that's hard, but a lot of people can do that.

Tim Berglund:

In what you've done, maybe there's a central team. There's some API layer stuff that somebody's building, you've talked about your passion for delivery, you're building things. There's a Kafka cluster or somewhere that somebody is running, all those things exist. But, now also you have to convince people of ideas and it's everybody in every team and every business unit. And, what's that been like?

Graham Stirling:

It's incredibly difficult actually, so to speak, I think it's really a challenge.

Tim Berglund:

Sounds like my job.

Graham Stirling:

Yeah, absolutely. And I think there is, definitely, an advocacy element to the role as well, but we're always having to look for where are we causing friction, I guess, within the organization. How do we remove that friction and make things easier for our development community? Because, we are very much pitching this as a bank-wide capability from the point of trade capture right through to regulatory reporting, so the ambitions are huge.

Graham Stirling:

And of course, we've set ourselves up as a self-service team, so the requirements are coming thick and fast from the actual domain teams. So, it's always about trying to stay one step ahead of the game. But, I think that has definitely been one of the biggest lessons to date: we're talking about the adoption of Kafka, event-first thinking, self-describing schemas, data ownership, so on and so forth. This actually is quite a big ask of our domain teams, and to a certain extent, it doesn't matter how much effort you put into that self-service platform, and we'll continue to do so.

Graham Stirling:

We've got lots of documentation, probably too much, but for the level of adoption that we're looking to achieve, we've ended up creating a team, separate from the platform team, who are solely concerned with enablement, so I guess the platform team which is... And, this was actually an idea which Zhamak had talked about in terms of how to get this thing off the ground. So for example, it's a small team; we run daily office hours sessions, open door: any question, any blocker, come along and we'll do our best to answer on the call or take it away. We're creating accelerators that will remove the mystique of developing and deploying stream processors.

Graham Stirling:

So, this allows the platform team to focus on the, let's say, the bigger ticket items without being distracted by what might seem to them small issues. But, each of these small issues can block delivery and can act as a drag on adoption. And for those strategic data domains, or the domains which have many applications across the bank, like instrument, for example, I guess, we offer teams the option of pairing up with a tech SME to get them started, perhaps, for a couple of sprints, I guess, to get them up and running and off the ground.

Tim Berglund:

That sounds ideal, so that central team, yes there's an infrastructure concern, yes there are... We'll just call them API concerns, everybody's got a wrapper in a big organization that they use to get people to put stuff into Kafka at a low level, but it really is about teaching people.

Graham Stirling:

It's about teaching people up, absolutely. So, ultimately we want to help our teams to get started, and publish their assets, and help move us forward, but we also want them eventually to become advocates in their own right, essentially. So the demands on that central enablement team go, and we can essentially just replace that with a community of practice type idea.

Tim Berglund:

Graham, before we started this conversation, I didn't see the relationship between data mesh and developer advocacy, and now I do. That's great, that makes a lot of sense.

Graham Stirling:

But it's looking out for all, as I say, those points of friction. One that I've talked about with many of the folks at Confluent in the past was that initially we'd selected Avro as a serialization format, because that was the thing that you did. But, we soon found that all language bindings are not necessarily created equal and that the C# implementation was giving us a bit of a headache. And C#, I guess, is the predominant language of choice in the bank. It's getting more varied now, I think. We're moving into Python and such-like, but our bread and butter has been C#.

Tim Berglund:

Yeah, and you're not alone as a [crosstalk 00:24:23]-

Graham Stirling:

No.

Tim Berglund:

... institution at all.

Graham Stirling:

So, when you introduced Protobuf support, we decided to bite the bullet and switch over. Again, not without its challenges, but it's all about thinking, how can we flex the platform, I guess? What can we do to make life easier for our development community?

Tim Berglund:

Absolutely, what's next in this program? I've started to call it a project, but it's bigger than a project in this season of your career and of the evolution of Saxo's data architecture. What are you looking to do to move it forward?

Graham Stirling:

Well, I think there's a lot that we can do in terms of... So, we rely heavily on this concept of operations by pull request. If you want to create a new topic, you set up a topic definition, you get the PR approved, the topic gets deployed. Same for connectors, and schemas, and even data quality rules, for example. Now, certainly, one area where we know we can improve that developer experience is by giving users a much more transparent, or a clear view as to why, for example, their schema might be failing with compatibility. A change might be causing a break in compatibility rules.

Graham Stirling:

So there are some obvious examples, some fairly low-hanging fruit, I guess, that we can address that will, again, remove those points of friction. But we really want to get to the point whereby, as I said, it's all down to the critical mass of data adoption within the bank. And, we hopefully will soon turn our attention to how do we start to exploit this data in a more meaningful way. For example, using ksqlDB. It looks very interesting, I guess it's been on the roadmap for some time. As is something like Druid, I think, for your real-time aggregation. There's a number of use cases where that would come into play. So there's lots for us to do, realistically, I don't think we're ever going to get bored anytime soon.

Tim Berglund:

My guest today has been Graham Stirling. Graham, thanks for being a part of Streaming Audio.

Graham Stirling:

Not at all. Thank you, Tim.

Tim Berglund:

And there you have it. Thanks for listening to this episode. Now, some important details before you go. Streaming Audio is brought to you by Confluent Developer. That's developer.confluent.io, a website dedicated to helping you learn Kafka, Confluent, and everything in the broader event streaming ecosystem. We've got free video courses, a library of event driven architecture design patterns, executable tutorials covering ksqlDB, Kafka Streams, and core Kafka APIs. There's even an index of episodes of this podcast. So if you take a course on Confluent Developer, you'll have the chance to use Confluent Cloud. When you sign up, use the code PODCAST100 to get an extra $100 of free Confluent Cloud usage.

Tim Berglund:

Anyway, as always, I hope this podcast was helpful to you. If you want to discuss it or ask a question, you can always reach out to me @tlberglund on Twitter. That's T-L-B-E-R-G-L-U-N-D. Or you can leave a comment on the YouTube video if you're watching and not just listening, or reach out in our community Slack or Forum. Both are linked in the show notes. And while you're at it, please subscribe to our YouTube channel and to this podcast, wherever fine podcasts are sold. And if you subscribe through Apple Podcast, be sure to leave us a review there. That helps other people discover us, which we think is a good thing. So thanks for your support, and we'll see you next time.

Monolithic applications present challenges for organizations like Saxo Bank, including difficulties when it comes to transitioning to cloud, data efficiency, and performing data management in a regulated environment. Graham Stirling, the head of data platforms at Saxo Bank and also a self-proclaimed recovering architect on the pathway to delivery, shares his experience over the last 2.5 years as Saxo Bank placed Apache Kafka® at the heart of their company—something they call a data revolution. 

Before adopting Kafka, Saxo Bank encountered scalability problems. They previously relied on a centralized data engineering team, using the database as an integration point and looking to their data warehouse as the center of the analytical universe. However, this needed to evolve. For a better data strategy, Graham turned his attention towards embracing a data mesh architecture: 

  1. Create a self-serve platform that enables domain teams to publish and consume data assets
  2. Federate ownership of domain data models and centralize oversight to allow a standard language to emerge while ensuring information efficiency
  3. Believe in the principle of data as a product to improve business decisions and processes 

Data mesh was first defined by Zhamak Dehghani in 2019, as a type of data platform architecture paradigm and has now become an integral part of Saxo Bank’s approach to data in motion. 

Using a combination of Kafka GitOps, pipelines, and metadata, Graham intended to free domain teams from having to think about the mechanics, such as connector deployment, language binding, style guide adherence, and data handling of personally identifiable information (PII). 
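The episode describes this as operations by pull request: a team proposes a topic definition, the pull request is reviewed, and the topic is deployed. A minimal sketch of the kind of check a pipeline might run on such a definition is shown here in Python. The `TopicDefinition` fields, the naming pattern, and the 30-day PII retention limit are all hypothetical and chosen for illustration; they are not Saxo Bank's actual conventions.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical style-guide rule: <domain>.<dataset>.v<major>, all lowercase.
TOPIC_NAME = re.compile(r"^[a-z][a-z0-9-]*\.[a-z][a-z0-9-]*\.v[0-9]+$")


@dataclass(frozen=True)
class TopicDefinition:
    """Illustrative shape of a topic definition submitted in a pull request."""
    name: str
    owner: str            # owning domain team
    partitions: int
    contains_pii: bool
    retention_days: int


def validate(topic: TopicDefinition, max_pii_retention_days: int = 30) -> List[str]:
    """Return every problem found, so a reviewer sees them all in one pass."""
    problems: List[str] = []
    if not TOPIC_NAME.match(topic.name):
        problems.append(f"name '{topic.name}' does not match <domain>.<dataset>.v<major>")
    if not topic.owner.strip():
        problems.append("an owning domain team is required")
    if topic.partitions < 1:
        problems.append("partitions must be at least 1")
    if topic.contains_pii and topic.retention_days > max_pii_retention_days:
        problems.append(
            f"PII topics may retain data for at most {max_pii_retention_days} days"
        )
    return problems


if __name__ == "__main__":
    proposed = TopicDefinition("Trades", "", 0, True, 365)
    for problem in validate(proposed):
        print("REJECTED:", problem)
```

Returning a list of readable problems, rather than stopping at the first failure, is one way to give a domain team all the feedback on a pull request at once.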

To reduce operational complexity, Graham recognized the importance of using Confluent Schema Registry as a serving layer for metadata. Saxo Bank authored schemas with Avro IDL for composability and standardization and later switched over to Protobuf, using Buf, for strongly typed metadata. A further layer of metadata allows Saxo Bank to define FpML-like coding schemes to specify information classification, reference external standards, and link semantically related concepts.
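In the conversation, Graham mentions wanting to give developers a clearer view of why a schema change fails compatibility. As a rough illustration only, the sketch below models a schema as a mapping from field name to a hypothetical `Field` record and lists the reasons a change would break Avro-style backward compatibility (consumers on the new schema reading data written with the old one). It is a simplification: it treats every type change as breaking, ignores Avro's type promotions, and does not call Schema Registry, which performs the real check.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Field:
    """Hypothetical, simplified field description (not a real Avro or Protobuf type)."""
    type_name: str
    has_default: bool = False


Schema = Dict[str, Field]


def explain_backward_breaks(old: Schema, new: Schema) -> List[str]:
    """List human-readable reasons why `new` cannot read data written with `old`."""
    reasons: List[str] = []
    for name, field in new.items():
        if name not in old:
            # A new reader needs a default to fill in for old records.
            if not field.has_default:
                reasons.append(
                    f"field '{name}' was added without a default, "
                    "so records written with the old schema cannot be read"
                )
        elif old[name].type_name != field.type_name:
            reasons.append(
                f"field '{name}' changed type from "
                f"{old[name].type_name} to {field.type_name}"
            )
    # Fields removed in `new` are simply ignored by the new reader.
    return reasons


if __name__ == "__main__":
    old = {"id": Field("string"), "price": Field("double")}
    new = {"id": Field("string"), "price": Field("string"), "venue": Field("string")}
    for reason in explain_backward_breaks(old, new):
        print("INCOMPATIBLE:", reason)
```

Surfacing messages like these in the pull request itself is one possible way to remove the point of friction described in the episode.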

By embarking on the data mesh operating model, Saxo Bank scales data processing in a way that was previously unimaginable, allowing them to generate value sustainably and to be more efficient with data usage. 

Tune in to this episode to learn more about the following:

  • Data mesh
  • Topic/schema as an API
  • Data as a product
  • Kafka as a fundamental building block of data strategy
