In this week's Streaming Audio, we are making a plan for something I hope you don't have to do too often, but you will want to plan for it if you do. Cluster migration, moving all your data, all your topics, all your schemas, your applications, your connectors, everything from one Apache Kafka cluster to another, ideally with a minimum of downtime, a minimum of disruption, and a minimum of pain for you. Joining me to go through it is one of our solutions architects here at Confluent, Michael Dunn. He's definitely an expert. He's helped a lot of different companies migrate their clusters between different environments, and he's going to share some of his hard-earned wisdom.
We talk about why you might need to migrate between clusters, what opportunities it can bring up along the way, what you need to do and plan and find out before you even get started, and of course, how to actually complete the job as seamlessly as possible. As ever, for more Kafka knowledge, remember to check out Developer.Confluent.io, but for now, there are clusters to migrate. So, I'm your host, Kris Jenkins. This is Streaming Audio. Let's get into it. Joining me today is Michael Dunn. Michael, how're you doing?
I'm doing well, Kris. How are you?
I'm very well, very cold though, cold over here in England.
Yeah, I'm in Santa Monica, California, so not quite as cold.
That sounds nicer. I have to come and visit.
It is nicer. Yeah, you do have to come and visit. Yeah, the sun is always out, which is nice. I know in England, that's not always the case.
Not always the case, no, and sometimes completely obscured by the rain. Anyway, you are not here to talk about the weather. You are here to talk about a very different storm that could be going on in an organization, right? We're talking about not schema migration, we're talking about whole Kafka cluster migration, right?
Yes, sir.
That seems like a big, scary topic, but the first question is, why would you? Why would you migrate an entire cluster?
Yeah, so I've worked with a number of customers on some Kafka cluster migrations. To tell you the truth, there's always a number of different motivations. Some, I think, are more common than others. I would say the most common is when you're actually migrating from open source Kafka or a different platform to a newer platform, such as moving to MSK or Confluent Platform or Cloudera Kafka. I think that's usually fairly common, because an in-place upgrade, changing the binaries from the old service to the new one, isn't nearly as seamless as just moving to a new Kafka cluster. Migrating is usually a lot easier.
Really? Because, I would've thought you could just roll in new nodes and roll out the old ones.
Well, yeah, you certainly can if you're continuing to self-manage it, but especially when you're moving to a cloud service like MSK or Confluent Cloud, you can't really just flip a switch and all of a sudden your open source Kafka is in a managed service. But, to your point, if you're moving from open source to Confluent Platform, for example, you do have the flexibility to do an in-place upgrade. However, another reason I've found that customers will typically do a migration to a new cluster is that a rolling upgrade of their existing cluster is not as seamless as you would think and is a lot more disruptive to their services than anticipated.
Typically, because of the way they've architected their existing cluster, they either have a low number of resources or their resources are already constrained, and they've found, having tried rolling upgrades, that losing a node for whatever reason is fairly disruptive to processing. So, rather than having to resize or add resources to their existing cluster, moving to a new cluster gives them the comfort that they're not going to introduce instability into their existing implementation, and they can plan the movement of their applications and things like that a little bit more carefully.
Oh, okay. Yeah, I can see that. I bet there are also a few ... I'm thinking really big companies like banks where migrating your cluster to another provider actually means another department you don't talk to enough.
Things like that as well, exactly, and things like upgrading the platform, too. I think it's in the same vein as moving to a new set of binaries, where even during an upgrade, say from Kafka 2.1 to 3.0 or something like that, if your cluster isn't in a state where it can afford to lose a broker, and a rolling restart introduces a lot of instability, it may just be easier to spin up a new cluster with the newer version and move the data over. I think that's a little less common, because most people architect their Kafka clusters so that they can afford a rolling restart without a lot of instability. But, it's not uncommon that we work with some customers that can't really afford to do a rolling restart or are in a place where it's a little bit easier for them to move to a new cluster.
Okay. You say easy or easier. It doesn't sound easy to me, so clearly you know something I don't, which is the whole reason you're on the podcast.
Yeah, so I guess easy for somebody that's done it, and the solution itself is fairly easy to put together. I think that the most challenging part of any migration is the strategy behind it and really understanding the nuances and some of the differences between the existing technologies and solutions that are available, to not only move your data, but also keep in mind the other components that are leveraging Kafka. So, if you're using a schema registry, whether it be Red Hat's Apicurio Schema Registry or Confluent Schema Registry, or using Kafka Connect, or using Streams, you always have to keep in mind the impact to those services when you're moving them to a new Kafka cluster as well. Because, typically when you're doing a migration, you're not just moving Kafka. You're maybe moving to a new Connect cluster, for example, or a new schema registry cluster.
In order to do that, you just really need to think through the strategy and have a really well-thought-out plan. But, as long as that's been done and there's been a lot of effort put into that, the migration itself is actually usually fairly straightforward. I can talk through the solutions that are available and some pros and cons to each, but I think the most important thing, and what I'm hopeful people get out of this podcast, is that as long as you really strategize and make sure that you've thought through all the edge cases and taken into consideration all the different components that are likely going to move, you're not going to have too bad of a time when actually doing the migration itself.
Okay, then maybe you should take me through how it's done, and I'll try and think up some things that go wrong along the way.
Yeah, no, please. Yeah, I always love being pecked at in that regard. Yeah, I would say that-
Here I am. Let's say I've got a cluster. I'm going to move it to a different provider, maybe a different major version, something like that. What are my first options?
So, I would say the open source version that's available ... Well, I guess I'll start from maybe the most complex solution, but one that doesn't require any pre-built applications or binaries. That's where you essentially build your own consumer and producer that consumes the data from one cluster and just produces it to the other one. When doing so, you can use schema-aware serializers and re-register the schemas in the new cluster, so you won't necessarily have to worry about the schema IDs aligning and things like that. We can get into the schema migration a little bit later, but that's one approach that some customers have taken. It obviously requires a heavier lift from a development perspective, but then you have full control over how you want to move your data, and you build out the consuming and producing of the data to the new cluster yourself.
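To make that roll-your-own option concrete, here's a minimal sketch of such a consume-and-produce bridge, using byte-array (de)serializers so payloads pass through untouched. The cluster addresses and topic name are illustrative, and this only shows the at-least-once happy path:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class MigrationBridge {
    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "source-cluster:9092"); // hypothetical address
        consumerProps.put("group.id", "migration-bridge");
        consumerProps.put("enable.auto.commit", "false");              // commit manually below
        consumerProps.put("auto.offset.reset", "earliest");            // copy from start of retention
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "destination-cluster:9092"); // hypothetical address
        producerProps.put("acks", "all"); // don't lose data in flight
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(producerProps)) {
            consumer.subscribe(List.of("orders")); // hypothetical topic
            while (true) {
                for (ConsumerRecord<byte[], byte[]> rec : consumer.poll(Duration.ofMillis(500))) {
                    // Keys are preserved, so per-key ordering carries over to the destination.
                    producer.send(new ProducerRecord<>(rec.topic(), rec.key(), rec.value()));
                }
                producer.flush();      // make sure sends have landed before committing
                consumer.commitSync(); // at-least-once: duplicates possible on restart
            }
        }
    }
}
```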
So, I'd say that's probably the least common approach that I've seen, but some customers do choose to build their own tooling, and it works. It absolutely does, but I think there are a lot of good tools out there that do the heavy lifting for you. The first one is the Apache Kafka open source MirrorMaker. Originally, in the earlier versions of Kafka, MirrorMaker was strictly just a consumer and a producer. It took care of a lot of the internal manipulation of the data as it moved over the wire to the new Kafka cluster. It had some built-in checks and balances, but the original MirrorMaker 1 wasn't the easiest to use and had some issues. As of the newer versions of Apache Kafka, there's now a MirrorMaker 2. MirrorMaker 2 leverages the Connect framework, and it essentially creates a source connector.
The source connector again consumes from your original cluster and then writes that data to your destination Kafka cluster. Not to get too into the weeds of how it works, but you can run it in a couple of ways. There's an executable where you run it in what we call dedicated mode, which spins up the internal connectors within the actual application JVM itself. Also, you can run it as a connector in a Connect cluster, similar to some of the other tools that are available. Basically, the cool thing about MirrorMaker is that not only does it move the data, it can also sync your consumer offsets, which is very important, because when you stop reading on one cluster and you want to start reading on the other cluster, you'd like to pick up from where you left off, so you're not either missing messages or reading a bunch of duplicates.
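As a rough illustration, a dedicated-mode MirrorMaker 2 properties file can be as small as the sketch below; the cluster aliases and addresses are made up, and you should check the MirrorMaker 2 documentation for your Kafka version before relying on any individual setting:

```properties
# Hypothetical cluster aliases and addresses
clusters = source, destination
source.bootstrap.servers = source-cluster:9092
destination.bootstrap.servers = destination-cluster:9092

# Replicate all topics from source to destination
source->destination.enabled = true
source->destination.topics = .*

# Emit checkpoints and sync committed consumer group offsets (Kafka 2.7+)
emit.checkpoints.enabled = true
sync.group.offsets.enabled = true
```

You'd then launch it in dedicated mode with the connect-mirror-maker.sh script that ships with Apache Kafka, passing this file as the argument.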
So, you want your consumer groups to think they're in exactly the same place that they used to be.
Exactly, and MirrorMaker 2 can facilitate that with its offset translation technology. It can also do some other cool things, like keeping topic configurations in sync. The one thing it doesn't do well, unless you have auto topic creation enabled, is actually create the topics for you. Typically, you're going to want to create the topics before doing a data migration, because auto topic creation isn't usually best practice, especially in a production cluster. Auto topic creation being when a producer produces a message and the topic doesn't exist, so the topic is created with the default configurations on your destination cluster to allow the production to occur. With MirrorMaker, yeah, go on.
That raises the question, if you're creating your own topics manually, is this an opportunity to change the topics' configurations in interesting ways, like a different number of partitions, different replication, that kind of thing?
Absolutely, and that's honestly something I like to think through with my customers before actually doing a migration: what other things can we take care of as part of this migration? Because, a lot of times when a customer does a migration from one cluster to another, they've found that they originally over-partitioned or under-partitioned their topics. It can be challenging to repartition topics on an existing cluster, or not even necessarily challenging, but just something you don't really want to take on, especially if things are working and your performance isn't great, but also isn't terrible. You're able to meet your business needs.
But, for example, if you've greatly over-partitioned your topics, you have the benefit of creating new topics with a smaller number of partitions to reduce the replication load and the CPU overhead that it takes to replicate all of those partitions in a Kafka cluster. It can allow you to redo your data model, if you will, on the new cluster as part of the migration, so kind of a two-birds-with-one-stone scenario. Yeah, that's actually fairly common.
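Since the topics are being created up front anyway, this resizing is just a normal AdminClient call against the destination; a small sketch, where the address, topic name, and counts are all illustrative:

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateResizedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "destination-cluster:9092"); // hypothetical address

        try (AdminClient admin = AdminClient.create(props)) {
            // Say the source topic had 60 partitions but only ever needed 12:
            NewTopic orders = new NewTopic("orders", 12, (short) 3)
                    .configs(Map.of("min.insync.replicas", "2",
                                    "retention.ms", "604800000")); // 7 days
            admin.createTopics(List.of(orders)).all().get();
        }
    }
}
```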
Subject to thinking about ordering, how that's going to change ordering guarantees presumably.
That's why you don't want to change the number of partitions on an existing topic, just increasing them, for example, because that will change the distribution of keys amongst your partitions. That can definitely cause ordering issues for consumers downstream, because order is only guaranteed per partition, per key. A consumer group, if you will, consuming across a number of different partitions may get some messages for a specific key out of order if you change the number of partitions. By changing the number of partitions on a new topic and replicating the data from the beginning to the end of time, you don't need to worry about ordering, because every message with the same key, from the earliest offset to the latest offset, has been written to the same partition on the destination cluster.
Oh, I see.
The distribution will be different between the two clusters, but you shouldn't run into any problems with your consumer, if you will, consuming messages out of order. Because, again, every message with the same key will go to the same partition and will be written in order.
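For reference, this is the hash Kafka's default partitioner applies to keyed records (setting aside the sticky partitioning used for unkeyed ones), using Kafka's own Utils class; the key and partition counts here are illustrative:

```java
import java.nio.charset.StandardCharsets;
import org.apache.kafka.common.utils.Utils;

public class KeyPlacement {
    public static void main(String[] args) {
        byte[] key = "customer-42".getBytes(StandardCharsets.UTF_8); // hypothetical key
        // Kafka's default partitioner for keyed records:
        int on60Partitions = Utils.toPositive(Utils.murmur2(key)) % 60;
        int on12Partitions = Utils.toPositive(Utils.murmur2(key)) % 12;
        // The two results generally differ, which is why growing an existing topic
        // breaks per-key ordering, while a fresh topic replicated from the earliest
        // offset keeps each key's messages together and in order.
        System.out.println(on60Partitions + " vs " + on12Partitions);
    }
}
```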
Of course, yeah, yeah. Okay, that makes sense. Okay, so actually going back a step, that seemed to me to be a headline feature of MirrorMaker, this translation of offsets. That's the big difference between this and just firing up a SELECT STAR FROM, INSERT INTO in KSQL, right?
Yeah, exactly. Yeah, the offset translation is a big win for a lot of the replication technologies that are out there, MirrorMaker being one of them. I think that's one of the reasons why using one of these pre-built solutions is very beneficial, because most of the time you can do a migration without translating offsets, it just introduces a little bit more downtime. If you think about it: you stop your producers, let your consumers get fully caught up, start your consumers from the end offset on the destination, then start your producers again. In reality, you haven't missed any messages. You haven't processed any duplicates. But, you have introduced more downtime and, I guess, more of a dance, if you will, of application migration, because you have to really strategize that.
If you have a number of interdependencies and complexity there, it can become a little bit more cumbersome. One benefit of offset translation is that you can migrate your consumers and let your producers just keep writing on the source. You can cut your consumers over while your producers keep writing on the source cluster, and the consumers will start from where they left off and just read replicated messages. It gives you that flexibility to migrate consumers first and then producers and split that up. It can be a little bit easier on application teams and on the people that are driving the migration.
Yeah, absolutely. Presumably, you could use that for dry runs, as well.
Yeah, certainly. I think it's very common, especially in a lower environment, to do some testing in some sort of dry run where you essentially have your live applications, if you will, still running on the source. If you just want to make sure that the same consumer with the same group ID starts from where you had stopped it on the destination, you can start it there. They should honestly run in parallel, with the consumers behaving as if they were the same application, but on separate clusters.
Okay. Yeah, that makes sense to me. So, that's MirrorMaker and MirrorMaker 2, which is the Connect-based version. There are two more options. Why are there two more options, and what are they?
Well, so there are two more options because on the Confluent side, which is the company that I work for, we-
I've heard of them.
Yes, I am sure you have. Confluent has developed Replicator, which is its own parallel to MirrorMaker 2. It's the Confluent productionized, battle-tested version of what MirrorMaker 2 is able to do. Again, it can be run as an executable, or it can be run as a connector in a Connect cluster. It doesn't work in exactly the same way, but functionally it does the same thing: it consumes from your source and writes to your destination with an internal Connect-based driver. So, Replicator is a Confluent enterprise piece of technology.
I would say the benefit it has over MirrorMaker is that it's supported by Confluent if you have a Confluent support license. It's one of those things where MirrorMaker is a great tool and does the job, but I would say the documentation is a little bit light. I think the KIP and the Git repository are really the only things out there, plus some articles that other people have put out. Replicator is very well documented, is used a lot, especially by people at Confluent, and I would say is truly production grade. MirrorMaker is definitely production grade too, but maybe not in the same way that Replicator is.
You've got a support contract, so you've got someone to shout at, if need be.
Exactly, exactly.
Which is important to some people, not so important to others.
Exactly, right. It's also nice when you can say that something is enterprise grade, and it's Replicator. Replicator is enterprise grade, and so that's-
Okay, okay. That's going to polarize the audience in different ways, but we'll move on.
I'm sure, I'm sure. So, the one nuance of Replicator, which MirrorMaker takes care of very well, is the offset translation. The way Replicator does its offset translation is a little bit different from the way MirrorMaker 2 does it. It's based on timestamps and uses a timestamp interceptor. I don't necessarily think this is a good time to get into the inner workings, but basically, the way Replicator has been set up, it only translates offsets for Java clients. So, if you're a shop that uses .NET or Python or librdkafka or any of the other client libraries out there that are fairly popular in the Kafka ecosystem, you can't leverage the Replicator consumer offset translation.
Oh, that seems like a big deal, because I use a few things that use librdkafka.
So, that is a big gap. It's one of those things where it's great for people that only or mostly use Java, but not great for customers that are fully .NET shops. In that case, like we talked about before, you have to carefully strategize the stopping and restarting of your producers and consumers to make sure that you haven't consumed a bunch of duplicates or missed any messages. Granted, there are some customers that don't necessarily have those restrictions. They're not super worried about missing messages or consuming duplicates, so maybe it's not something they need to worry about, but there are a lot of customers that do. So, it's one of the things that comes up a lot.
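For the Java clients that can use it, the translation hinges on an interceptor that ships with Replicator, not with the plain Apache Kafka clients. A sketch of the consumer-side configuration, with the class name as documented for Confluent Replicator and everything else illustrative:

```java
import java.util.Properties;

public class ReplicatorAwareConsumerConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "source-cluster:9092"); // hypothetical address
        props.put("group.id", "orders-processor");             // hypothetical group
        // Replicator's offset translation relies on this Java-only interceptor, which
        // records consumer timestamps so offsets can be translated on the destination.
        // The class ships with Replicator and must be on the consumer's classpath.
        props.put("interceptor.classes",
                "io.confluent.connect.replicator.offsets.ConsumerTimestampsInterceptor");
        return props;
    }
}
```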
Definitely something to be aware of going in. Then, the third option is Cluster Linking, I believe.
Yes, it is Cluster Linking. It's the new Confluent best-in-class data replication tool that we have. It's actually offered as a fully managed service in Confluent Cloud as well. So, if you're using Confluent Cloud, it's offered as something that the Confluent Cloud engineers will support for you once you start running it. Cluster Linking is different from the way MirrorMaker and Replicator work in that it doesn't use the Connect framework. It leverages the inter-broker replication protocol that Kafka uses for moving data between leaders and followers.
Essentially, if you want to think about it from a high level, what Cluster Linking is doing is setting up followers, in read-only topics on your destination, that replicate the data from the leaders of your source topics. Your offsets are actually kept in sync between the two clusters, whereas with MirrorMaker and Replicator, since they're writing the data anew, the offsets aren't necessarily going to be the same, because retention has likely rolled off older offsets in your source cluster and things like that. With Cluster Linking, offset 1000 for a message on your source is going to be offset 1000 for the same message on your destination.
Right, because it's pretending to be an extra replica of the original topic, right?
Pretty much, yeah.
Okay, that actually seems very logical to me. What's the downside?
So, one downside is that it's a Confluent Platform, Confluent-only product, same as with Replicator. The other nuances are that Cluster Linking only supports running against a source cluster of Apache Kafka version 2.4 and above. So, if you're on an older version of Kafka, you're not able to use Cluster Linking. I guess you have the option of upgrading, but at that point you're introducing even more complexity, so maybe you're better off using some of the other tools. The destination cluster needs to be Confluent Platform 7.0 or higher, which is when it became generally available, or Confluent Cloud.
Yeah, I think we're on 7.3 at the moment. Is that right?
Yeah, exactly. Some of the other things that come up, and this is a little bit more specific to Confluent Cloud, are networking considerations, where Cluster Linking doesn't always support moving data between two clusters for specific networking topologies. This is more for moving data between two Confluent Cloud clusters, things like private networking and stuff like that, where if you don't have a public endpoint and Confluent Cloud isn't able to reach it, obviously the Cluster Link isn't able to reach it either. So, there are some things we don't necessarily need to get into, but just to note: if you're looking to migrate from one Confluent Cloud cluster to another and you have private networking, Cluster Linking might not be a good fit, at least for now. That's slowly being shored up over time, but there are a few gaps there on the private networking side.
Speak to your friendly solutions architect like Michael.
Yeah, there you go. Exactly, exactly.
Okay, so yeah, okay, I see you've got a bunch of options, different places they would suit, right?
Yeah, and sort of like MirrorMaker, Cluster Linking also does a very good job of replicating consumer offsets. The other difference, like I think I said before, is that it creates what we call mirror topics in the destination, which are read-only, and that differs from Replicator and MirrorMaker. Those read-only topics have to actually be promoted to be read/write. The promotion is an extra step in the migration, but I think it also provides some added safeguards, because the promotion will not allow that topic to be writable until it has confirmed that all of the offsets on the source cluster have been replicated to the destination. So, it provides some added safeguards and can really make sure you're not missing any data when moving from one cluster to another.
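The link-mirror-promote workflow looks roughly like the following, sketched with approximate Confluent CLI commands; the exact subcommands and flag names vary by version and deployment, so treat these as an illustration of the steps rather than copy-paste commands:

```sh
# 1. Create the link from the destination, pointing back at the source
#    (cluster ID and address are hypothetical):
confluent kafka link create my-link --source-cluster <source-id> \
  --source-bootstrap-server source-cluster:9092

# 2. Create a read-only mirror topic over the link:
confluent kafka mirror create orders --link my-link

# 3. At cutover, promote it; promotion waits until the mirror has fully
#    caught up to the source before making the topic writable:
confluent kafka mirror promote orders --link my-link
```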
Okay, so it makes sure you've got a fully in sync replica. Then, you switch over. You can write to it, but now it's stopped replicating. So, you're not accidentally doing both at once. Very important. Yes, that makes perfect sense. Okay, so assuming we're there and we've chosen the tool out of that pool of three that suits our use case, how do I get started on this?
Yeah, I think that's a great question. First things first, you've got to spin up your other Kafka cluster. That's not necessarily trivial, especially if you're moving to a new platform or a new service. For example, if you're going from open source Kafka to MSK, or from MSK to Confluent Cloud, or wherever you're migrating, you might need to do a little bit of added learning to understand the nuances of the new technology that you're using. But, once you have that cluster available, it's actually fairly straightforward to deploy a lot of these things. Typically, with Replicator and MirrorMaker, it's all based on a configuration file. With MirrorMaker, you create a properties file that defines all of your different configurations for the movement of data. Replicator is the same: you're using either a properties file or just a connector configuration that configures how you'd like to move that data.
Cluster Linking, again, is another sort of configuration file. In the documentation there are a number of different configurations that you can set to change the way the data is moved and take advantage of some of the different features that are available with each of the solutions. But, once you've read through all of that and decided, all right, hey, I want to do this, with each of these solutions you put it in a configuration file or a properties file. After it's been deployed, I always recommend doing something in a lower environment first. Test it out. Don't just go to production. I think anybody listening to this knows that, but just an FYI. Don't just go to production.
Don't do it on a Friday evening, either.
Right, exactly. Don't do it on a weekend. Don't do it on a Friday evening. Honestly, don't do it. I don't like doing things on Mondays either. People's brains don't move quite as quickly on a Monday as they do on Tuesday or for the rest of the week.
It's terrible, but I've found that, too. Yeah, the middle three days are the safest.
They seem to be. Yeah, it's when everybody's coffee is really working the best. So, yeah, once the tool is running and you see the data flowing, I think some things that are very important are building in validation scripts and leveraging some of the open source Kafka clients that are available. A lot of these tools also expose some level of metrics, whether they be JMX metrics or metrics that are available in the Confluent Cloud Metrics API, so you can have a pretty good feel for how the data is flowing, whether there's any replication lag, things like that. I think that's very important. You don't have to go crazy with your monitoring solution, because again, with any migration, it's temporary. So, you don't want to put too much work into building dashboards, alerts, things like that. But, having some observability into the migration itself is very important, and leveraging the metrics that are available is key to having that insight.
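As one example of lightweight validation, here's a sketch that pulls per-partition end offsets for a topic from each cluster with the AdminClient (Kafka clients 3.1+); addresses and topic are hypothetical:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.ListOffsetsResult.ListOffsetsResultInfo;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.common.TopicPartition;

public class EndOffsetCheck {
    // Latest end offset per partition for one topic on one cluster.
    static Map<TopicPartition, Long> endOffsets(String bootstrap, String topic) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrap);
        try (AdminClient admin = AdminClient.create(props)) {
            int partitions = admin.describeTopics(List.of(topic))
                    .allTopicNames().get().get(topic).partitions().size();
            Map<TopicPartition, OffsetSpec> query = new HashMap<>();
            for (int p = 0; p < partitions; p++) {
                query.put(new TopicPartition(topic, p), OffsetSpec.latest());
            }
            Map<TopicPartition, Long> result = new HashMap<>();
            for (Map.Entry<TopicPartition, ListOffsetsResultInfo> e :
                    admin.listOffsets(query).all().get().entrySet()) {
                result.put(e.getKey(), e.getValue().offset());
            }
            return result;
        }
    }

    public static void main(String[] args) throws Exception {
        // With MirrorMaker/Replicator the absolute offsets differ between clusters,
        // so compare lag or record counts instead; with Cluster Linking the offsets
        // themselves should line up.
        System.out.println(endOffsets("source-cluster:9092", "orders"));
        System.out.println(endOffsets("destination-cluster:9092", "orders"));
    }
}
```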
It's probably a good time to kick the tires on the monitoring for your new platform too, right?
Exactly. Oh, yeah. Because, at that point, if you're spinning things up, you want to make sure that whatever monitoring solution you've deployed for your existing cluster has been either replicated in some way, or at least you have a plan for doing so and getting the metrics up front. If you're just testing things out, it's not super important. But, to your point, when you're about to go live with a production migration, you pretty much want to make sure that everything you've done in your original cluster has been taken care of in your new cluster, and then some.
I think, again, like we talked about before where you can do some topic repartitioning, you also have the opportunity to change the way your authentication works in your new cluster, and change the way authorization is done as well, whether you were using ACLs before or weren't and now want to use ACLs. Confluent has role-based access control, so we can use RBAC and things like that in the newer cluster. Thinking through all the changes that you want to make and then implementing those in your new cluster prior to migrating the data, and definitely prior to migrating your applications, is also very important, because you can really ...
Moving to a new cluster gives you the opportunity to really make a lot of new changes. It's not just moving the data. I think this is a really beneficial time to spruce up your existing implementation and make it, maybe not more production grade, but more in line with some of your enterprise security standards and things like that. It can always be hard to do that on an existing cluster, because you don't want to introduce instability to running processes. If you do it on a new cluster and can do some testing to make sure that applications are still able to do what they need to do, then when you move things over, you can have that comfort that things are still going to work as they did on the original cluster, but now with all these new benefits applied to your new cluster.
Yeah, yeah, and it's probably the only time in the lifetime of that cluster when you get to play and experiment and check things with impunity, right?
Exactly, yeah. So, yeah, I always harp on my customers. If you want to make changes, this is the best time to do it. We're going to test them out. We'll make sure that they're viable, but this is the time.
Would you recommend, if you've got a test cluster in your old platform, doing a full dry-run migration of the test system?
Absolutely. Absolutely. Yeah, typically what I try to do is find a set of canary applications, applications that are, I guess, lower impact, but that still cover the breadth of all the different application types and interdependencies currently running on the existing cluster. Once we have those identified, I actually like to do those first in the production cluster too, as almost a production sanity check. These are low business impact, but they still give pretty good coverage of everything else that we're going to migrate. We can do those first, even in prod, make sure everything looks good, and then do the rest.
But, yeah, to your point, it's always good to do a full-blown, full-fledged migration in really every environment, if you can. Dev is maybe not super important, because you may not really care about moving your development data over. But, in a test environment, or even your QAT environment, your quality testing environment, that data is probably still valuable. So, having at least one lower environment where you fully go through your migration process and really test out your strategy is really important. Because, again, it's not just about moving the data. It's also about strategizing how you're going to effectively move your applications in a way that meets your business needs, meets SLAs, downtime requirements, all those good things.
Yeah. Well, that's something I was going to ask you. Do you at all often get into the situation where people migrate some topics and some services, bed that in for weeks or months, and then gradually, piecemeal, move things over?
Oh, yeah. I've seen all ends of the spectrum. Some customers want to literally shut everything down, take an hour-long outage window, shut everything down, move everything over, validate, sign off, on to the next cluster. I've also worked with some customers that are doing literally an application a week. They have tens of applications, and it takes them six months to do the migration. But, that's what they're most comfortable doing, because their applications are heavily business critical. They just don't want to take any real downtime window. That was their choice.
I'm not one to say which one's better than the other. I'm just there to make sure that the data has been migrated effectively and everything is running as it should on the new cluster. So, yeah, it's really dependent on the business. I think the key thing, whichever way you do it, is understanding all the interdependencies of the applications. Because, some apps feed into other apps, which feed into other apps. You don't want to cut over one of the feeder apps if all your other applications are still running on the old cluster, because those guys aren't going to get their data.
From a strategy perspective, that's one of the things we'll spend a lot of time on, especially in a really large implementation where a ton of different application teams, that you maybe don't have control over, are deploying their applications. If there isn't good documentation with respect to which application depends on which data, and which consumer group depends on which topic and vice versa, it's really important to take the time to get that ironed out. Because, especially if you're not going to migrate everything at once, you need to understand what definitely needs to be migrated together, at least in individual units.
Yeah, yeah. There's a chicken and egg situation in there, isn't there? You might be migrating because you want a platform with better stream data lineage tools, but you don't have those in your old platform, so you've got to just manually construct the stream lineage, right?
Pretty much, yeah. I've spent a lot of time on that. You can use some of the CLI tools that are out there to see what consumer groups are attached to what, and what client IDs and producers are writing to what topics. But, that requires that some of those configurations were set in the properties files up front. If they weren't, it can be pretty hard to tell what's writing to what. Again, it goes back to your point. That's why it's really important to do this elsewhere first, because you might talk to a bunch of teams, and they might give you a bunch of information, and you have to take what they say as the truth, because it's hard to get that insight otherwise. And, they're the ones that wrote their apps. If you find in a lower environment that, hey, you cut over this set, but it looks like a few lingering ones were missed, and the data isn't flowing appropriately for a subset of applications, you definitely want to catch that before you go to production.
Yeah, yeah. It's very easy. There'll be that application that Dave wrote three years ago before he left, and no one's thought of it. Yeah.
Yeah, that's the other thing too, all the turnover of developers and stuff like that. It gets pretty challenging. If you're listening to this and you're planning a migration, start to get the documentation in place, because it'll be really helpful to have a nice map of all your interdependencies, especially if you have a fairly complex implementation.
Okay, so I've got my map, best effort map of what's going on. I've got a replica of all my topics in the new cluster. My naive strategy would be migrate all the readers, all the consumers first, then migrate all the writers, the producers once that's done. Is that roughly right? What am I missing?
Yeah, I think that's roughly right. Again, it kind of goes back to the offset translation bit and whether certain consumers can process duplicates, whether they can miss messages, things like that. Then, that comes back to the configurations on the consumer, because consumers have that auto.offset.reset configuration. If you're starting a consumer fresh with no offsets on the new cluster, that's what it's going to use. So, you can set it to earliest, and it'll reprocess or re-consume everything. For lower volume use cases, maybe that's okay. For higher volume use cases, you can see how, especially if there's a fair amount of topic retention, it might take that consumer a really long time to get caught back up. That might not be okay for the business. A lot of it comes down to understanding the business requirements for each application, both your downtime requirements and your message processing requirements with respect to duplicates and missed messages. But, yeah, from a really high level, cutting over your readers first is the right way to go.
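A minimal sketch of the trade-off on the consumer side, with the address and group name illustrative:

```java
import java.util.Properties;

public class CutoverConsumerConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "destination-cluster:9092"); // hypothetical
        props.put("group.id", "orders-processor"); // keep the same group.id as the source
        // With no committed offsets on the new cluster, this decides where you start:
        // "earliest" re-consumes everything still in retention (duplicates, long catch-up),
        // "latest" skips straight to new data (risks missing messages produced mid-cutover).
        props.put("auto.offset.reset", "earliest");
        return props;
    }
}
```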
That can also sometimes be challenging for some customers to take on, because especially if you're moving to a new way of authenticating, it requires a configuration change in all of your applications: not just flipping the bootstrap server to something new, but also actually changing your JAAS configuration if you're going to SASL/PLAIN, or having all the certificates in place if you're going to mTLS, things like that. Especially if you have a lot of disparate application teams, getting them to make these changes and be ready to go is always hard to coordinate. If you're manning the migration, it's sort of out of your control, and you just need to be as flexible as you can with these teams and set deadlines. I've worked with some customers where they've set a deadline, and they pretty much said, "Hey, if these teams haven't listened, we're going to cut over. If their stuff is broken, they need to jump online and fix it on the fly, because we can't really be in a situation where we're waiting on these teams that keep saying they're doing things, and they haven't done it."
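For a sense of scale, a SASL/PLAIN cutover touches roughly this much client configuration; the mechanism here is one option among several, and the endpoint and credentials are purely illustrative:

```java
import java.util.Properties;

public class NewClusterAuthConfig {
    public static Properties apply(Properties props) {
        // Beyond bootstrap.servers, cutover often means new auth settings too.
        props.put("bootstrap.servers", "destination-cluster:9093");
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                + "username=\"app-user\" password=\"app-secret\";"); // hypothetical credentials
        return props;
    }
}
```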
This has reminded me a bit too much of getting my kids to school somehow.
Yeah, it's the same idea.
If you're not in the car in five minutes, I'm going without you.
Yeah, yeah. You're not going to do that with super business critical use cases, I would think, but there-
Yeah, and I'm not actually going to do it with a seven-year-old, but I think-
Right, right. It's the same idea. That's why you want to do it in a lower environment too, because you can really see who's not listening, and then, especially when you're ready to go to prod, you know which doors to knock on and can be like, "Hey, look, we had this problem in our QAT environment. Can you please tell me that you actually made these changes? If you didn't, that's okay, but we need to be aware of it, so when we do the migration, we know what we can skip," that type of stuff.
Yeah. I guess you have the luxury of, you don't have to migrate all your consumer groups at once, right?
No, absolutely not. Yeah, and it all comes down to the interdependencies of things. At least from a consumer perspective, if you just have traditional producers and consumers, you can migrate your readers as needed. But, a lot of times, applications are written in a way where producing and consuming are combined. I'm not even just talking about Kafka Streams or KSQL; some customers have intertwined producers and consumers all in the same single application. When it's done that way, it's kind of hard to cut just the consumer part of that application over, because you'd have to change your application build and split that up.
Even if they have no logical connection, they might be in the same binary.
Right, they literally might just be in the same jar, or whatever has been built for their application language to deploy that app. That also comes down to dependencies as well, because the good thing is that while data is flowing, you don't have to worry too, too much about the writers themselves. But, if a reader is on the source, and the writer has been cut over to the destination, and that reader relies on the data being written by that producer, then that consumer needs to be cut over at the same time.
So, that's really where you can get yourself into trouble. If it's as easy as producers and consumers, cutting over all the readers first is the right way to go. When you have these intricate interdependencies, these applications that have producers and consumers built into the same binary, then you really need to make sure that every reader that depends on the messages being written by those applications gets moved with the application that's writing the data. Because, data will be replicated, but once you move that producer, it doesn't go the other way.
Yeah, you can't go back to where you were once you've moved the reader or the writer over.
Yeah, exactly. Yeah, and so then that reader will be kind of like, "Hey, where's my data?" That's where you can run into your own application problems. Again, it all comes back to having that map of who depends on what, who reads what, et cetera.
Okay, so going up the stack slightly from producers and consumers, what about Kafka streams applications?
Yeah, I think that's a great question. For Kafka Streams, and KSQL in this case as well, the one good thing about a migration is that you have full control over stopping your apps, ensuring everything has been synchronized, and then restarting them on the destination. There's a pretty common recommendation for disaster recovery, for example, that when you're using Kafka Streams and ksqlDB, since replication is asynchronous, you don't actually want to replicate the internal topics, the changelogs and the other internal topics created by Kafka Streams and KSQL. You just replicate the input topics, and then essentially restart your applications to rebuild all the state from the beginning on your destination. That's because, since replication is asynchronous, there's no guarantee that all of the internal topic data that's been written on your source has actually been replicated to the destination.
So, when you restart your apps over there, they're not going to run into race conditions and things like that where some topics are more caught up than others, et cetera. Whereas-
Oh, I see, yeah.
During a migration, since you're in a situation where you can be really confident that all the data has moved from your source cluster to your destination, you actually do have the flexibility to mirror all of the data, the input topics and the internal topics as well, and essentially restart your Kafka Streams and ksqlDB applications from where they left off, as long as you've kept the application ID and the KSQL service ID the same, so the consumer group and all that stuff is aligned.
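The key detail is keeping that identity stable; a small sketch, where the application ID and address are illustrative:

```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class MigratedStreamsConfig {
    public static Properties build() {
        Properties props = new Properties();
        // Keep the application.id identical to the source deployment: it determines
        // the consumer group and the changelog/repartition topic names, and therefore
        // which offsets and state the app picks back up on the destination.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-enricher"); // hypothetical
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "destination-cluster:9092");
        return props;
    }
}
```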
Okay. I can see that working for KSQL, but Kafka Streams, what if your Kafka Streams process has side effects like sending an email or something? You can't rerun that.
That's a good point. Yeah, if you're in a situation where your Kafka Streams application has some of these side effects, as you put it, what we would typically recommend, since all of your input data is being replicated over to your destination cluster, is that it may actually be beneficial to get ahead of it and run a parallel application that's not live, and have that new parallel Kafka Streams application rebuild all its data and essentially catch up to where your source application is running. Then, all you really have to do is cut over to the new Kafka Streams application that's fully caught up on your destination. The migration is less about migrating all the internals and more about having two parallel apps running and then just flipping a switch to make this your new live Kafka Streams application, if you will, on your destination.
So, you're saying I would run the new live K Streams application without actually sending all the emails?
Right, yeah, you would turn off certain functionality in your standby Kafka Streams app, but make sure everything is caught up. Then, when you're ready to switch over, shut down all pieces of the Kafka Streams application on the source, and then flip the switch to re-enable that additional functionality in your destination app.
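One hypothetical way to run that "dark" standby is to gate the side effect behind a flag, so the parallel app rebuilds identical state without actually emailing anyone; topic names and the flag are illustrative:

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

public class AlertingTopology {
    public static void build(StreamsBuilder builder, boolean sideEffectsEnabled) {
        KStream<String, String> alerts = builder.stream("alerts"); // hypothetical topic
        if (sideEffectsEnabled) {
            // Only the live deployment sends the emails; the standby skips this.
            alerts.foreach((key, value) -> sendEmail(key, value));
        }
        alerts.to("alerts-audit"); // the stream processing itself runs either way
    }

    static void sendEmail(String recipient, String body) {
        // placeholder for the real side effect
    }
}
```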
Okay, that seems very doable, but a little bit of a gotcha.
It is.
Something to be aware of.
Yeah, it is something to be aware of, absolutely. Honestly, with any application, not just Kafka Streams, anything that has these side effects, if you will, sometimes you're in a situation where it might be more beneficial to run things in a blue/green deployment, where you have a truly parallel set of applications on standby, with the additional functionality turned off, but fully caught up and running. So, you shut down everything on your source, and then all you have to do is flip that additional functionality back on, and your destination, your new cluster, if you will, is really your live cluster. For customers that can't really afford any downtime, or have to have as little downtime as possible, that's sometimes the way things have to be done. You essentially replicate all of the input data and rerun your producers and all of your applications, essentially anew, on your new cluster.
After you replicate that original input data, you essentially start your producers, if you will, from the last point of replication, then stop your replication and literally have a parallel set of applications running in the parallel cluster. A lot of times when you're producing, you can start from a specific timestamp, so you might want to replicate all of your data up to a certain point in time, or up to wherever you really can, so you at least get the historical data and you're not starting from scratch. Then, from there, you essentially have a parallel set of applications running on your secondary cluster. The cutover is really just shutting down your old cluster and making everything that depends on those apps use your new set of applications. That's your true parallel deployment. I'd say it's probably the most expensive approach, though, because you're running two parallel clusters for X amount of time while you do all your validations and things like that.
Yeah, and all your application machines. Yeah.
Exactly, but I think that might provide the most comfort for super highly critical use cases, because you could do all your validations and be sure everything is up to snuff on the new cluster before ever shutting anything down in your old cluster. But, everything has its pros and cons. But, I've seen some customers do it with really large implementations, though it's kind of hard for some people to swallow, because to your point, it's running everything in parallel. It's essentially a full separate setup. Depending on how long it might take to be comfortable cutting over, that's a lot of added cost.
Yeah, this is one of these things probably where you can say, "You can have it easy, cheap, safe. Pick two." Yeah. Okay, so we talked about Kafka Streams. We must also talk about connectors. I'm assuming that given that MirrorMaker 2 is based on connectors, connectors are fairly easy to migrate.
They are. They really are. I think that when you're fully self-managing your new cluster, you can just replicate your Connect internal topics, as well as any of the input or target topics for your connector. So, your source connector writes to a topic, your sink connector reads from it. Then, when you shut down your connector and restart it with the exact same configuration against your new cluster, it'll leverage those internal topics and essentially know where it left off and continue running as if nothing had changed. The one nuance to that is in Confluent Cloud; this is probably the only place where you don't actually have access to those Connect internal topics. I'm not sure if MSK Connect has the same sort of problem. I haven't looked into that in a while, but in Confluent Cloud for sure, you don't have access to those internal topics.
Mirroring the internal topics doesn't really give you any benefit if you're going to move from a self-managed connector to a fully managed connector in Confluent Cloud, because the fully managed connectors use their own internal topics that are obfuscated from the customer. So, you need to really think through that strategy as well, because typically your source connectors will start from the very beginning of whatever their source is and repopulate all that data, and your sink connectors will restart from the earliest offset and re-consume all of that data. Some connectors have configurations where you can start from specific timestamps. The Debezium CDC connectors, for example, can actually just take a snapshot of the schema and then only replicate new changes as they come in. So, you don't get all the historical data, and you might miss some things that happened between the time you cut over and restarted.
I think the GCS sink connector is maybe a good example where you can start from a specific timestamp, or maybe that's the source connector, but the gist is that some of these connectors have configurations where you do have some flexibility to start from different points in time. More often than not, though, you're stuck starting from the very beginning when moving to a fully managed connector. So, that's something to keep in mind. But, for self-managed, it's really just replicating those internal topics. As long as you keep the configurations the exact same when you move to your new Connect cluster and your new Kafka cluster, everything should work as is.
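Concretely, "keep the configurations the exact same" means the worker's group ID and the three internal topic names; a sketch of the relevant worker config, where the values are illustrative but the keys are standard Connect worker settings:

```properties
# Worker config for the Connect cluster pointed at the new Kafka cluster.
bootstrap.servers = destination-cluster:9092

# Keep these identical to the old worker so connectors resume where they left off,
# and mirror the three internal topics to the destination before starting the worker.
group.id = connect-cluster
config.storage.topic = connect-configs
offset.storage.topic = connect-offsets
status.storage.topic = connect-status

key.converter = org.apache.kafka.connect.json.JsonConverter
value.converter = org.apache.kafka.connect.json.JsonConverter
```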
Okay, and presumably you could migrate the cluster, stay self-managed on your connectors, and worry about that problem on another day.
It's very common. A lot of times, when migrating any component, what customers will typically do is migrate Kafka and then re-point their existing self-managed Connect cluster to the new Kafka cluster, and their existing self-managed schema registry to the new Kafka cluster. Then, if they plan to migrate to a new Connect cluster or move to fully managed connectors, or migrate to a new schema registry or move to a fully managed schema registry, they do that as a secondary phase, because you don't want to do too much at once. The core of your implementation is always going to be your data, so that should always be your priority. Then, the auxiliary components, the schemas and things like that, can always follow suit.
Yeah. It may make someone's heart sink to think they've got to do three migration projects this year, but it's better than doing one huge one, right?
Right, yeah. With schemas, you do sometimes have the flexibility to do them at the same time. Replicator has a schema translation functionality that actually translates the schemas as you move from one schema registry cluster to another while you're replicating the data, so you can sometimes take care of the schemas at the same time. Also, if you use schema-aware serializers, you can have the new schemas re-registered as the messages are deserialized and then re-serialized in the new cluster. There can be some added overhead there, but that's another way to take advantage of it. But, the thing to always keep in mind is that a lot of these replication technologies use byte array serializers, where all they're doing is really just replicating the bytes. When you do that with any schema, you've got to make sure that the magic bytes align when you're re-consuming that data downstream, and that the schema ID embedded in the message still points at the right schema. When you re-register schemas in the new cluster, you need to make sure that the old schema has the same ID in the new schema registry.
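For context, Confluent's wire format puts one magic byte (0x0) and then a 4-byte big-endian schema ID in front of every serialized payload. Replicating raw bytes means those IDs travel with the data, which is why the same schema must resolve to the same ID on the destination registry; a small sketch of pulling the ID out:

```java
import java.nio.ByteBuffer;

public class WireFormatCheck {
    // Confluent wire format: 1 magic byte (0x0) + 4-byte big-endian schema ID + payload.
    public static int schemaId(byte[] record) {
        ByteBuffer buf = ByteBuffer.wrap(record);
        if (buf.get() != 0x0) {
            throw new IllegalArgumentException("Not Confluent wire format");
        }
        return buf.getInt(); // the ID the destination registry must also resolve
    }
}
```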
Okay, that's a serious gotcha.
Yes, and so there are some tools to do that. The Schema Registry REST API is one of them: you can set the registry into import mode, and then you have full flexibility to align IDs between an old schema registry and a new schema registry. There's also, like I talked about before, Replicator's schema translation, where if you use the byte array serializer, it translates the schema and keeps the IDs aligned between the source and the destination. That makes sure that the deserialization downstream still works as intended. But, one thing that's typically not really thought through, and is, I guess, fairly uncommon but worth noting, is that not every schema registry works the same way with respect to how the magic byte is added to the Kafka message. The Cloudera Schema Registry actually writes the magic byte in a different way than the Confluent Schema Registry does.
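The import-mode flow against the Confluent Schema Registry REST API looks roughly like this; the URL, subject, and ID are illustrative, and the exact endpoints and preconditions (for example, an empty registry or subject) are worth confirming in the docs for your version:

```sh
# Put the registry (or a subject) into IMPORT mode:
curl -X PUT -H "Content-Type: application/json" \
  --data '{"mode": "IMPORT"}' http://new-registry:8081/mode

# In IMPORT mode you can register a schema under an explicit ID and version,
# so it matches what the replicated messages already reference:
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "...", "id": 42, "version": 1}' \
  http://new-registry:8081/subjects/orders-value/versions
```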
If you were to move from Cloudera's to the Confluent Schema Registry, for example, you can't just mirror the schemas topic or do a direct import of the schemas, because the way the magic byte has been formatted will not align with the schema-aware serializers in the Confluent cluster. So, that is a little bit more of a challenge. There are a few ways around it, but they're all going to be fairly custom, like writing a custom connector SMT, for example, that reformats the magic byte to align with the format that the Confluent Schema Registry uses.
You can write an application that uses the Cloudera set of deserializers and then re-serializes with the Confluent set to write it downstream. I don't want to get too into that, because it is fairly uncommon. But, if you are moving between two separate enterprise schema registries, just make sure that the wire formats, really the magic byte in this case, align between the two. If they don't, think through how you're going to make them line up, because that's one thing that can be forgotten. You'll run into it in a lower environment pretty quickly, so then it'll make you take a step back and-
Yeah, testing is going to show that up.
Figure it out. Exactly, yeah.
Okay. I suppose at that kind of migration point, there's also an opportunity to do things like make different compression algorithm choices for your topics and things like that, right?
Absolutely. Yeah, if you're not using compression in your original cluster, but want to compress the messages as they get written downstream, you have the flexibility to, more with Replicator and MirrorMaker in this case than Cluster Linking. You can compress the messages as they get written to your new cluster. You have full flexibility to do that there as well, because you can set compression at the topic level. But, it's usually more common to compress the messages with the producer and then let the topic inherit whatever the producer compression is. You need to be careful: if your producer is compressing with one algorithm, and your topic has a different one set, the broker has to decompress and then recompress before writing to the Kafka topic, so there's a fair amount of CPU overhead there. But, yeah, you're absolutely right. For all of these different configuration changes, there's a fair amount of flexibility when you're doing a migration.
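A minimal sketch of the producer-side approach, with the address and codec illustrative; topic-level compression.type defaults to "producer", which is what lets the broker store batches as sent and avoid the decompress/recompress cycle described above:

```java
import java.util.Properties;

public class CompressedProducerConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "destination-cluster:9092"); // hypothetical
        // Compress at the producer and let the topic inherit it (topic-level
        // compression.type = "producer" is the default).
        props.put("compression.type", "lz4");
        return props;
    }
}
```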
Again, it all comes down to figuring out what you want to do and then what solutions are available to actually allow you to do it. With Cluster Linking, for example, you won't be able to use any producer compression, because it's literally wire-level replication. It sets up a replica, like you said, and just moves the data over, so it won't compress it on the way. I guess you could have compression on the topic. We've never actually tried that out, but the problem is the mirror topic itself is a direct replica.
So, if you don't have compression on your topic already, I don't believe you can set that configuration on the destination. There is some work being done, I think, to allow more customization of the mirror topics to support a little bit more of this with Cluster Linking. For now, I think there are a few configurations that can be changed, but the most notable one is that you can add a prefix. Your naming conventions don't necessarily need to be the same between your source cluster and your destination cluster when using Cluster Linking. So, yeah, you can use a prefix, and every mirror topic you have will be prefixed.
I can see people using that as a chance to tidy up a few rough edges that they've gained over the years, right?
Exactly, yeah. The documentation notes a lot of this as well, but there are some configurations that can be tweaked in your downstream topics to differ from your upstream topics. It's just not as many as if you were to use a solution like Replicator or MirrorMaker, for example, where, again, you could change the topic partitioning and things like that.
Okay, yeah. Okay, I think I've got the overview. Stepping back maybe to start to put a cap on this, I'm wondering how long a project this tends to be, and which factors tend to influence the length of how long it takes?
Yeah, so I think from a length perspective, it definitely varies, as I'm sure you can imagine. I think that the strategy side of things is the most important. Working with somebody like a professional services solutions architect, or an expert that knows Kafka very well, is very important to come up with the appropriate strategy. But, if you have a good understanding of your application interdependencies and a good mapping of your data lineage and your data model, spinning up the solution to migrate the data really doesn't take too much time once your cluster is available. Again, it depends on the amount of changes you want to make. If you need to create a bunch of ACLs in your new cluster and you didn't have those on the old cluster, that can take some time.
If you're introducing a new topic partitioning scheme, you might have to do some performance testing in your existing cluster to see, if you're reducing partitions, how far you can reduce them without beginning to introduce performance implications and things like that. But if we're not talking about making too many crazy changes, and it's just moving the data over, getting the solution set up and well thought through really shouldn't take too much time. We're talking on the order of a week, maybe two weeks, just to get everything spun up and see the data move over. Then the coordination of moving your applications, things like that, can obviously take some time.
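The ACL recreation mentioned above is usually scriptable with the stock tooling; here is a small sketch, with a made-up principal, topic, and consumer group:

```
# Grant a consuming application read access on the new cluster
# (principal, topic, and group names are hypothetical):
kafka-acls --bootstrap-server new-cluster:9092 --add \
  --allow-principal User:orders-service \
  --operation Read --topic orders --group orders-consumers
```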
That's going to be really case dependent, right?
Exactly. If you're the person that says, "Hey, I'm going to take a two-hour outage on a Saturday and shut everything down and re-point it, and I have full control over my applications," that might be doable. But, if you're in a situation where you don't have full control over those applications, you have to work with individual teams and really understand who can do what. Then, you're going to see added time there.
Yeah, depending on your organization, cajoling other departments can take an almost unlimited amount of time.
Yeah, yeah, and that's where I've seen a lot of trouble, too. Almost every customer has a very aggressive timeline: hey, we're going to get 10 clusters done in seven weeks, and everything is going to be great. Seven months later, you're like, "So, where are we?" Well, we've got two clusters done, and these disparate teams are struggling with this and that. It's just one of those things where you can only control what you can control.
Yeah, yeah. On this podcast, we often come back to this: we have no control over the human element. The best you can do is negotiate.
Yeah, that's exactly right.
Okay, well, I feel very reassured. Do you want to give me a parting thought or warning?
Yeah, so I will say I may have made this sound very easy on the podcast, and from an implementation perspective, I think it is. But remember, all of the complexity is in your strategy. Having a well-thought-out strategy, understanding the tooling that's in place and what each option can do, bridging any networking gaps, and taking all the changes you want to make into consideration, that's where the true challenge is. The migration of the data itself and migrating applications is just switching configurations and really just rounding up a number of groups of people.
But, the true challenge in any migration is the strategy behind it and making sure that the tools you use are suitable for what you're really trying to do. At the end of the day, it just requires a little bit of reading and some trial and error, or working with somebody that's done it before. I can say I haven't ever failed with a good strategy. Any failure I've had with a migration has been because the strategy wasn't well-thought-out, there were gaps, or somebody said they wanted to do something well after we'd started migrating, things like that.
Okay, so for the best chance of success, plan. Measure twice, cut once, as always.
Exactly. That's the only thing you can do. Yeah.
Okay. Well, thank you very much, Michael. I'm not sure this is going to come into my life immediately, but for plenty of listeners, this will happen to them someday. So, it's nice to be aware.
It was my pleasure, and thank you for taking the time as well. It was good to see you.
Cheers. We'll catch you again. Bye.
Until next time, take care.
Thank you, Michael. Now, if I'm being completely frank, whilst I found Michael's guide very thorough, very reassuring, I'm still a paranoid programmer at heart. So, I'd definitely be doing a few test runs before the final launch. But, I'm sure everything he said is going to come up, and I feel forewarned, and that's the most important thing. Before we go, if you want to learn more about Kafka, from the gory details of how MirrorMaker and Cluster Linking work to the simpler, more day-to-day stuff like how to improve throughput or how to write your first consumer, then check out Developer.Confluent.io, which is our free site to teach you everything we know about Apache Kafka.
Among many other accolades, it was recently voted the best source for learning Kafka by a comment I saw on Reddit yesterday. That's a pretty good endorsement. That will do to make me happy for the week. If you have your own comments, do get in touch. My contact details are always in the show notes. There are comment boxes there and like buttons and share buttons in your app to help you spread the love of this podcast. But, whatever buttons you have, whatever you do next, the most important thing is the opportunity to sit down and talk tech with interesting people. So, it remains for me to thank Michael Dunn for joining us and you for listening. I've been your host, Kris Jenkins, and I'll catch you next time.
Migrating Apache Kafka® clusters can be challenging, especially when moving large amounts of data while minimizing downtime. Michael Dunn (Solutions Architect, Confluent) has worked in the data space for many years, designing and managing systems to support high-volume applications. He has helped many organizations strategize, design, and implement successful Kafka cluster migrations between different environments. In this episode, Michael shares some tips about Kafka cluster migration with Kris, including the pros and cons of the different tools he recommends.
Michael explains that there are many reasons why companies migrate their Kafka clusters. For example, they may want to modernize their platforms, move to a managed cloud service, or consolidate clusters. He tells Kris that creating a plan and selecting the right tool before getting started is critical for reducing downtime and minimizing migration risks.
The good news is that a few tools can facilitate moving large amounts of data, topics, schemas, applications, connectors, and everything else from one Apache Kafka cluster to another.
Kafka MirrorMaker/MirrorMaker 2 (MM2) is a stand-alone tool for copying data between two Kafka clusters. It is built on the Kafka Connect framework and uses a set of source connectors to replicate topics from a source cluster into the destination cluster.
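For a sense of scale, a one-way MM2 replication flow can be described in a single properties file. This is a minimal sketch with placeholder cluster aliases and addresses:

```
# mm2.properties -- minimal one-way replication from "source" to "dest"
clusters = source, dest
source.bootstrap.servers = old-cluster:9092
dest.bootstrap.servers = new-cluster:9092

# Replicate every topic in one direction only:
source->dest.enabled = true
source->dest.topics = .*
dest->source.enabled = false

# Run it with the script shipped in the Kafka distribution:
#   bin/connect-mirror-maker.sh mm2.properties
```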
Confluent Replicator allows you to replicate data from one Kafka cluster to another. Replicator is similar to MM2, but the difference is that it’s been battle-tested.
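Replicator runs as a Kafka Connect source connector. The sketch below is from memory and should be verified against the Replicator documentation; the name, addresses, and topic pattern are placeholders:

```
# replicator.properties -- a sketch of a Replicator connector configuration
name=migration-replicator
connector.class=io.confluent.connect.replicator.ReplicatorSourceConnector
src.kafka.bootstrap.servers=old-cluster:9092
dest.kafka.bootstrap.servers=new-cluster:9092
topic.regex=.*
# Replicator copies raw bytes, so byte-array converters are the usual choice:
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
# It can also rename topics on the way over, for example:
#   topic.rename.format=${topic}.replica
```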
Cluster Linking is a powerful tool offered by Confluent that lets you mirror topics from an Apache Kafka 2.4+/Confluent Platform 5.4+ source cluster to a Confluent Platform 7+ destination cluster, where the mirror topics remain read-only. It is also available as a fully managed service in Confluent Cloud.
At the end of the day, Michael stresses that coupled with a well-thought-out strategy and the right tool, Kafka cluster migration can be relatively painless. Following his advice, you should be able to keep your system healthy and stable before and after the migration is complete.