Senior Curriculum Developer
Learn the high-level goals of a hybrid cloud architecture and the advantages it brings, including flexibility, compliance and security, and developer efficiency.
Before we get into the primary use cases for creating hybrid and multi-cloud architectures, I wanted to review three high-level goals to keep in mind as we think about each of the use cases outlined in this module.
First, your hybrid cloud architecture should be flexible.
Good hybrid cloud architecture allows your systems to scale up or down based on demand, absorbs outages and downtime without significantly impacting your systems and users, and ties all your systems together with the ability to share data.
The second is to meet compliance and regulatory requirements. Part of having a system that spans the globe means that you need to adhere to local rules and regulations for your users' data.
Being able to meet those requirements while still sharing key aspects of your architecture needs to be a primary feature.
Third, workload management.
As we mentioned earlier in the course, there are some aspects of your architecture that might never be cloud-based.
We need to simplify connecting the on-premises systems with the cloud so that your overall architecture is manageable and doesn’t require as many special cases and exceptions.
Now that we’ve talked about our goals, let’s talk about the multiple use cases for setting up and configuring a multi-cloud or hybrid architecture.
While there are additional use cases that could be considered, for the most part, they could be simplified into these scenarios.
Each one follows a similar pattern: create connections between multiple clusters and then build the infrastructure depending on the use case you care about.
While this module doesn’t go into great detail on any of these scenarios, I think it’s important to present the high-level use case and point you toward resources that will help you continue down the path.
The scenarios we’ll talk about in this module are:
Global Access to Data, Cloud Migration and Modernization, Data Sharing, and lastly, High Availability and Disaster Recovery.
Let’s take a look at what each of these entails, and how MirrorMaker 2, Confluent Replicator, and Cluster Linking handle each scenario.
In a lot of ways the world has gotten smaller as technology has advanced.
However, the architecture issues facing most organizations haven't gotten any smaller, whether it’s an acquisition, older hardware that is tied to a datacenter or a certain location, or trying to aggregate data from multiple cloud providers.
The reality is that managing all the different options quickly becomes a full-time job.
What we need is a centralized way to manage our architecture no matter where or on which platform the data resides.
Global access to data makes sure that our customers or employees have the access they need.
Latency and the speed at which your customers or employees have access to data can be a deal breaker.
The goal is to ensure that your data is accessible from multiple locations across the globe.
That way, your customers or employees can expect software to behave in a certain way regardless of whether they are at home or traveling in a different country.
This allows you to unify your data from every region to create a global real-time streaming architecture.
Both Confluent Replicator and MirrorMaker 2 have similar issues.
For each location, you need to set up a Connect cluster to manage and tweak.
This means that scaling becomes more difficult, and any configuration changes become more labor-intensive and costly.
With either Replicator or MirrorMaker 2, you also have to decide where to initiate the connection: at the source, where you’ll have to tune and tweak settings based on performance, or at the destination, where you’ll need an exception in your firewall rules.
Neither option is ideal.
Cluster Linking eliminates this burden since the brokers just talk to each other.
Cloud-native geo-replication is built into the brokers so that you can easily scale operations, in addition to your data.
Cloud Migration and Modernization refers to moving from an on-premises Kafka or Confluent Platform cluster to a cloud-hosted platform.
This could also encompass migrating from an older infrastructure to new infrastructure.
Since the goal of Cloud Migration and Modernization is to transfer the data, running connectors, ACLs, streaming applications, ksqlDB streams and other aspects of your cluster, many steps and processes will be the same no matter which option you select.
The main benefit of using Cluster Linking is that you get exact replicas of your topics, so your consumers can resume at the correct offset.
This enables you to get your system up and running faster without worrying about duplicates.
However, because the replicas are exact, you won't be able to change the naming convention of your topics.
If you need to rename topics, your best bet would be to use something like Confluent Replicator.
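To make the offset point a little more concrete, here is a minimal Python sketch. It is a toy in-memory model, not the Cluster Linking or Replicator API: the `TopicLog` class, the topic names, and the offsets are all made up for illustration. It assumes an exact replica keeps the source offsets, while a connector-style copy produces the records again and so starts counting from zero. Real tools can offer offset translation, which this sketch leaves out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TopicLog:
    """Toy stand-in for a single topic partition: values keyed by offset."""
    name: str
    records: Dict[int, str] = field(default_factory=dict)
    next_offset: int = 0

    def append(self, value: str) -> int:
        """Store a value at the next offset and return that offset."""
        offset = self.next_offset
        self.records[offset] = value
        self.next_offset += 1
        return offset

    def read_from(self, offset: int) -> List[str]:
        """Return every value at or after the given offset, in order."""
        return [v for o, v in sorted(self.records.items()) if o >= offset]


def mirror_exact(source: TopicLog) -> TopicLog:
    """Exact replica: same topic name and same offsets as the source."""
    return TopicLog(source.name, dict(source.records), source.next_offset)


def copy_by_reproducing(source: TopicLog, new_name: str) -> TopicLog:
    """Connector-style copy: records are produced again under a new name,
    so the target assigns its own offsets starting at zero."""
    target = TopicLog(new_name)
    for _, value in sorted(source.records.items()):
        target.append(value)
    return target


# Hypothetical source topic whose oldest records were already removed,
# so the first surviving offset is 100 rather than 0.
source = TopicLog("orders", next_offset=100)
for i in range(5):
    source.append(f"order-{i}")

committed = 103  # the next offset the consumer group expects to read

mirror = mirror_exact(source)
renamed = copy_by_reproducing(source, "emea.orders")

resumed_on_mirror = mirror.read_from(committed)  # picks up where it left off
resumed_on_copy = renamed.read_from(committed)   # offset 103 does not exist here

print(resumed_on_mirror)
print(resumed_on_copy)
```

In this toy model, the consumer resumes cleanly on the exact replica, while the renamed copy has no offset 103 at all. That is the trade-off in a nutshell: the copy gives you freedom to rename, but the committed offsets no longer line up on their own.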
If you are interested in learning more about cluster or data migration, be sure to check out the Confluent documentation for a deeper dive.
Data Sharing allows different teams, lines of business, or organizations to have access to the data they need while isolating data they shouldn’t access.
This means replicating your data and topics in a way that allows teams to access their own data set without creating a complex, slow database that doesn’t scale with your streaming data.
It can be easier and more secure to provide a cluster that doesn’t contain data that your customers or employees shouldn’t access.
A good example of data sharing is among the US Government agencies that are required to share large volumes of data to enable them to execute on their critical missions.
Sharing data across agencies is required for implementing the US immigration and naturalization processes, issuing passports and visas, and assimilating border migrants, asylum seekers, and refugees into the US.
They also have user data that can’t be transferred outside of the country for regulatory purposes.
Being able to select the topics they want to share and only sharing those allows the government to easily maintain compliance.
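Here is one way to picture that kind of topic selection, as a small Python sketch. It is only an illustration under assumptions: the topic names and the `public.*` naming convention are hypothetical, and the `topics_to_share` helper is not part of any Confluent tool. It simply filters a cluster's topic list down to the ones that match an allowlist of patterns.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def topics_to_share(all_topics: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Return, in sorted order, the topics that match at least one allowlist pattern."""
    allow = list(patterns)
    return sorted(
        topic for topic in all_topics
        if any(fnmatchcase(topic, pattern) for pattern in allow)
    )


# Hypothetical topics on the source cluster.
cluster_topics = [
    "public.visa-status",
    "public.passport-status",
    "restricted.applicant-pii",
    "internal.audit-log",
]

# Only topics under the (assumed) "public." prefix are shared.
shared = topics_to_share(cluster_topics, ["public.*"])
withheld = sorted(set(cluster_topics) - set(shared))

print(shared)
print(withheld)
```

Anything that does not match the allowlist never leaves the source cluster, which is the property that makes the compliance story easier to reason about.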
The challenge they faced was that the current processes for sharing data were not real-time and required agencies to establish a set of queries to gain access to data through request-response models.
In addition, the majority of data was stored in legacy transactional systems.
Therefore, US government agencies were required to access or replicate the data from the transactional systems.
That being said, the data was so siloed and the systems were so brittle that US government agencies found themselves being asked to leverage spreadsheets to share data in times of crisis such as during the Afghanistan airlift of 2021.
That is obviously not a real-time, scalable, or even secure system.
This situation was discussed in more detail during a session of Current 2022, and I highly recommend checking it out if you are interested.
Even without watching the session, you can see how setting up a real-time streaming system that only shares isolated data allows the government to maintain compliance and improve the workflow for employees and departments.
Be sure to check out this fantastic tutorial on how to share your data across Clusters, Regions, and Clouds on the Confluent Documentation site.
By the end of the tutorial, you will have configured two clusters using Cluster Linking to create mirror topics and share topic data across the clusters.
You will also learn how to stop mirroring to make the topic writable and verify that the two topics have diverged.
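If it helps to see that last step before you try the tutorial, here is a toy Python model of the idea. It is not the Cluster Linking API: the `MirrorTopic` class and its method names are hypothetical, and the only behavior it assumes is the one described above, namely that a mirror topic follows its source until mirroring is stopped, after which it becomes writable and the two topics can diverge.

```python
from typing import List


class MirrorTopic:
    """Toy mirror topic: follows a source list until mirroring is stopped."""

    def __init__(self, source: List[str]) -> None:
        self.source = source
        self.records = list(source)
        self.mirroring = True

    def sync(self) -> None:
        """Copy the source's current contents, but only while still mirroring."""
        if self.mirroring:
            self.records = list(self.source)

    def produce(self, value: str) -> None:
        """Write directly to the mirror; rejected while it is still mirroring."""
        if self.mirroring:
            raise PermissionError("mirror topic is read-only while mirroring")
        self.records.append(value)

    def stop_mirroring(self) -> None:
        """Detach from the source so the topic becomes writable."""
        self.mirroring = False


source_topic = ["a", "b"]
mirror = MirrorTopic(source_topic)

# While mirroring, new source records show up on the mirror.
source_topic.append("c")
mirror.sync()
in_sync_before_stop = mirror.records == source_topic

# Direct writes to the mirror are rejected at this point.
try:
    mirror.produce("x")
    write_rejected = False
except PermissionError:
    write_rejected = True

# After stopping, the mirror accepts writes and no longer follows the source.
mirror.stop_mirroring()
mirror.produce("x")
source_topic.append("d")
mirror.sync()  # no effect once mirroring has stopped
diverged = mirror.records != source_topic

print(in_sync_before_stop, write_rejected, diverged)
```

The tutorial verifies the same thing on real clusters: once mirroring stops, records written to one topic no longer appear on the other.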
Last, but not least, is High Availability and Disaster Recovery.
The goal of this type of cluster is to provide failover should your primary cluster experience an outage or disaster.
The more robust your Disaster Recovery plan is, the faster your business can get back to creating and processing data.
A disaster recovery plan often requires multi-datacenter Apache Kafka deployments where data centers are geographically dispersed.
If disaster strikes—catastrophic hardware failure, software failure, power outage, denial of service attack, or any other event that causes one data center to completely fail—Kafka continues running in another data center until service is restored.
A multi-datacenter solution with a disaster recovery plan ensures that your event streaming applications continue to run even if one data center fails.
While all three options can replicate data, it can be more difficult to set up Replicator and MirrorMaker 2 as both require a greater amount of overhead in managing the connectors, whether that’s updates to your infrastructure or configuration changes that need to be made within each cluster.
As an example, offset translation in Confluent Replicator is only for Java consumers and does not work for other types of applications.
Since Cluster Linking allows a direct connection between your clusters with offsets preserved and byte-for-byte data replication, it can be easier to fail over to your backup cluster.
I highly recommend taking a look at the documentation to read more about disaster recovery as well as the provided tutorial that walks you through setting up and testing a disaster recovery scenario.
In the next module, we’ll dive deeper into setting up high availability or disaster recovery using Cluster Linking.
If you’ve made it this far in the video you might have noticed that most of our use cases are best solved by Cluster Linking.
For most organizations, Cluster Linking will save time, effort, and, in the end, money. I highly recommend that you take a look at the hands-on sections to get a better feel for how each solution is set up and configured.
Then configure a cluster yourself using your own Confluent Cloud account and the additional credits provided with this course.
You’ll find the hands-on sections on developer.confluent.io, using the link in the video description.
If you aren't already on Confluent Developer, head there now using the link in the video description to access the rest of this course, the hands-on exercises, and additional resources.