Senior Curriculum Developer
Understand the importance of having a hybrid cloud and the four main reasons organizations adopt hybrid and multi-cloud architectures.
Have you ever gone to eat soup and found that all you had were clean forks?
Or a salad, when all you had was a clean spoon?
As a child, I remember convincing myself that eating soup with a fork was a better decision than taking the time and effort to clean a spoon.
Needless to say, I didn’t last long before I just started drinking the soup directly from the bowl.
The same is true of selecting our architecture: we need to select the right tool for the job.
Before we get into talking about the different options for your hybrid and multi-cloud architecture,
I think it’s important that we understand some of the current approaches enterprises have been using in their hybrid and multi-cloud architectures.
Most organizations use existing tools and processes to move data between on-premise and cloud environments.
This includes batch data transfer, ETL tools, existing messaging systems, APIs, and lots of “do it yourself” custom engineering work.
These approaches result in a lot of point-to-point interconnections between systems within and across environments.
Different teams across different business lines usually decide on their own how to design data pipelines for each project, with no standard processes or tooling across the organization.
Periodic batch jobs mean different copies of data exist in different places at different times, and stale information is being used to run the business.
Expensive and valuable engineering resources are wasted on data pipeline projects instead of focusing on building new capabilities for customers or internal stakeholders.
Fragile point-to-point interconnections between applications, data stores, and systems each need to be networked, secured, monitored, and maintained, and any of them can easily break.
Security and data governance challenges increase as more interconnections are established and as disparate teams roll their own solutions for data movement.
Adding new cloud services makes these problems even worse.
As more interconnections are added, complex new cloud networking and security issues emerge, along with additional compliance and data laws that must be addressed across all global regions.
All of the architectural challenges enterprises were already facing with their on-premise architecture get amplified with the addition of new cloud environments,
and some new cloud-specific challenges arise as well.
This means innovation slows down, costs and risk increase, architectures become more brittle, and securing everything gets much more difficult.
All of this friction hinders organizations from realizing the benefits they were after by moving to the cloud in the first place.
This is where having a hybrid and multi-cloud architecture comes in.
It acts as a central nervous system that keeps your information in sync, no matter where it’s located, which platform it uses, or where your users are.
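To make the contrast with point-to-point interconnections a little more concrete, here is a rough back-of-the-envelope sketch. It assumes the worst case, where every system exchanges data directly with every other system, and compares that with each system keeping a single connection to a shared central platform. The function names are made up for illustration, and real architectures usually sit somewhere in between.

```python
def point_to_point_links(systems: int) -> int:
    """Worst case: every pair of systems has its own direct interconnection."""
    return systems * (systems - 1) // 2


def central_hub_links(systems: int) -> int:
    """Each system keeps one connection to a shared central platform."""
    return systems


if __name__ == "__main__":
    # Compare how the two approaches grow as more systems are added.
    for count in (5, 10, 50):
        print(
            f"{count} systems: "
            f"{point_to_point_links(count)} point-to-point links vs "
            f"{central_hub_links(count)} links through a central platform"
        )
```

The point-to-point count grows roughly with the square of the number of systems, while the central approach grows linearly, which is one way to picture why every added interconnection makes things harder to secure and maintain.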
There are four main reasons organizations set up and configure their hybrid and multi-cloud architectures.
They are global access to data, cloud migration and modernization, data sharing, and disaster recovery.
Most other use cases fall into one of these categories.
Global access to data, also sometimes referred to as Global Replication, ensures that your data is accessible from multiple locations across the globe.
This allows you to unify your data from every region to create a global real-time streaming architecture.
You can think of this as a type of content delivery network, or CDN, for your Kafka events.
Cloud migration and modernization refers to moving from an on-premise Kafka or Confluent Platform cluster to a cloud-hosted platform.
This could also encompass migrating from an older infrastructure to a new infrastructure or modernizing your existing architecture.
Data Sharing allows different teams, lines of business, or organizations to have access to the data they need while isolating data they shouldn’t access.
This means replicating your data and topics in a way that allows teams to access their own data set without creating a complex, slow database that doesn’t scale with your streaming data.
An example of this might be protecting your Tier 1, customer-facing applications and workloads from disruption by creating a read-replica cluster for lower-priority applications and workloads,
giving access to the same data without reading from the same system.
A Disaster Recovery cluster provides failover should your primary cluster experience a data center failure, regional outage, network issue, or another type of disaster.
The more robust your Disaster Recovery plan is, the faster your business can get back to creating and processing data.
One of the most powerful features of Cluster Linking, Confluent Replicator, and MirrorMaker 2 is their support for replication patterns.
These replication patterns allow you to take your data and customize it for your architecture, customers, and internal employees.
Active/Active, or high availability, is where Cluster A replicates to Cluster B, and Cluster B replicates to Cluster A.
Active/Passive, or Active/Standby, is where Cluster A replicates to Cluster B, and B serves as a passive, or standby, cluster. This pattern is often used in disaster recovery and high availability scenarios.
Aggregation, or many-to-one, is where you have one cluster that is aggregating data from multiple clusters, represented here by Cluster K.
Fan-out, or one-to-many, is where one cluster is replicated to multiple other clusters.
And forwarding is where A replicates to B, B to C, and C to D.
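One way to picture these five patterns is as sets of directed links between clusters. The sketch below is only an illustration in plain Python, not how Cluster Linking, Confluent Replicator, or MirrorMaker 2 are actually configured. The cluster names beyond those mentioned above (for example, the three sources feeding Cluster K) and the helper functions are assumptions made up for this example.

```python
# Each pattern is a list of directed links: (source cluster, destination cluster).
# Cluster names are illustrative placeholders.
PATTERNS = {
    "active/active": [("A", "B"), ("B", "A")],
    "active/passive": [("A", "B")],
    "aggregation": [("A", "K"), ("B", "K"), ("C", "K")],
    "fan-out": [("A", "B"), ("A", "C"), ("A", "D")],
    "forwarding": [("A", "B"), ("B", "C"), ("C", "D")],
}


def direct_destinations(links, source):
    """Clusters that receive data directly from `source`."""
    return sorted(dst for src, dst in links if src == source)


def reachable_clusters(links, source):
    """Every other cluster that data written to `source` eventually reaches,
    following the replication links hop by hop."""
    seen = set()
    frontier = [source]
    while frontier:
        current = frontier.pop()
        for src, dst in links:
            # Skip clusters already visited and never loop back to the origin.
            if src == current and dst != source and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return sorted(seen)


if __name__ == "__main__":
    for name, links in PATTERNS.items():
        print(f"{name}: data written to A reaches {reachable_clusters(links, 'A')}")
```

In this model, forwarding reaches clusters B, C, and D through a chain of hops, while fan-out reaches the same three clusters directly from A. Which shape fits best depends on your scenario.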
All three solutions we’ll talk about in this course, MirrorMaker 2, Confluent Replicator, and Cluster Linking, support these scenarios, albeit in slightly different ways.
In the next few modules, we’ll introduce each solution and talk about some of the benefits and considerations you need to keep in mind, with the goal of giving you the knowledge to pick the one that works for your scenario and your goals.
If you aren't already on Confluent Developer, head there now using the link in the video description to access the rest of this course, the hands-on exercises, and additional resources.