course: Hybrid and Multicloud Architecture with Apache Kafka

Understanding Hybrid and Multicloud Architectures

6 min

Dan Weston

Senior Curriculum Developer

Understand the importance of a hybrid cloud architecture and the four main reasons organizations adopt one:

Global Access to Data
Cloud Migration and Modernization
Data Sharing
Disaster Recovery

Have you ever gone to eat soup and found that all you had were clean forks?

Or a salad when all you have is a clean spoon?

As a child, I remember convincing myself that eating soup with a fork was a better decision than taking the time and effort to clean a spoon.

Needless to say, I didn’t last long before I just started drinking the soup directly from the bowl.

The same is true of selecting an architecture: we need to select the right tool for the job.

Before we get into the different options for your hybrid and multi-cloud architecture, I think it’s important that we understand some of the approaches enterprises currently use in their hybrid and multi-cloud architectures.

Most organizations use existing tools and processes to move data between on-premises and cloud environments.

This includes batch data transfer, ETL tools, existing messaging systems, APIs, and lots of “do it yourself” custom engineering work.

These approaches result in a lot of point-to-point interconnections between systems within and across environments.

Different teams across different business lines usually decide on their own how to design data pipelines for each project, with no standard processes or tooling across the organization.

Periodic batch jobs mean different copies of data exist in different places at different times, so stale information is being used to run the business.

Expensive and valuable engineering resources are wasted on data pipeline projects instead of focusing on building new capabilities for customers or internal stakeholders.

Fragile point-to-point interconnections between applications, data stores, and systems each need to be networked, secured, monitored, and maintained, and any of them can easily break.

Security and data governance challenges increase as more interconnections are established and as disparate teams roll their own solutions for data movement.
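
To make the growth in interconnections concrete, here is a minimal sketch. It assumes the worst case, where every system connects directly to every other system, and compares that with a hypothetical setup where each system connects once to a shared platform. The function names are illustrative only, and real environments usually sit somewhere between the two numbers.

```python
def point_to_point_links(n_systems: int) -> int:
    """Worst case (an assumption): every system is wired directly to every other one."""
    return n_systems * (n_systems - 1) // 2


def shared_platform_links(n_systems: int) -> int:
    """Hypothetical comparison: each system connects once to a shared platform."""
    return n_systems


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, point_to_point_links(n), shared_platform_links(n))
    # prints:
    # 5 10 5
    # 10 45 10
    # 20 190 20
```

Each of those links has to be networked, secured, and monitored, which is why the count matters more than any single connection.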

Adding new cloud services makes these problems even worse.

As more interconnections are added, complex new cloud networking and security issues emerge, along with additional compliance and data laws that must be addressed across all global regions.

All of the architectural challenges enterprises were already facing with their on-premises architecture get amplified with the addition of new cloud environments, and some new cloud-specific challenges arise as well.

This means innovation slows down, costs and risk increase, architectures become more brittle, and securing everything gets much more difficult.

All of this friction hinders organizations from realizing the benefits they were after by moving to the cloud in the first place.

This is where having a hybrid and multi-cloud architecture comes in.

It acts as a central nervous system that keeps information in sync, no matter where the data is located, which platform it runs on, or where your users are.

There are four main reasons organizations set up and configure their hybrid and multi-cloud architectures.

They are global access to data, cloud migration and modernization, data sharing, and disaster recovery.

Most other use cases fall into one of these categories.

Global access to data, also sometimes referred to as Global Replication, ensures that your data is accessible from multiple locations across the globe.

This allows you to unify your data from every region to create a global real-time streaming architecture.

You can think of this as a type of content delivery network, or CDN, for your Kafka events.

Cloud migration and modernization refers to moving from an on-premises Kafka or Confluent Platform cluster to a cloud-hosted platform.

This could also encompass migrating from an older infrastructure to a new infrastructure or modernizing your existing architecture.

Data sharing allows different teams, lines of business, or organizations to have access to the data they need while isolating data they shouldn’t access.

This means replicating your data and topics in a way that lets teams access their own data set without creating a complex, slow database that doesn’t scale with your streaming data.

An example of this might be protecting your Tier 1, customer-facing applications and workloads from disruption by creating a read-replica cluster for lower-priority applications and workloads, giving access to the same data without reading from the same system.

A disaster recovery cluster provides failover should your primary cluster experience a data center failure, regional outage, network issue, or another type of disaster.

The more robust your disaster recovery plan is, the faster your business can get back to creating and processing data.

One of the most powerful features of Cluster Linking, Confluent Replicator, and MirrorMaker 2 is their support for replication patterns.

These replication patterns allow you to take your data and customize it for your architecture, customers, and internal employees.

Active/Active, or high availability, deployments are where cluster A replicates to cluster B, and cluster B replicates to cluster A.

Active/Passive, or Active/Standby, is where cluster A replicates to cluster B, and B serves as a passive or standby cluster. This pattern is often used in disaster recovery and high availability scenarios.

Aggregation, or many-to-one, is where one cluster aggregates data from multiple clusters, represented here by cluster K.

Fan-out, or one-to-many, is where one cluster is replicated to multiple other clusters.

Forwarding is where A replicates to B, B to C, and C to D.
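
One way to keep these patterns straight is to think of each one as a set of directed replication links between clusters. The sketch below models only that idea. The function names and the `Link` type are hypothetical, and nothing here calls Cluster Linking, Confluent Replicator, or MirrorMaker 2; it simply lists which cluster would replicate to which.

```python
from typing import List, Tuple

# A replication link as (source cluster, destination cluster). Illustrative type only.
Link = Tuple[str, str]


def active_active(a: str, b: str) -> List[Link]:
    """Both clusters replicate to each other."""
    return [(a, b), (b, a)]


def active_passive(active: str, standby: str) -> List[Link]:
    """Only the active cluster replicates; the standby just receives."""
    return [(active, standby)]


def aggregation(sources: List[str], target: str) -> List[Link]:
    """Many-to-one: every source replicates into a single target."""
    return [(source, target) for source in sources]


def fan_out(source: str, targets: List[str]) -> List[Link]:
    """One-to-many: a single source replicates to every target."""
    return [(source, target) for target in targets]


def forwarding(chain: List[str]) -> List[Link]:
    """Each cluster replicates to the next one in the chain."""
    return list(zip(chain, chain[1:]))


if __name__ == "__main__":
    print(active_active("A", "B"))            # [('A', 'B'), ('B', 'A')]
    print(aggregation(["A", "B", "C"], "K"))  # [('A', 'K'), ('B', 'K'), ('C', 'K')]
    print(forwarding(["A", "B", "C", "D"]))   # [('A', 'B'), ('B', 'C'), ('C', 'D')]
```

Seen this way, the patterns differ only in the direction and number of links, which is a useful frame when comparing how each replication tool sets them up.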

All three solutions we’ll talk about in this course, MirrorMaker 2, Confluent Replicator, and Cluster Linking, support these patterns, albeit in slightly different ways.

In the next few modules, we’ll introduce each solution and talk about some of the benefits and considerations you need to keep in mind, with the goal of giving you the knowledge to pick the one that works for your scenario and your goals.

If you aren't already on Confluent Developer, head there now using the link in the video description to access the rest of this course, the hands-on exercises, and additional resources.
