What is Kafka Connect, and how does it work? Find answers to the most commonly asked questions about Kafka integration and the connector ecosystem.
Kafka Connect is a tool that provides integration for Kafka with other systems, both sending data to and receiving data from them. It is part of Apache Kafka. Kafka Connect is configuration-driven, meaning you don't need to write any code to use it.
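To sketch what "configuration-driven" means: a connector is defined entirely by key-value settings. The example below uses the FileStreamSource connector that ships with Apache Kafka; the file path and topic name are placeholders.

```python
import json

# A connector is just configuration: no code, only key-value settings.
# FileStreamSourceConnector ships with Apache Kafka; the file path and
# topic name here are illustrative.
connector_config = {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "file": "/tmp/input.txt",
    "topic": "file-lines",
}

# This JSON is what you would submit to the Kafka Connect REST API.
print(json.dumps(connector_config, indent=2))
```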
Kafka Connect manages crucial components of scalable and resilient integration, including:
With Kafka Connect, you can use hundreds of existing connectors, or you can write your own. You can use Kafka Connect with managed connectors in Confluent Cloud or run it yourself. Kafka Connect is deployed as its own process (known as a worker), separate from the Kafka brokers.
Learn more about Kafka Connect in this free course.
Kafka Connect is used for integrating external systems with Kafka. This includes:
Learn more about Kafka Connect in a free course.
The component that runs connectors in Kafka Connect is known as a worker.
Many connectors are also available as a fully managed service in Confluent Cloud.
You can run Kafka Connect yourself or use it as a fully managed service in Confluent Cloud.
If you are running Kafka Connect yourself, there are two steps to creating a connector in Kafka Connect:
Use the REST API to create an instance of a connector:
curl -X PUT -H "Content-Type:application/json" \
    http://localhost:8083/connectors/sink-elastic-01/config \
    -d '{
      "connector.class" : "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
      "topics"          : "orders",
      "connection.url"  : "http://elasticsearch:9200",
      "type.name"       : "_doc",
      "key.ignore"      : "false",
      "schema.ignore"   : "true"
    }'
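The curl call above can also be sketched in Python. This is a minimal illustration using the standard library; the worker URL and connector name match the example, and a running Connect worker is assumed before the request is actually sent.

```python
import json
import urllib.request

# Same configuration as the curl example above.
config = {
    "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
    "topics": "orders",
    "connection.url": "http://elasticsearch:9200",
    "type.name": "_doc",
    "key.ignore": "false",
    "schema.ignore": "true",
}

def create_connector_request(worker_url, name, config):
    """Build a PUT to .../connectors/<name>/config (create-or-update)."""
    return urllib.request.Request(
        url=f"{worker_url}/connectors/{name}/config",
        data=json.dumps(config).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

req = create_connector_request("http://localhost:8083", "sink-elastic-01", config)
# urllib.request.urlopen(req) would submit it against a live worker.
```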
To use Kafka Connect on Confluent Cloud, you can use the web interface to select and configure the connector that you want to use. There is also a CLI and an API for managed connectors on Confluent Cloud.
The specific configuration elements will vary for each connector.
Kafka can easily integrate with a number of external databases through Kafka Connect. Depending on the data source, Kafka Connect can be configured to stream either incremental database changes or entire databases row-by-row into Kafka.
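As an illustration of those two modes, here is a sketch of how Confluent's JDBC source connector can be configured for a full row-by-row load versus an incremental load. Connection details and the column name are placeholders.

```python
# Base settings shared by both modes; connection details are placeholders.
base = {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:mysql://mysql:3306/shop",
    "topic.prefix": "mysql-",
}

# Entire tables, re-read row-by-row on every poll:
bulk = {**base, "mode": "bulk"}

# Only new rows, tracked via an auto-incrementing ID column:
incremental = {**base, "mode": "incrementing", "incrementing.column.name": "id"}
```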
Change Data Capture (CDC) can easily be done using a Kafka Connect source connector. It works by monitoring a database, recording changes, and streaming those changes into a Kafka topic for downstream systems to react to. Depending on your source database, there are a number of Kafka connectors available, including but not limited to MySQL (Debezium), Oracle, and MongoDB (Debezium).
Some of these connectors are built in conjunction with Debezium, an open-source CDC tool.
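As a sketch, a Debezium MySQL source connector configuration looks like the following. Hostname, credentials, and server id are placeholders, and key names follow recent Debezium releases (older versions use slightly different key names, e.g. `database.server.name`).

```python
# Sketch of a Debezium MySQL CDC connector configuration.
# All connection values below are placeholders.
debezium_config = {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "dbz",
    "database.server.id": "184054",
    "topic.prefix": "shop",
}
```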
Learn more in this free training course.
If you're using Confluent Cloud, you can take advantage of the managed connectors provided.
The process of installing Kafka Connect is relatively flexible, so long as you have access to a set of Kafka brokers. These brokers can be self-managed or hosted on a cloud service such as Confluent Cloud.
Workers – the components that run connectors in Kafka Connect – can be installed on any machines that have access to Kafka brokers.
Kafka Connect can be installed:
XML data is just a formatted string, so it's quite at home in the value field of a Kafka message. If you want to move XML files into Kafka, there are a number of connectors available to you:
This hands-on course will show you how to build event-driven applications with Spring Boot and Kafka Streams.