
Kafka Connect FAQs

What is Kafka Connect, and how does it work? Find answers to the most commonly asked questions about Kafka integration and the connector ecosystem.

What is Kafka Connect?

Kafka Connect is a tool for integrating Kafka with other systems, both sending data to and receiving data from them. It is part of Apache Kafka. Kafka Connect is configuration-driven, meaning you don’t need to write any code to use it.

Kafka Connect manages crucial components of scalable and resilient integration including:

  • Offset tracking
  • Restarts
  • Schema handling
  • Scale out

With Kafka Connect, you can use hundreds of existing connectors, or you can write your own. You can use Kafka Connect with managed connectors in Confluent Cloud or run it yourself. Kafka Connect is deployed as its own process (known as a worker), separate from the Kafka brokers.

Learn more about Kafka Connect in this free course.

What is Kafka Connect used for?

Kafka Connect is used for integrating external systems with Kafka. This includes:

  • Database CDC (snapshotting an entire database table into Kafka, then sending every subsequent change to that table)
  • Streaming data from a message queue such as ActiveMQ or RabbitMQ into Kafka
  • Pushing data from a Kafka topic to a cloud data warehouse such as Snowflake or BigQuery
  • Streaming data to NoSQL stores like MongoDB or Redis from Kafka
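To make the database use case concrete, here is a hedged sketch of what a source connector configuration looks like, using the Confluent JDBC source connector (the hostname, credentials, and table name are placeholders):

```shell
# Illustrative JDBC source connector config: snapshot the "orders" table,
# then stream new rows as they are added, keyed off an incrementing id column.
# Hostname, credentials, and table name are placeholders.
cat > jdbc-source.json <<'EOF'
{
  "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
  "connection.url": "jdbc:mysql://mysql:3306/inventory",
  "connection.user": "connect",
  "connection.password": "connect-secret",
  "table.whitelist": "orders",
  "mode": "incrementing",
  "incrementing.column.name": "id",
  "topic.prefix": "mysql-"
}
EOF

# Submitting it to a running worker would look like:
#   curl -X PUT -H "Content-Type: application/json" \
#        --data @jdbc-source.json \
#        http://localhost:8083/connectors/jdbc-source-orders/config
```

With this configuration the connector would write rows from the orders table to the mysql-orders topic (topic.prefix plus table name).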

Learn more about Kafka Connect in a free course.

How do I run Kafka Connect?

The component that runs connectors in Kafka Connect is known as a worker.

Kafka Connect workers can be deployed on bare metal, Docker, Kubernetes, etc. Here's how you'd run it directly from the default Kafka installation:

./bin/connect-distributed ./etc/kafka/connect-distributed.properties

Many connectors are also available as a fully managed service in Confluent Cloud.
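The properties file passed to connect-distributed configures the worker itself. A minimal sketch, with an illustrative broker address and topic names, might look like this (the three storage topics are where a distributed worker persists connector configs, offsets, and status):

```shell
# Minimal distributed-mode worker config; values are illustrative.
cat > connect-worker.properties <<'EOF'
bootstrap.servers=localhost:9092
group.id=connect-cluster
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Internal topics where the worker stores connector configs, offsets, and status
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status
config.storage.replication.factor=1
offset.storage.replication.factor=1
status.storage.replication.factor=1
# Where to find connector plugins
plugin.path=/usr/share/java,/usr/share/confluent-hub-components
EOF
```

Workers started with the same group.id and storage topics join the same Kafka Connect cluster.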

How do I use Kafka Connect?

You can run Kafka Connect yourself or use it as a fully managed service in Confluent Cloud.

If you are running Kafka Connect yourself, there are two steps to creating a connector in Kafka Connect:

  1. Run your Kafka Connect worker. Kafka Connect workers can be deployed on bare metal, Docker, Kubernetes, etc. Here's how you'd run it directly from the default Kafka installation:

    ./bin/connect-distributed ./etc/kafka/connect-distributed.properties
  2. Use the REST API to create an instance of a connector:

    curl -X PUT -H "Content-Type:application/json" http://localhost:8083/connectors/sink-elastic-01/config \
        -d '{
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "topics"         : "orders",
        "connection.url" : "http://elasticsearch:9200",
        "type.name"      : "_doc",
        "key.ignore"     : "false",
        "schema.ignore"  : "true"
        }'

To use Kafka Connect on Confluent Cloud, you can use the web interface to select and configure the connector that you want to use. There are also a CLI and an API for managed connectors on Confluent Cloud.

The specific configuration elements will vary for each connector.
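Whichever connector you use, the worker's REST API (listening on port 8083 by default) offers the same endpoints for inspecting and managing running connectors; the connector name below is illustrative:

```shell
# List all connectors on the worker
curl -s http://localhost:8083/connectors

# Check the state of the connector and its tasks (RUNNING, FAILED, ...)
curl -s http://localhost:8083/connectors/sink-elastic-01/status

# Restart the connector after fixing a problem
curl -s -X POST http://localhost:8083/connectors/sink-elastic-01/restart

# Remove the connector
curl -s -X DELETE http://localhost:8083/connectors/sink-elastic-01
```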

Can Kafka stream from databases?

Kafka can easily integrate with a number of external databases through Kafka Connect. Depending on the data source, Kafka Connect can be configured to stream either incremental database changes or entire database tables row-by-row into Kafka.

Learn more about it in this module of the free Kafka Connect pipelines course.

How to do Change Data Capture (CDC) with Kafka

Change Data Capture (CDC) can easily be done using a Kafka Connect source connector. The connector monitors a database, records changes, and streams those changes into a Kafka topic for downstream systems to react to. Depending on your source database, there are a number of Kafka connectors available, including but not limited to MySQL (Debezium), Oracle, and MongoDB (Debezium).

Some of these connectors are built in conjunction with Debezium, an open-source CDC tool.
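As a hedged sketch, a Debezium MySQL connector configuration looks like the following; hostnames, credentials, and table names are placeholders, and exact property names vary between Debezium versions:

```shell
# Illustrative Debezium MySQL CDC config; hostnames, credentials, and
# table names are placeholders, and exact property names vary by
# Debezium version.
cat > mysql-cdc.json <<'EOF'
{
  "connector.class": "io.debezium.connector.mysql.MySqlConnector",
  "database.hostname": "mysql",
  "database.port": "3306",
  "database.user": "debezium",
  "database.password": "dbz-secret",
  "database.server.id": "184054",
  "topic.prefix": "dbserver1",
  "table.include.list": "inventory.orders",
  "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
  "schema.history.internal.kafka.topic": "schema-changes.inventory"
}
EOF
```

Each captured table then gets its own topic, named from the prefix, database, and table (e.g. dbserver1.inventory.orders).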

Learn more in this free training course.

How do I install Kafka Connect?

If you're using Confluent Cloud you can take advantage of the managed connectors provided.

The process of installing Kafka Connect is relatively flexible, so long as you have access to a set of Kafka brokers. These brokers can be self-managed or running on a cloud service such as Confluent Cloud.

Workers – the components that run connectors in Kafka Connect – can be installed on any machines that have access to Kafka brokers.

Kafka Connect can be installed:

  • on bare metal machines, in either standalone (one Kafka Connect instance) or distributed (multiple Kafka Connect instances forming a cluster) mode;
  • in containers using Docker, Kubernetes, etc.

After installing the Kafka Connect worker, you will need to install Kafka Connect plugins such as connectors and single message transforms (SMTs).
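A sketch of that step, with an illustrative plugin directory and a hypothetical worker properties file:

```shell
# Worker plugin directory; the path is illustrative.
mkdir -p /tmp/connect-plugins

# Point the worker at it in its properties file (filename is hypothetical):
echo "plugin.path=/tmp/connect-plugins" >> worker.properties

# With Confluent Platform, plugins can be installed from Confluent Hub, e.g.:
#   confluent-hub install --no-prompt confluentinc/kafka-connect-elasticsearch:latest
# Otherwise, unpack the connector's archive into a subdirectory of plugin.path
# and restart the worker so that it discovers the new plugin.
```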

How to read XML file into Kafka

XML data are just formatted strings, so they're quite at home in the value field of a Kafka message. If you want to move XML files into Kafka, there are a number of connectors available.

See the Quick Start for more information, as well as the deep-dive blog Ingesting XML data into Kafka.

How to send syslog data to Kafka

Syslog data can be ingested into Kafka easily using the Syslog Source Connector.

For more information on how to get started with your syslog data in Kafka, check out this blog post!
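As an illustrative sketch only, a Syslog source connector configuration might look like the following; the property names are from the Confluent Syslog connector and the port is a placeholder, so check the connector's documentation before relying on them:

```shell
# Illustrative Syslog source connector config: listen for syslog
# messages over TCP on a placeholder port.
cat > syslog-source.json <<'EOF'
{
  "connector.class": "io.confluent.connect.syslog.SyslogSourceConnector",
  "syslog.port": "5454",
  "syslog.listener": "TCP"
}
EOF
```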

How to load CSV data into Kafka

CSV files can be parsed and read into Kafka using Kafka Connect. There are several connectors that can do the job.

See the Quick Start for more information.
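One commonly used option is the community kafka-connect-spooldir connector, which watches a directory for files and produces each parsed record to a topic. A hedged sketch of its CSV configuration (directory paths and topic name are placeholders):

```shell
# Illustrative config for the community kafka-connect-spooldir CSV source
# connector; directory paths and topic name are placeholders.
cat > csv-source.json <<'EOF'
{
  "connector.class": "com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector",
  "input.path": "/data/csv/incoming",
  "finished.path": "/data/csv/finished",
  "error.path": "/data/csv/error",
  "input.file.pattern": ".*\\.csv",
  "topic": "orders-csv",
  "csv.first.row.as.header": "true",
  "schema.generation.enabled": "true"
}
EOF
```

Files dropped into the input path are parsed row by row and moved to the finished (or error) path once processed.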

Learn more with these free training courses

Apache Kafka® 101

Learn how Kafka works, how to use it, and how to get started.

Spring Framework and Apache Kafka®

This hands-on course will show you how to build event-driven applications with Spring Boot and Kafka Streams.

Building Data Pipelines with Apache Kafka® and Confluent

Build a scalable, streaming data pipeline in under 20 minutes using Kafka and Confluent.

Confluent Cloud is a fully managed Apache Kafka service available on all three major clouds. Try it for free today.
