
Apache Kafka® FAQs

Here are some of the questions that you may have about Apache Kafka and its surrounding ecosystem.

If you’ve got a question that isn’t answered here then please do ask the community.

What is Kafka Connect?

Kafka Connect is a tool that integrates Kafka with other systems, both sending data to them and receiving data from them. It is part of Apache Kafka. Kafka Connect is configuration-driven: you don’t need to write any code to use it.

Kafka Connect manages crucial components of scalable and resilient integration including:

  • Offset tracking
  • Restarts
  • Schema handling
  • Scale out

With Kafka Connect, you can use hundreds of existing connectors or write your own. You can use Kafka Connect with managed connectors in Confluent Cloud or run it yourself. Kafka Connect is deployed as its own process (known as a worker), separate from the Kafka brokers.

Learn more about Kafka Connect in this free course.

What is Kafka Connect used for?

Kafka Connect is used for integrating external systems with Kafka. This includes:

  • Database CDC (snapshotting an entire database table into Kafka, then sending every subsequent change to that table)
  • Streaming data from a message queue such as ActiveMQ or RabbitMQ into Kafka
  • Pushing data from a Kafka topic to a cloud data warehouse such as Snowflake or BigQuery
  • Streaming data to NoSQL stores like MongoDB or Redis from Kafka

Learn more about Kafka Connect in a free course.

How do I run Kafka Connect?

The component that runs connectors in Kafka Connect is known as a worker.

Kafka Connect workers can be deployed on bare metal, Docker, Kubernetes, etc. Here's how you'd run it directly from the default Kafka installation:

./bin/connect-distributed ./etc/kafka/connect-distributed.properties
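Once the worker has started, its REST interface can be used to confirm that it is up. The following is a minimal sketch, assuming the worker listens on the default port 8083 on localhost and that curl is installed. The helper name `wait_for_worker` and the `CONNECT_URL` variable are illustrative only; they are not part of Kafka Connect.

```shell
# Assumption: the worker's REST API is reachable at http://localhost:8083.
CONNECT_URL="${CONNECT_URL:-http://localhost:8083}"

# Poll the REST API root until it answers, or give up after N attempts.
# wait_for_worker is an illustrative helper, not a Kafka Connect command.
wait_for_worker() {
  attempts="${1:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # GET / returns the worker's version details once it is ready.
    if curl -s -f -o /dev/null "$CONNECT_URL/"; then
      echo "Kafka Connect worker is up at $CONNECT_URL"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "Kafka Connect worker did not respond at $CONNECT_URL" >&2
  return 1
}
```

For example, `wait_for_worker 60 && curl -s "$CONNECT_URL/connector-plugins"` waits up to roughly a minute and then lists the connector plugins that the worker has loaded.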

Many connectors are also available as a fully managed service in Confluent Cloud.

How do I use Kafka Connect?

You can run Kafka Connect yourself or use it as a fully managed service in Confluent Cloud.

If you are running Kafka Connect yourself, there are two steps to creating a connector in Kafka Connect:

  1. Run your Kafka Connect worker. Kafka Connect workers can be deployed on bare metal, Docker, Kubernetes, etc. Here's how you'd run it directly from the default Kafka installation:

    ./bin/connect-distributed ./etc/kafka/connect-distributed.properties
  2. Use the REST API to create an instance of a connector:

    curl -X PUT -H "Content-Type:application/json" http://localhost:8083/connectors/sink-elastic-01/config \
        -d '{
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "topics"         : "orders",
        "connection.url" : "http://elasticsearch:9200",
        "type.name"      : "_doc",
        "key.ignore"     : "false",
        "schema.ignore"  : "true"
        }'

To use Kafka Connect on Confluent Cloud you can use the web interface to select and configure the connector that you want to use. There is also a CLI and API for managed connectors on Confluent Cloud.

The specific configuration elements will vary for each connector.
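After a connector has been created on a self-managed worker, the same REST API can be used to inspect and manage it. The sketch below wraps three of its endpoints in small shell functions. It assumes the worker address used in the example above (localhost:8083); the function names and the `CONNECT_URL` variable are illustrative.

```shell
# Assumption: the worker's REST API is reachable at http://localhost:8083.
CONNECT_URL="${CONNECT_URL:-http://localhost:8083}"

# List the names of all connectors on the worker.
list_connectors() {
  curl -s "$CONNECT_URL/connectors"
}

# Show the state of a connector and its tasks (for example RUNNING or FAILED).
connector_status() {
  curl -s "$CONNECT_URL/connectors/$1/status"
}

# Restart a connector by name, for example after fixing the cause of a failure.
restart_connector() {
  curl -s -X POST "$CONNECT_URL/connectors/$1/restart"
}
```

For the connector created above, `connector_status sink-elastic-01` returns a JSON document describing the connector and each of its tasks.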

Can Kafka stream from databases?

Kafka integrates with many external databases through Kafka Connect. Depending on the data source, Kafka Connect can be configured to stream either incremental database changes or entire databases row by row into Kafka.
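As one illustration of the two approaches, the sketch below writes two configuration files for the Confluent JDBC source connector, which polls tables with queries. It assumes that connector is installed on the worker; the database address, credentials, table, column, and file names are all hypothetical.

```shell
# Whole-table load: "bulk" mode re-reads the full table on each poll.
cat > jdbc-source-bulk.json <<'EOF'
{
  "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
  "connection.url": "jdbc:postgresql://postgres:5432/shop",
  "connection.user": "connect_user",
  "connection.password": "connect_password",
  "table.whitelist": "orders",
  "mode": "bulk",
  "topic.prefix": "postgres-"
}
EOF

# Incremental load: only rows whose "id" is higher than the last one seen.
cat > jdbc-source-incrementing.json <<'EOF'
{
  "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
  "connection.url": "jdbc:postgresql://postgres:5432/shop",
  "connection.user": "connect_user",
  "connection.password": "connect_password",
  "table.whitelist": "orders",
  "mode": "incrementing",
  "incrementing.column.name": "id",
  "topic.prefix": "postgres-"
}
EOF

# To create a connector from one of these files on a running worker:
# curl -X PUT -H "Content-Type:application/json" -d @jdbc-source-incrementing.json \
#     http://localhost:8083/connectors/source-jdbc-01/config
```

Incrementing mode only picks up new rows; capturing updates and deletes generally calls for a timestamp column or a log-based CDC connector.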

Learn more about it in this module of the free Kafka Connect pipelines course.

How to do Change Data Capture (CDC) with Kafka

Change Data Capture (CDC) can be done using a Kafka Connect source connector. The connector monitors a database, records changes, and streams those changes into a Kafka topic for downstream systems to react to. Depending on your source database, there are a number of Kafka connectors available, including but not limited to MySQL (Debezium), Oracle, and MongoDB (Debezium).

Some of these connectors are built in conjunction with Debezium, an open-source CDC tool.
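As a rough sketch of what a CDC connector configuration can look like, the snippet below writes a configuration file for the Debezium MySQL connector. It assumes the Debezium plugin is installed on the worker and uses Debezium 2.x property names (older versions name some of these properties differently). The hostnames, credentials, server ID, and database and table names are hypothetical.

```shell
# Hypothetical Debezium MySQL source configuration (Debezium 2.x property names).
cat > source-debezium-mysql.json <<'EOF'
{
  "connector.class": "io.debezium.connector.mysql.MySqlConnector",
  "database.hostname": "mysql",
  "database.port": "3306",
  "database.user": "debezium",
  "database.password": "dbz_password",
  "database.server.id": "184054",
  "topic.prefix": "shop",
  "table.include.list": "shop.orders",
  "schema.history.internal.kafka.bootstrap.servers": "broker:9092",
  "schema.history.internal.kafka.topic": "schema-changes.shop"
}
EOF

# To create the connector on a running worker:
# curl -X PUT -H "Content-Type:application/json" -d @source-debezium-mysql.json \
#     http://localhost:8083/connectors/source-debezium-mysql-01/config
```

With these settings, changes to the `orders` table would be written to a topic named from the prefix, database, and table (here `shop.shop.orders`).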

Learn more in this free training course.

How do I install Kafka Connect?

If you're using Confluent Cloud you can take advantage of the managed connectors provided.

Installing Kafka Connect is relatively flexible, so long as you have access to a set of Kafka brokers. These brokers can be self-managed or run on a cloud service such as Confluent Cloud.

Workers – the components that run connectors in Kafka Connect – can be installed on any machines that have access to Kafka brokers.

Kafka Connect can be installed:

  • on bare metal machines in either standalone (one Kafka Connect instance) or distributed (multiple Kafka Connect instances forming a cluster) mode;
  • in containers using Docker, Kubernetes, etc.

After installing the Kafka Connect worker, you will need to install Kafka Connect plugins such as connectors and transformations.
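A distributed worker is driven by a properties file that tells it where the brokers are, which Connect cluster it belongs to, and where to look for plugins. The following is a minimal sketch of such a file; the broker address, group ID, topic names, replication factors, plugin directories, and file name are assumptions to adapt to your environment.

```shell
# Write a hypothetical worker configuration for distributed mode.
cat > my-connect-distributed.properties <<'EOF'
# Kafka brokers that the worker connects to.
bootstrap.servers=broker:9092
# Workers that share the same group.id form one Kafka Connect cluster.
group.id=connect-cluster-01
# Default converters for message keys and values.
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false
# Internal topics for connector configs, offsets, and status.
config.storage.topic=_connect-configs
offset.storage.topic=_connect-offsets
status.storage.topic=_connect-status
# Replication factor of 1 is only suitable for a single-broker test setup.
config.storage.replication.factor=1
offset.storage.replication.factor=1
status.storage.replication.factor=1
# Directories that the worker scans for connector and transformation plugins.
plugin.path=/usr/share/java,/usr/share/confluent-hub-components
EOF
```

Plugins unpacked into one of the `plugin.path` directories are picked up when the worker starts, so the worker needs a restart after a new plugin is installed.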

How to read XML files into Kafka

XML data are just formatted strings, so they're quite at home in the value field of a Kafka message. If you want to move XML files into Kafka, there are a number of connectors available to you.

See the Quick Start for more information, as well as the deep-dive blog Ingesting XML data into Kafka.

How to send syslog data to Kafka

Syslog data can be ingested into Kafka easily using the Syslog Source Connector.

For more information on how to get started with your syslog data in Kafka, check out this blog post!

How to load CSV data into Kafka

CSV files can be parsed and read into Kafka using Kafka Connect. There are several connectors that can do the job.

See the Quick Start for more information.
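As one possible illustration, the sketch below writes a configuration file for the community SpoolDir CSV source connector, which watches a directory for files and moves them once they are processed. It assumes that plugin is installed on the worker; the directories, topic, and file names are hypothetical.

```shell
# Hypothetical SpoolDir CSV source configuration.
cat > source-csv-spooldir.json <<'EOF'
{
  "connector.class": "com.github.jcustenborder.kafka.connect.spooldir.SpoolDirCsvSourceConnector",
  "topic": "orders_csv",
  "input.path": "/data/unprocessed",
  "finished.path": "/data/processed",
  "error.path": "/data/error",
  "input.file.pattern": ".*\\.csv",
  "csv.first.row.as.header": "true",
  "schema.generation.enabled": "true"
}
EOF

# To create the connector on a running worker:
# curl -X PUT -H "Content-Type:application/json" -d @source-csv-spooldir.json \
#     http://localhost:8083/connectors/source-csv-spooldir-01/config
```

Here the first row of each file supplies the field names, and the connector infers a schema rather than requiring one to be declared up front.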

Learn more with these free training courses

Apache Kafka 101

Learn how Kafka works, how to use it, and how to get started.

Spring Framework and Apache Kafka®

This hands-on course will show you how to build event-driven applications with Spring Boot and Kafka Streams.

Building Data Pipelines with Apache Kafka® and Confluent

Build a scalable, streaming data pipeline in under 20 minutes using Kafka and Confluent.
