Course: Kafka Connect 101

Troubleshooting Kafka Connect: Common Issues and How to Debug Them

8 min
Tim Berglund, Sr. Director, Developer Advocacy (Course Presenter)
Robin Moffatt, Staff Developer Advocate (Course Author)

Troubleshooting Kafka Connect

Given that Kafka Connect is a data integration framework, troubleshooting is simply part of using it. This has nothing to do with Connect being finicky (on the contrary, it’s very stable). Rather, there are keys, secrets, hostnames, and table names to get right. Then there are the external systems that you are integrating with, each of which needs to be visible and accessible to Connect, and each of which has its own security model. If you’ve done any integration work in the past, the situation will be familiar.

A Troubleshooting Scenario

Your Connect worker is running, your source connector is running—but no data is being ingested.

Because connectors are composed of tasks, one of the first things to check is whether one or more of a connector’s tasks has failed, independently of the connector itself. To verify this, you’ll need to gather more information from the Connect API.


Getting Task Status

As shown below, you can use curl to get task status and pipe the result through jq (a remarkably capable JSON processor). The command as written requests the stack trace for the first element in the tasks array.

curl -s "http://localhost:8083/connectors/source-debezium-orders-00/status" \
  | jq '.tasks[0].trace'

"org.apache.kafka.connect.errors.ConnectException\n\tat
io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)\n\tat
io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:197)\n\tat
io.debezium.connector.mysql.BinlogReader$ReaderThreadLifecycleListener.onCommunicationFailure(BinlogReader.java:1018)\n\t
at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:950)\n\tat
com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:580)\n\tat
com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:825)\n\tat
java.lang.Thread.run(Thread.java:748)\nCaused by: java.io.EOFException\n\tat
com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.read(ByteArrayInputStream.java:190)\n\tat
com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.readInteger(ByteArrayInputStream.java:46)\n\tat
com.github.shyiko.mysql.binlog.event.deserialization.EventHeaderV4Deserializer.deserialize(EventHeaderV4Deserializer.java:35)\n\tat
com.github.shyiko.mysql.binlog.event.deserialization.EventHeaderV4Deserializer.deserialize(EventHeaderV4Deserializer.java:27)\n\tat
com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer.nextEvent(EventDeserializer.java:212)\n\tat
io.debezium.connector.mysql.BinlogReader$1.nextEvent(BinlogReader.java:224)\n\tat
com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:922)\n\t... 3 more\n"
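A trace like this only appears for tasks that have failed. Before digging into a single trace, it can help to summarize the state of every task from the same status endpoint. A minimal sketch, using an illustrative response of the shape the Connect REST API returns:

```shell
# Illustrative /status response; in practice you would fetch it with:
#   curl -s http://localhost:8083/connectors/source-debezium-orders-00/status
STATUS='{"name":"source-debezium-orders-00",
         "connector":{"state":"RUNNING"},
         "tasks":[{"id":0,"state":"FAILED",
                   "trace":"org.apache.kafka.connect.errors.ConnectException: ..."}]}'

# Print each task's id and state so failed tasks stand out
echo "$STATUS" | jq -r '.tasks[] | "task \(.id): \(.state)"'
# → task 0: FAILED
```

Note that a connector can report RUNNING while one of its tasks is FAILED, which is exactly the scenario described above.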

Next, read through the trace and look for clues. In this instance, you may notice that there is a ConnectException, and also that something is wrong with the binlog. Recall that the binlog is how the Debezium connector communicates with MySQL, presenting itself as a read replica. So you could conclude from the stack trace that there is potentially something wrong with the network connection, and start your investigation there.

The Log Is the Source of Truth

In addition to the stack trace, you should read the log. There are different ways to access the log, depending on how you are running Connect:

  • If you are just running the Confluent CLI locally, the command is confluent local services connect log
  • If you are using Docker, it’s docker logs, plus the name of the container
  • If you are running completely vanilla Connect using Apache Kafka, you can just read the log files with cat, or more likely tail (the location varies by installation)
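However you access it, grep is useful for surfacing problems in a busy log. A small sketch; the two log lines below are illustrative stand-ins for real worker output:

```shell
# Filter a log stream down to errors and exceptions.
# In practice you would pipe `docker logs connect` or `tail -f <logfile>`
# into grep instead of the illustrative printf used here.
printf '%s\n' \
  'INFO Kafka Connect started' \
  'ERROR WorkerSourceTask{id=source-debezium-orders-00-0} Task threw an uncaught and unrecoverable exception' \
  | grep -iE 'error|exception'
```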

Connector contexts were added to logging in Apache Kafka 2.3 with KIP-449: each log line is prefixed with the name of the connector and task that produced it (in the form [source-debezium-orders-00|task-0]), which makes the diagnostic process a lot easier.

“Task is being killed and will not recover until manually restarted”

This is a general error, a symptom of the problem rather than the cause, and it reveals nothing about the underlying problem. When you see it, you need to search further up the stack trace or the log for the exception that actually killed the task. For example, this error appears in the stack trace below, and the real problem lies in the exceptions above it.

[Image: kafka-connect-stack-trace-limbo]

At this point, you are only at the beginning of troubleshooting the problem, but at least you know where to look: there is something about the network connection, or about the byte-level IO between the source database and the connector, that is causing the failure. You might even consider posting your problem to the Confluent Community Forum; just keep in mind that a useful post will include more detail than the “Task is being killed” error alone.
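As the error message says, a failed task will not recover until manually restarted. Once you have addressed the root cause, you can restart it through the Connect REST API; a sketch using the article’s example connector name and worker address:

```shell
# Restart task 0 of the connector; on success the call returns an empty
# response, and the task's status should return to RUNNING.
curl -s -X POST \
  "http://localhost:8083/connectors/source-debezium-orders-00/tasks/0/restart"
```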

Dynamic Log Configuration

Dynamic log configuration arrived in Apache Kafka 2.4. It means you can change the level of logging detail without having to restart the worker.

# List current logger configuration
curl -s http://localhost:8083/admin/loggers/ | jq

{
  "org.apache.kafka.connect.runtime.rest": {
    "level": "WARN"
  },
  "org.reflections": {
    "level": "INFO"
  },
  "root": {
    "level": "INFO"
  }
}

For example, perhaps there is a particular connector, such as Debezium (logger io.debezium), that you’d like to log at TRACE level while troubleshooting. If you set everything to TRACE, the output would be overwhelming. With dynamic log configuration, you can instead target the specific logger of interest at runtime, via REST, without restarting Connect.

curl -s -X PUT -H "Content-Type:application/json" \
    http://localhost:8083/admin/loggers/io.debezium \
    -d '{"level": "TRACE"}'
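TRACE output is voluminous, so once you have what you need, it is worth setting the logger back down; the same endpoint accepts the lower level:

```shell
# Restore the io.debezium logger to INFO after troubleshooting
curl -s -X PUT -H "Content-Type:application/json" \
    http://localhost:8083/admin/loggers/io.debezium \
    -d '{"level": "INFO"}'
```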
