Dan Weston

Senior Curriculum Developer

Audit Logs

Up until now, we have dealt with protecting your data, your system, and its resources at specific events: authentication during the login process, and authorization each time resources are accessed after a successful login.

However, your system may be targeted in some manner outside of these more orderly sequences. For example, a rogue client may spawn fake messages or you may experience a DDoS-style attack on broker resources. If one of these happened, how would you know you had been targeted, and how could you identify the perpetrators as well as the sequence of events? Furthermore, is there a way that you can prevent future attacks?

Audit logs can help because they provide records of everything that has happened on a system. Specifically, they provide:

  • Insight – They provide insight into situations such as trying to determine if a particular group of users was successful in authenticating and getting access to the correct broker resources after a new ACL was added
  • Security – They enhance security by letting you identify anomalies and unauthorized operations in the historical record so that you can take action as quickly as possible
  • Impact – They let you see who, as well as which services, have been impacted by unusual activities, so that you can communicate with stakeholders as the situation progresses
  • Compliance – They enable you to generate audit reports according to internal policies and external regulations, and also provide an official record in the event of a security breach

Apache Kafka doesn't provide audit logging out of the box. It does, however, support comprehensive application logging with Log4j, which can be configured to deliver events from separate logger instances to separate destinations (typically files). By configuring separate appenders in the log4j.properties file on each broker, you can capture detailed authorization and request information for auditing, monitoring, and debugging purposes.
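For example, the following minimal sketch, modeled on the log4j.properties file that ships with Kafka, routes authorization events to a dedicated daily-rolling file. The appender name and log location are illustrative (${kafka.logs.dir} is set by Kafka's startup scripts):

log4j.properties

# Dedicated daily-rolling file appender for authorization events
log4j.appender.authorizerAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.authorizerAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.authorizerAppender.File=${kafka.logs.dir}/kafka-authorizer.log
log4j.appender.authorizerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.authorizerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

# Route authorization events to this appender only (additivity=false
# keeps them out of the main server log)
log4j.logger.kafka.authorizer.logger=DEBUG, authorizerAppender
log4j.additivity.kafka.authorizer.logger=false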

Kafka Authorizer Logger

kafka.authorizer.logger is used for authorization logging and generates INFO entries for events where access was denied and DEBUG entries for those where access was allowed. Each entry includes the principal, client host, the attempted operation, and the accessed resource (e.g., a topic):

log4j.properties

log4j.logger.kafka.authorizer.logger=DEBUG

output

DEBUG Principal = User:Alice is Allowed Operation = Write from host = 127.0.0.1 on resource = Topic:LITERAL:customerOrders for request = Produce with resourceRefCount = 1 (kafka.authorizer.logger)

INFO Principal = User:Mallory is Denied Operation = Describe from host = 10.0.0.13 on resource = Topic:LITERAL:customerOrders for request = Metadata with resourceRefCount = 1 (kafka.authorizer.logger)

Kafka Request Logger

The kafka.request.logger logs the principal and client host at the DEBUG level as well as full details of the request when logging at the TRACE level:

log4j.properties

log4j.logger.kafka.request.logger=DEBUG

output

DEBUG Completed request:RequestHeader(apiKey=PRODUCE, apiVersion=8, clientId=producer-1, correlationId=6) -- {acks=-1,timeout=30000,partitionSizes=[customerOrders-0=15514]},response: {responses=[{topic=customerOrders,partition_responses=[{partition=0,error_code=0,base_offset=13,log_append_time=-1,log_start_offset=0,record_errors=[],error_message=null}]}],throttle_time_ms=0} from connection 127.0.0.1:9094-127.0.0.1:61040-0;totalTime:2.42,requestQueueTime:0.112,localTime:2.15,remoteTime:0.0,throttleTime:0,responseQueueTime:0.04,sendTime:0.118,securityProtocol:SASL_SSL,principal:User:Alice,listener:SASL_SSL,clientInformation:ClientInformation(softwareName=apache-kafka-java, softwareVersion=2.7.0-SNAPSHOT) (kafka.request.logger)
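Because TRACE-level request logging is extremely verbose, you may want to pair it with a size-capped appender. Here is a minimal sketch; the appender name, file location, and size limits are assumptions to adapt to your environment:

log4j.properties

# Size-capped rolling appender so TRACE-level request logging
# cannot fill the disk (the limits here are illustrative)
log4j.appender.requestAppender=org.apache.log4j.RollingFileAppender
log4j.appender.requestAppender.File=${kafka.logs.dir}/kafka-request.log
log4j.appender.requestAppender.MaxFileSize=100MB
log4j.appender.requestAppender.MaxBackupIndex=10
log4j.appender.requestAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.requestAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

# Full request details go only to the dedicated file
log4j.logger.kafka.request.logger=TRACE, requestAppender
log4j.additivity.kafka.request.logger=false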

Considerations

Kafka's logs are bound to be useful to you, whether you use them for compliance, debugging, or anomaly detection. However, there are a few things to keep in mind:

  • Make sure there is enough disk space if you are capturing logs for each broker and set an appropriate Log4j retention policy so as not to fill the disks. The request logger is particularly voluminous, so you may only want to use it for debugging or within small retention windows.
  • Since logs are per broker, you will need to consolidate all of the broker logs from the cluster to get a complete picture of your system.
  • Kafka doesn't provide log analysis or visualization, so we recommend that you use something like the "ELK" stack (Elasticsearch, Logstash, and Kibana) for these purposes. You might also direct your log output to a Kafka topic on another cluster and then perform a streaming analysis of your audit events, as sketched below!
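For the Kafka-topic route, one option is the KafkaLog4jAppender from the kafka-log4j-appender artifact, which must be on the broker's classpath. This is a sketch only; the broker list and topic name are assumptions:

log4j.properties

# Ship authorization events to a topic on a separate cluster
# (broker list and topic name are illustrative assumptions)
log4j.appender.auditKafkaAppender=org.apache.kafka.log4jappender.KafkaLog4jAppender
log4j.appender.auditKafkaAppender.brokerList=audit-cluster-broker:9092
log4j.appender.auditKafkaAppender.topic=audit-events
log4j.appender.auditKafkaAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.auditKafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

log4j.logger.kafka.authorizer.logger=DEBUG, auditKafkaAppender

Note that the appender should point at a different cluster than the one being audited: logging a broker's own authorization events back to itself can create a feedback loop.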


Audit Logs

Much of the security material we've discussed so far helps to protect your data, your system, and its resources when particular events occur. Authentication helps identify and verify clients as connections are opened. Authorization helps secure access to one or more resources with each request. But if something bad were to happen, let's say a rogue client spawns multiple topics and produces fake messages, or someone repeatedly tries to open many connections in quick succession using invalid credentials in an attempt to exhaust broker resources, how would you know your system has been targeted, who was involved, and the sequence of events that led to your current unfortunate state? Going further, is there any way of being able to anticipate or prevent future attacks as new threats emerge?

This is where audit logs can help. An audit log is a record of everything that has happened on your system. With an audit log, you can improve your security posture by first gaining insight. Logs provide insight into what is happening in your system. Just rolled out a new access control list for a new group? Using audit logs, you can go back and see if users have been able to authenticate and if they're authorized to the correct resources. Number two: improve security. By monitoring and validating operations, you can detect unauthorized operations in the historical record and trace any other interactions associated with the client or host attempting unauthorized operations. As you look at the logs, are there things that stick out or seem strange? With logs, you can identify anomalies and take action as quickly as possible. Number three: understand impact. You can use the logs to debug client interactions and spot unusual activity in system metrics. If there's an ongoing issue, the logs allow you to see who and what services are impacted, what got you to that point, and then let you communicate with stakeholders as the situation progresses. Number four: prove compliance. Logs can help you generate audit reports in line with internal policies and external regulations and create an official record if there is a security breach.

Apache Kafka doesn't provide anything called an audit log directly out of the box. It does, however, support comprehensive application logging using Log4j. Log4j can be configured with multiple appenders that deliver log events from different logger instances to different destinations, typically files. There are two logger instances of interest: kafka.authorizer.logger, which is used for authorization logging, and kafka.request.logger. By configuring separate appenders for the kafka.authorizer.logger and kafka.request.logger logger instances in the log4j.properties file on each broker, you can capture detailed authorization and request information for auditing, monitoring, and debugging purposes. The authorizer logger generates INFO entries for operations that are denied and DEBUG entries for operations for which access was granted. Both types of entry include details of the user principal, the client host, the operation being attempted, and the resource, such as a topic, to which the operation was directed. The request logger adds details of the user principal and the client host at the DEBUG level and full details of the request when logging at the TRACE level.
Kafka's application logging provides a solid basis for creating and analyzing audit logs to meet your specific audit needs, whether it's anomaly detection, debugging, or compliance. Of course, there are a few things to be aware of. If you're logging to files on each broker, you should ensure there's sufficient disk space and an appropriate Log4j retention policy so that you don't fill the disks. The request logger in particular can generate a lot of data very quickly, so you may want to consider enabling it only for debugging purposes or using a very small retention window. Kafka's application logs are per broker. To get a complete picture of what is happening in your system, you'll need to consolidate the logs from all brokers in a cluster. We recommend using a commercial or open source log analysis and observability framework, such as the ELK stack, that is Elasticsearch, Logstash, and Kibana, to help analyze and visualize this consolidated output. You could even direct the log output to a Kafka topic on another cluster and then perform streaming analysis on your audit events.