Up until now, we have dealt with protecting your data, your system, and its resources with respect to specific events: the login process as well as access to resources after a successful login.
However, your system may be targeted in some manner outside of these more orderly sequences. For example, a rogue client may spawn fake messages or you may experience a DDoS-style attack on broker resources. If one of these happened, how would you know you had been targeted, and how could you identify the perpetrators as well as the sequence of events? Furthermore, is there a way that you can prevent future attacks?
Audit logs can help because they provide records of everything that has happened on a system. Specifically, they allow you to:
- Gain insight: see what is happening in your system, for example whether the users behind a newly rolled-out access control list can authenticate and are authorized to the correct resources.
- Improve security: detect unauthorized operations in the historical record, trace other interactions from the same client or host, and spot anomalies so you can act as quickly as possible.
- Understand impact: debug client interactions, see who and what services are affected by an ongoing issue, and communicate with stakeholders as the situation progresses.
- Prove compliance: generate audit reports in line with internal policies and external regulations, and create an official record if there is a security breach.
Apache Kafka doesn't provide audit logging out of the box. It does, however, support comprehensive application logging with Log4j, which can be configured to deliver events from separate logger instances to separate destinations (typically files). By configuring separate appenders in the log4j.properties file on each broker, you can capture detailed information for auditing, monitoring, and debugging purposes.
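Concretely, routing one of these loggers to its own file might look something like the following sketch of a log4j.properties fragment. The appender name, file path, and rotation settings here are illustrative assumptions, not Kafka's shipped defaults:

```properties
# Illustrative only: appender name, path, and rotation sizes are assumptions.
# Route authorizer events to their own rolling file, separate from the broker log.
log4j.appender.authorizerAppender=org.apache.log4j.RollingFileAppender
log4j.appender.authorizerAppender.File=/var/log/kafka/kafka-authorizer.log
log4j.appender.authorizerAppender.MaxFileSize=100MB
log4j.appender.authorizerAppender.MaxBackupIndex=10
log4j.appender.authorizerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.authorizerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

# Authorization decisions at DEBUG (allowed) and INFO (denied); additivity=false
# keeps these entries out of the root logger's destination.
log4j.logger.kafka.authorizer.logger=DEBUG, authorizerAppender
log4j.additivity.kafka.authorizer.logger=false
```

A matching appender for kafka.request.logger can be configured the same way, pointed at a different file.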
kafka.authorizer.logger is used for authorization logging and generates INFO entries for events where access was denied and DEBUG entries for those where access was allowed. Each entry includes the principal, client host, the attempted operation, and the accessed resource (e.g., a topic):
log4j.properties
log4j.logger.kafka.authorizer.logger=DEBUG
output
DEBUG Principal = User:Alice is Allowed Operation = Write from host = 127.0.0.1 on resource = Topic:LITERAL:customerOrders for request = Produce with resourceRefCount = 1 (kafka.authorizer.logger)
INFO Principal = User:Mallory is Denied Operation = Describe from host = 10.0.0.13 on resource = Topic:LITERAL:customerOrders for request = Metadata with resourceRefCount = 1 (kafka.authorizer.logger)
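Because these entries have a fixed shape, they are easy to mine for denied operations. Here is a minimal sketch in Python; the regular expression and field names are assumptions derived from the sample lines above, not a format Kafka guarantees across versions:

```python
import re

# Pattern matching the authorizer entries shown above (an assumption, not a
# stable contract): level, principal, decision, operation, host, and resource.
LINE_RE = re.compile(
    r"(?P<level>DEBUG|INFO) Principal = (?P<principal>\S+) is "
    r"(?P<decision>Allowed|Denied) Operation = (?P<operation>\S+) "
    r"from host = (?P<host>\S+) on resource = (?P<resource>\S+)"
)

def parse_entry(line):
    """Return the entry's fields as a dict, or None if the line doesn't match."""
    m = LINE_RE.search(line)
    return m.groupdict() if m else None

denied = parse_entry(
    "INFO Principal = User:Mallory is Denied Operation = Describe "
    "from host = 10.0.0.13 on resource = Topic:LITERAL:customerOrders "
    "for request = Metadata with resourceRefCount = 1 (kafka.authorizer.logger)"
)
print(denied["principal"], denied["decision"])  # User:Mallory Denied
```

Filtering a consolidated log for `decision == "Denied"` and grouping by principal or host is a simple starting point for anomaly detection.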
The kafka.request.logger logs the principal and client host at the DEBUG level as well as full details of the request when logging at the TRACE level:
log4j.properties
log4j.logger.kafka.request.logger=DEBUG
output
DEBUG Completed request:RequestHeader(apiKey=PRODUCE, apiVersion=8, clientId=producer-1, correlationId=6) -- {acks=-1,timeout=30000,partitionSizes=[customerOrders-0=15514]},response:{responses=[{topic=customerOrders,partition_responses=[{partition=0,error_code=0,base_offset=13,log_append_time=-1,log_start_offset=0,record_errors=[],error_message=null}]}],throttle_time_ms=0} from connection 127.0.0.1:9094-127.0.0.1:61040-0;totalTime:2.42,requestQueueTime:0.112,localTime:2.15,remoteTime:0.0,throttleTime:0,responseQueueTime:0.04,sendTime:0.118,securityProtocol:SASL_SSL,principal:User:Alice,listener:SASL_SSL,clientInformation:ClientInformation(softwareName=apache-kafka-java, softwareVersion=2.7.0-SNAPSHOT) (kafka.request.logger)
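The request logger is very chatty, so it pays to give it its own small rolling file. A possible configuration sketch follows; the appender name, path, and sizes are illustrative assumptions, not defaults:

```properties
# Illustrative only: requestAppender name, path, and sizes are assumptions.
log4j.appender.requestAppender=org.apache.log4j.RollingFileAppender
log4j.appender.requestAppender.File=/var/log/kafka/kafka-request.log
# Keep the retention window deliberately small; this logger generates a lot of data.
log4j.appender.requestAppender.MaxFileSize=50MB
log4j.appender.requestAppender.MaxBackupIndex=3
log4j.appender.requestAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.requestAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

# Use TRACE instead of DEBUG for full request details, at even higher volume.
log4j.logger.kafka.request.logger=DEBUG, requestAppender
log4j.additivity.kafka.request.logger=false
```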
Kafka's logs are bound to be useful to you, whether you use them for compliance, debugging, or anomaly detection. However, there are a few things to keep in mind:
- If you're logging to files on each broker, ensure there is sufficient disk space and an appropriate Log4j retention policy so the logs don't fill the disks.
- The request logger in particular can generate a lot of data very quickly, so consider enabling it only for debugging purposes or using a very small retention window.
- Kafka's application logs are per broker. To get a complete picture of what is happening in your system, consolidate the logs from all brokers in the cluster, for example with a log analysis and observability framework such as the ELK stack (Elasticsearch, Logstash, and Kibana). You could even direct the log output to a Kafka topic in another cluster and perform streaming analysis on your audit events.
Much of the security material we've discussed so far helps to protect your data, your system, and its resources when particular events occur. Authentication helps identify and verify clients as connections are opened. Authorization helps secure access to one or more resources with each request. But if something bad were to happen, let's say a rogue client spawns multiple topics and produces fake messages, or someone repeatedly tries to open many connections in quick succession using invalid credentials in an attempt to exhaust broker resources, how would you know your system has been targeted, who was involved, and the sequence of events that led to your current unfortunate state? Going further, is there any way to anticipate or prevent future attacks as new threats emerge? This is where audit logs can help. An audit log is a record of everything that has happened on your system. With an audit log, you can improve your security posture. Number one: gain insight. Logs provide insight into what is happening in your system. Just rolled out a new access control list for a new group? Using audit logs, you can go back and see if users have been able to authenticate and if they're authorized to the correct resources. Number two: improve security. By monitoring and validating operations, you can detect unauthorized operations in the historical record and trace any other interactions associated with the client or host attempting unauthorized operations. As you look at the logs, are there things that stick out or seem strange? With logs, you can identify anomalies and take action as quickly as possible. Number three: understand impact. You can use the logs to debug client interactions and spot unusual activities and system metrics. The logs can help you see who and what is impacted if there are issues.
If there's an ongoing issue, the log allows you to see who and what services are impacted and what got you to that point, and then allows you to communicate with stakeholders as the situation progresses. Number four: prove compliance. Logs can help you generate audit reports in line with internal policies and external regulations, and create an official record if there is a security breach. Apache Kafka doesn't provide anything called an audit log directly out of the box. It does, however, support comprehensive application logging using Log4j. Log4j can be configured with multiple appenders that deliver log events from different logger instances to different destinations, typically files. There are two logger instances: kafka.authorizer.logger, which is used for authorization logging, and kafka.request.logger. By configuring separate appenders for the kafka.authorizer.logger and kafka.request.logger logger instances in the log4j.properties file on each broker, you can capture detailed authorization and request information for auditing, monitoring, and debugging purposes. The authorization logger generates INFO entries for operations that are denied and DEBUG entries for operations for which access was granted. Both types of entry include details of the user principal, the client host, the operation being attempted, and the resource, such as a topic, to which the operation was directed. The request logger adds details of the user principal and the client host at the DEBUG level and full details of the request when logging at the TRACE level. Kafka's application logging provides a solid basis for creating and analyzing audit logs to meet your specific audit needs, whether it's anomaly detection, debugging, or compliance. Of course, there are a few things to be aware of. If you're logging to files on each broker, you should ensure there's sufficient disk space and an appropriate Log4j retention policy so that you don't fill the disks.
The request logger in particular can generate a lot of data very quickly, so you may want to consider enabling it only for debugging purposes or using a very small retention window. Kafka's application logs are per broker. To get a complete picture of what is happening in your system, you'll need to consolidate the logs from all brokers in a cluster. We recommend using a commercial or open source log analysis and observability framework, such as the ELK stack, that is, Elasticsearch, Logstash, and Kibana, to help analyze and visualize this consolidated output. You could even direct the log output to a Kafka topic in another cluster and then perform streaming analysis on your audit events.
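That last idea, directing log output to a Kafka topic, can be sketched with the KafkaLog4jAppender from the kafka-log4j-appender artifact. The broker list and topic name below are placeholders, and the exact property names should be checked against the appender version you use:

```properties
# Illustrative sketch: bootstrap address and topic name are placeholders.
log4j.appender.kafkaAppender=org.apache.kafka.log4jappender.KafkaLog4jAppender
log4j.appender.kafkaAppender.brokerList=audit-cluster:9092
log4j.appender.kafkaAppender.topic=audit-events
log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

# Send authorization decisions to the audit topic in the other cluster.
log4j.logger.kafka.authorizer.logger=DEBUG, kafkaAppender
```

From there, a consumer or a stream processing job on the audit-events topic can analyze audit entries as they arrive.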