Senior Curriculum Developer
One way that Kafka provides security is through built-in authentication. Similar to checking an ID, authentication is the act of verifying the identity of an entity requesting access to a system component. In a Kafka-based system, there are many different interactions that begin with participants authenticating the components with which they are communicating.
For example, when a connection is established between a client (a user, application, or service) and a broker, each side of the connection will usually wish to verify the other. The same holds true when two brokers connect—each may verify the other. A final authentication scenario is a broker accessing ZooKeeper, whereby the broker may be required to authenticate before being allowed to access sensitive cluster metadata.
Internally in Kafka, a client's identity is represented using a KafkaPrincipal object, or principal. So, for example, if you connect to Kafka and authenticate with a username and password, the principal associated with the connection will represent your username.
Your principal is assigned your authorizations in the target system, as we will learn in the Authorization module, and it is also used to log details of any permissible operation you perform—as we will learn in the Audit Logs module.
Note that all requests are assigned a principal, even if authentication hasn't been enabled for a connection. In this case, the principal associated with the connection would be ANONYMOUS, and granting access to this type of user, particularly in a production environment, should generally be avoided.
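As a rough illustration of the idea above, the following Python sketch models a principal as a type plus a name and falls back to an anonymous principal when no authenticated identity is available. It is not Kafka's actual `KafkaPrincipal` class (which is Java); the `Principal` class and the `principal_for` helper are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Principal:
    """Illustrative stand-in for a Kafka principal: a type plus a name."""
    principal_type: str
    name: str

    def __str__(self) -> str:
        return f"{self.principal_type}:{self.name}"


# Principal used when a connection has not been authenticated.
ANONYMOUS = Principal("User", "ANONYMOUS")


def principal_for(authenticated_username: Optional[str]) -> Principal:
    """Hypothetical helper: pick the principal for a connection."""
    if authenticated_username is None:
        return ANONYMOUS
    return Principal("User", authenticated_username)


print(principal_for("alice"))  # User:alice
print(principal_for(None))     # User:ANONYMOUS
```

The point of the sketch is that every request ends up with some principal, so any permissions granted to the anonymous one apply to every unauthenticated connection.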
You can configure authentication for client-broker communication and broker-broker communication independently, but the available authentication mechanisms are the same. Essentially, when configuring the broker, you specify listeners, which determine the hostname or IP address and port that can be used to reach the broker. Each listener also specifies the security protocol needed to authenticate, whether SSL or SASL_SSL. Note that a broker can be configured with more than one listener, i.e., it can use various address/port/security protocol combinations.
The following broker configuration snippet specifies three listeners for a broker: an external network listener, an internal listener, and an inter-broker listener (the inter-broker and internal listeners are configured to use SSL, the external listener SASL_SSL):
listeners=EXTERNAL://:9092,INTERNAL://10.0.0.2:9093,BROKER://10.0.0.2:9094
advertised.listeners=EXTERNAL://broker1.example.com:9092,INTERNAL://broker1.local:9093,BROKER://broker1.local:9094
listener.security.protocol.map=EXTERNAL:SASL_SSL,INTERNAL:SSL,BROKER:SSL
inter.broker.listener.name=BROKER
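To make the relationship between listener names, addresses, and security protocols easier to see, here is a small Python sketch that reads the `listeners` and `listener.security.protocol.map` settings from the snippet and joins them by listener name. The helpers `parse_properties` and `listener_table` are hypothetical and exist only for this illustration; Kafka itself performs this resolution inside the broker.

```python
BROKER_CONFIG = """\
listeners=EXTERNAL://:9092,INTERNAL://10.0.0.2:9093,BROKER://10.0.0.2:9094
listener.security.protocol.map=EXTERNAL:SASL_SSL,INTERNAL:SSL,BROKER:SSL
inter.broker.listener.name=BROKER
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def listener_table(props):
    """Map each listener name to (host, port, security protocol)."""
    protocol_by_name = dict(
        entry.split(":", 1)
        for entry in props["listener.security.protocol.map"].split(",")
    )
    table = {}
    for listener in props["listeners"].split(","):
        name, _, address = listener.partition("://")
        host, _, port = address.rpartition(":")  # host is "" when omitted
        table[name] = (host, int(port), protocol_by_name[name])
    return table


props = parse_properties(BROKER_CONFIG)
for name, (host, port, protocol) in listener_table(props).items():
    role = "inter-broker" if name == props["inter.broker.listener.name"] else "client"
    print(f"{name}: host={host or '(all interfaces)'} port={port} {protocol} [{role}]")
```

Reading the output line by line shows that the listener name is only a label: the protocol map decides how connections on that listener are secured, and `inter.broker.listener.name` picks which listener brokers use to talk to each other.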
Relatedly, here is a config snippet for a client, specifying that SASL_SSL should be used to communicate with the listed bootstrap servers:
security.protocol=SASL_SSL
bootstrap.servers=broker1.example.com:9092,broker2.example.com:9092
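A quick sanity check on a client configuration like this one can be sketched in Python. The `check_client_config` helper below is hypothetical, not part of any Kafka client library; it only verifies that `security.protocol` names one of the two TLS-based protocols and that `bootstrap.servers` is present. It assumes the client default of PLAINTEXT when `security.protocol` is omitted.

```python
CLIENT_CONFIG = """\
security.protocol=SASL_SSL
bootstrap.servers=broker1.example.com:9092,broker2.example.com:9092
"""

# The four security protocols Kafka supports; only the last two use TLS.
ALL_PROTOCOLS = ("PLAINTEXT", "SASL_PLAINTEXT", "SSL", "SASL_SSL")
ENCRYPTED_PROTOCOLS = ("SSL", "SASL_SSL")


def check_client_config(text):
    """Return a list of problems found in a client properties string."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()

    problems = []
    # Assumption: clients fall back to PLAINTEXT when the setting is absent.
    protocol = props.get("security.protocol", "PLAINTEXT")
    if protocol not in ALL_PROTOCOLS:
        problems.append(f"unknown security.protocol: {protocol}")
    elif protocol not in ENCRYPTED_PROTOCOLS:
        problems.append(f"{protocol} does not encrypt traffic")
    if not props.get("bootstrap.servers"):
        problems.append("bootstrap.servers is missing")
    return problems


print(check_client_config(CLIENT_CONFIG))  # []
print(check_client_config("bootstrap.servers=broker1.example.com:9092"))
```

A check like this does not prove the connection will succeed, since the protocol must also match the listener the client reaches on the broker, but it catches the common mistake of leaving a client on an unencrypted protocol.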
It's a staple of every crime show on TV. Our detective visits a jumpy witness who squints warily through a crack in the door. "How do I know you're the detective?" The detective flashes a badge, and the witness grudgingly opens the door. This is authentication. Before I even decide to let you in, I need to know who you are. More than that, I need some proof that confirms your identity. Authentication, then, is all about establishing identity.
In the context of a Kafka-based system, there are many different interactions that begin with participants verifying the identity of the components with which they're communicating. When a connection is established between a client (whether a user, application, or service) and a broker, the broker may seek to identify the client opening the connection, while the client in turn may want to confirm the identity of the broker to which it is connecting. Likewise, when brokers connect to one another to replicate data, they can be configured to confirm each other's identities. And finally, if you're using ZooKeeper in your setup, you may want to secure the broker-ZooKeeper interactions by requiring brokers to authenticate to ZooKeeper before accessing any sensitive cluster metadata.
Internally, Kafka represents a client's identity using an object called a KafkaPrincipal. If you connect to Kafka and authenticate using a username and password, for example, the KafkaPrincipal associated with the connection will represent your username. It tells Kafka you are who you say you are. Later, in the Authorization module, we'll see how Kafka uses this identity, the principal, to determine what you're allowed to do, and then in the Audit Logs module we'll see how principals are used to log details of the user or application performing an operation that has been secured using Kafka's access control mechanisms.
Kafka uses principals to associate users with requests even if authentication has not been configured for a connection.
When authentication has not been enabled, the principal associated with the connection is ANONYMOUS. In production environments, avoid granting access to anonymous users unless the intention is to give everyone permission to access the broker. Generally, this isn't a good idea.
There are two types of interaction that you can authenticate with Kafka: clients (whether users, applications, or services) communicating with Kafka brokers, and brokers communicating with other brokers. You can configure authentication for these two types of interaction independently of one another, but the authentication mechanisms you can use are the same for both.
So how do you actually configure authentication in a Kafka-based system? There are two things you need to know about here: listeners and security protocols. When you configure a broker, you specify one or more listeners, which determine the hostnames or IP addresses and ports clients can use to reach the broker. Each listener also specifies a security protocol that will be used to authenticate connection requests. Kafka supports four different security protocols: PLAINTEXT, SASL_PLAINTEXT, SSL, and SASL_SSL. Only SSL and SASL_SSL are secure. Since this course is about securing Kafka, we'll only be taking a look at SSL and SASL_SSL.
A broker can be configured with more than one listener. That is, there may be several address, port, and security protocol combinations that can be used to reach the broker. The snippet of broker configuration shown here configures three listeners for a broker: a listener for connections from an external network, a listener for connections from an internal network, and a listener for inter-broker communications. The inter-broker and internal listeners are configured to use SSL. The external listener uses SASL_SSL. This snippet of client configuration specifies that the SASL_SSL security protocol should be used to communicate with the listed bootstrap servers.
Now that we have a good understanding of the basics of authentication, in the next video we'll take an in-depth look at Kafka's two secure protocols, SSL and SASL_SSL.