course: Apache Kafka® Security

Securing ZooKeeper

7 min

Dan Weston

Senior Curriculum Developer


If you've chosen to run Kafka with ZooKeeper, you will also need to consider how to secure ZooKeeper, because it stores a lot of important cluster configuration and security information: ACLs, the list of a cluster's broker members, and partition metadata. If you are using Kafka's SASL/SCRAM provider for authentication, it also stores encrypted versions of user passwords. (Note that if you are using KRaft, you can skip this module entirely.)

Similar to Kafka, there are two ways to secure ZooKeeper: SSL and SASL. No matter which one you use, you’ll need to update your broker configuration with zookeeper.set.acl=true, which makes sure that secure ZooKeeper ACLs are attached to the metadata the brokers store there. These ACLs specify that the metadata can be read by everyone but only changed by the brokers. Sensitive metadata, such as SCRAM credentials, is an exception: by default it can only be read by the brokers.
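
As a minimal sketch of the setting described above, a broker's server.properties might contain the following fragment. The hostname and port are placeholders, not values from this course; only zookeeper.set.acl=true comes from the text.

```properties
# server.properties on each Kafka broker (fragment)

# Placeholder connection string; substitute your own ZooKeeper ensemble.
zookeeper.connect=zk1.example.com:2181

# Attach secure ACLs to the metadata this broker creates in ZooKeeper.
zookeeper.set.acl=true
```

The setting only has an effect once the brokers authenticate to ZooKeeper with one of the two mechanisms covered next, since the ACLs are tied to the authenticated identity.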

SSL

SSL client authentication in ZooKeeper uses certificates to authenticate ZooKeeper to brokers and vice versa. There is one difference, however, between the usual Kafka SSL authentication and the SSL here: each broker (as well as each CLI tool) accessing ZooKeeper must use the same certificate “Distinguished Name” when accessing it. A certificate’s Distinguished Name is part of the identifying information used to create a certificate. Essentially, the ZooKeeper ACL on a piece of metadata includes the Distinguished Name of the client that created it, and ZooKeeper will only authorize brokers that present the same one. It will authenticate brokers with other Distinguished Names, but it will not authorize them to modify that metadata.

A Distinguished Name usually includes the hostname of the entity to which it has been issued, so as to allow the server (ZooKeeper in this case) to verify the hostname. So if you have multiple brokers on multiple machines (i.e., with different hostnames), their Distinguished Names will differ, and you may find that some of them are unable to access ZooKeeper.

The solution to this problem is either to use a wildcard certificate (which includes a wildcard in the hostname), or to use a certificate with a single hostname plus a subject alternative name (SAN), which includes a list of all of the hostnames of the brokers in the cluster. The single hostname will be used in the ACL, while the SAN allows ZooKeeper to verify the actual hostname of the broker.
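
To make the broker side of this concrete, here is a hedged sketch of the ZooKeeper TLS client settings in a broker's server.properties, assuming Kafka 2.5 or later, where these zookeeper.ssl.* properties are available. The hostname, port, file paths, and passwords are placeholders. The keystore on every broker would hold the certificate with the shared Distinguished Name described above.

```properties
# server.properties on each Kafka broker (fragment)

# Placeholder address of ZooKeeper's TLS port.
zookeeper.connect=zk1.example.com:2182

# Use TLS for the broker-to-ZooKeeper connection (requires the Netty client socket).
zookeeper.ssl.client.enable=true
zookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty

# Keystore holding this broker's certificate; all brokers share the same DN.
zookeeper.ssl.keystore.location=/path/to/broker.keystore.jks
zookeeper.ssl.keystore.password=<keystore-password>

# Truststore used to verify the ZooKeeper server's certificate.
zookeeper.ssl.truststore.location=/path/to/broker.truststore.jks
zookeeper.ssl.truststore.password=<truststore-password>

zookeeper.set.acl=true
```

CLI tools that talk to ZooKeeper need an equivalent configuration with a certificate carrying the same Distinguished Name.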

SASL

Similar to Kafka, ZooKeeper supports SASL, which you are most likely to use if you need to integrate with an existing Kerberos server, such as Active Directory. With SASL, you should still ensure that connections from Kafka brokers to ZooKeeper are encrypted with TLS. Because SASL handles authentication, you can set ssl.clientAuth=none in your ZooKeeper configuration, which lets clients connect to ZooKeeper over a TLS-encrypted connection without presenting a certificate. With SASL for Kerberos, the problem of using the same identity for all brokers for ACL and authorization purposes is easily solved: you just have to make sure that the ZooKeeper client file on each broker is configured with the same principal.
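
The ZooKeeper side of this setup could look roughly like the fragment below. It is a sketch under assumptions rather than a complete configuration: it assumes ZooKeeper 3.5.7 or later (where ssl.clientAuth is available), the port and paths are placeholders, and the Kerberos JAAS configuration for the server and for each broker is not shown.

```properties
# zookeeper.properties (fragment)

# Placeholder TLS port; TLS requires the Netty connection factory.
secureClientPort=2182
serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory

# Server certificate presented to connecting brokers.
ssl.keyStore.location=/path/to/zookeeper.keystore.jks
ssl.keyStore.password=<keystore-password>

# Encrypt connections with TLS, but do not require a client certificate.
ssl.clientAuth=none

# Authenticate clients with SASL instead.
authProvider.sasl=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
```

With this arrangement the identity used in the ACLs is the SASL principal, which is why every broker needs to be configured with the same one.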

SSL vs. SASL

SSL has some administrative overhead, such as generating, installing, and renewing certificates, but it is the most popular and probably the easiest option to use. SASL of course makes sense if you need to integrate with an existing Kerberos server. You can also use both together. In that case, either the SSL identity or the SASL identity can be used to determine access and grant authorization, so clients do not all need to share the same Distinguished Name and each broker's certificate can keep its own hostname in the Distinguished Name.

Network Segmentation

Although the measures we've discussed in this module will help secure ZooKeeper, a final recommendation is to use network segmentation. This way only your brokers and administrative tools will have access to ZooKeeper.


Securing ZooKeeper

A system is only as secure as its weakest component. ZooKeeper stores a lot of important cluster configuration and security information, including things such as a list of brokers that are currently members of the cluster, and Kafka ACLs, which are used for authorizing client requests to Kafka brokers. ZooKeeper is also used to store consumer metadata, such as partition offsets.

With the introduction of its own native consensus implementation based on Raft, called KRaft, Kafka now supports ZooKeeper-less deployments. With KRaft, all of the metadata that was formerly stored in ZooKeeper is now maintained in replicated internal topics. Now when you deploy Kafka, you can choose to run with either the traditional ZooKeeper-based setup or KRaft. If you've chosen to run Kafka with ZooKeeper, this module is for you. If you're using KRaft, you can skip this module. Decided? Okay, let's continue.

As we mentioned in the beginning, in a Kafka cluster that includes ZooKeeper, ZooKeeper is used to store important metadata about your cluster and its clients, things such as ACLs and partition metadata. If you're using Kafka's SASL/SCRAM provider for authentication, then ZooKeeper is also being used to store encrypted versions of user passwords. This is all very important metadata. Any inappropriate access or manipulation of this metadata could seriously disrupt or compromise your cluster. If you're securing Kafka, then you should also be securing ZooKeeper. If ZooKeeper is left unsecured, then no matter how secure the rest of the system, it's still susceptible to being compromised.

As with Kafka, there are two different ways of securing ZooKeeper: SSL client authentication and SASL. Irrespective of the authentication mechanism you use, update your broker configuration with zookeeper.set.acl=true. This ensures that secure ZooKeeper Access Control Lists, ACLs, are associated with the metadata in ZooKeeper. These ACLs in turn ensure that the metadata can be read by everyone but modified only by the brokers. The exception here is sensitive metadata, such as SCRAM credentials, which by default can be read only by the brokers.

ZooKeeper's SSL client authentication mechanism uses certificates to authenticate ZooKeeper to brokers and the brokers to ZooKeeper. There's one wrinkle with the setup, however, that distinguishes it from SSL client authentication in Kafka. Every broker, and in fact, every CLI tool, must use the same Distinguished Name when accessing ZooKeeper. A certificate's Distinguished Name is the identifying information used to create the certificate and is part of the certificate. The reason the Distinguished Name, or DN, must be the same for all brokers is that the DN is included in the ZooKeeper ACL, and ZooKeeper only authorizes identities that appear in the ACL. Effectively, it only authorizes that Distinguished Name. ZooKeeper will quite happily authenticate brokers with different Distinguished Names, but it will only authorize brokers with the same Distinguished Name. If broker A creates a piece of metadata in ZooKeeper, broker B will only be authorized to modify it if it shares the same DN as broker A.

A Distinguished Name typically includes the hostname of the entity to which it has been issued so as to allow the server, ZooKeeper in this instance, to verify the hostname of the client making the request. If you have multiple brokers in your cluster running on different machines, then you'll have different hostnames. If you allow Distinguished Names to vary among broker certificates, you may find that brokers cannot access ZooKeeper. The solution here is to either use a wildcard certificate, which includes a wildcard in the hostname, or a certificate with a single Distinguished Name plus a Subject Alternative Name, or SAN, containing a list of the actual hostnames of each of the brokers in the cluster. This single shared Distinguished Name will be used in the ACL, while the Subject Alternative Name allows ZooKeeper to verify the actual hostname of the broker issuing a request. It's a simple point. Every client must use the same identity when being authorized by ZooKeeper. It's just that the configuration and administration when using SSL client authentication require a bit of care and attention.

Besides SSL client authentication, ZooKeeper also supports SASL. As with Kafka, the main use case here is integration with an existing Kerberos server, such as Active Directory. If you use SASL, you should also ensure that connections from Kafka brokers to ZooKeeper are encrypted using TLS. Because SASL takes care of authentication, you can set ssl.clientAuth=none in the ZooKeeper configuration. When set to none, ZooKeeper allows clients to connect using a TLS-encrypted connection without having to present their own certificate. With SASL and Kerberos, the problem of using the same identity for all brokers for ACL and authorization purposes is easily solved. All you have to do is ensure the ZooKeeper client file on each broker is configured with the same principal.

Which should you choose, SSL client authentication or SASL? SSL client authentication has some administrative overhead in terms of generating and installing certificates. You'll also need to periodically renew the certificates before they expire to prevent TLS handshake failures. Nonetheless, this is probably the easiest and the most popular option. SASL makes sense if you're already using Kerberos in your environment. If you go down this route, you'll need to integrate with your Kerberos server and ensure that these additional interactions are locked down.

You can, in fact, use both. If you're using SSL client authentication in conjunction with SASL, then either the SSL identity or the SASL identity may be used to determine access and grant authorization, so clients don't all have to use the same Distinguished Name and therefore you can use hostnames in the Distinguished Name. While the measures we've discussed here will help secure ZooKeeper, it is highly recommended that you also use network segmentation so that only your brokers and administrative tools have access to ZooKeeper.
