Senior Curriculum Developer
Your identities will need access to Confluent Cloud, whether to create applications or to send and receive data. Allowing them to access your cluster as easily as possible is important to how well your business runs. At the same time, your access model needs to be structured so that you can easily add, remove, change, and verify permissions.
Authorization is where you make sure that your authenticated users have the access they should have, no more and no less. It is not unheard of to have to authorize hundreds, thousands, or even tens of thousands of identities. If you’ve ever used the open source implementation of Kafka, you may be familiar with the burdensome process of creating an LDAP store to configure your group, role, and user hierarchies, then applying ACLs based on the group and role hierarchies, and finally implementing a custom authorizer that pairs the groups and users in LDAP with those in Kafka.
None of that extra effort is necessary with Confluent Cloud. Instead, there are two methods for authorizing your identities: access control lists (ACLs), and role-based access control (RBAC).
ACLs are tables that store identities and what they can do or see (the resources they can access and permissions they have) within Confluent Cloud. The important thing to remember is that permissions are tied to each identity and linked to the access they have been given for each resource. If an identity changes teams or scope, you will have to make sure to address things at an individual level. If at any point even one of these changes is missed, or configured incorrectly, you now have identities that have access to more than what they should.
This problem is only made worse as the number of identities in your organization increases. Verifying permissions for compliance with laws and regulations can become quite a labor-intensive and time-consuming process, not to mention a potential security risk.
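To make the per-identity bookkeeping concrete, here is a minimal Python sketch of an ACL table. It is only an illustration of the idea described above, not Confluent Cloud's implementation or API: the `AclEntry` type, the helper functions, and the second user `User:samir` are all hypothetical.

```python
from typing import List, NamedTuple


class AclEntry(NamedTuple):
    """One row of a hypothetical ACL table."""
    principal: str   # the identity, e.g. "User:milton"
    resource: str    # the Kafka resource, e.g. "Topic:purchases"
    operation: str   # what the identity may do, e.g. "READ"


# Every permission is a separate row tied to one identity.
acls = {
    AclEntry("User:milton", "Topic:purchases", "READ"),
    AclEntry("User:samir", "Topic:purchases", "READ"),  # hypothetical second user
}


def is_allowed(principal: str, resource: str, operation: str) -> bool:
    """Access is granted only if an exact matching row exists."""
    return AclEntry(principal, resource, operation) in acls


def grant(principal: str, resource: str, operation: str) -> None:
    """Add one row for one identity."""
    acls.add(AclEntry(principal, resource, operation))


def entries_for(principal: str) -> List[AclEntry]:
    """Everything a single identity can do; this is what an audit must check."""
    return sorted(e for e in acls if e.principal == principal)


# Giving the group access to a second topic means one new row per identity.
for user in ("User:milton", "User:samir"):
    grant(user, "Topic:returns", "READ")
```

The table grows with identities multiplied by resources, and removing access when someone changes teams means finding and deleting every row returned by `entries_for`. That is the overhead this section describes.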
ACLs are specific to Kafka resources and don’t extend to other Confluent Cloud concepts, such as environments and organizations. Managing ACLs for a small number of identities likely isn’t a big deal. However, if you are working with a large organization with hundreds or thousands of identities, using ACLs doesn’t scale. You’re left with the second option for organizing identities, role-based access control.
RBAC allows you to configure predefined roles within your organization. Identities are assigned to a role and, based on that role, gain access to an organization, environment, cluster, or specific Kafka resources such as topics, consumer groups, and transactional IDs.
For example, imagine a user, Milton, who is part of a group of users that require access to read data from your Purchases topic. Using ACLs, Milton’s identity is granted permission to read that data. If tomorrow Milton also needs to read data from a topic called Returns, that permission must be added to Milton’s user identity (and to the identity of everyone else who now needs access).
Using RBAC, Milton is assigned to the DeveloperRead role. Each identity assigned to the DeveloperRead role can read data from the Purchases topic. Adding access to the Returns topic is a matter of changing the DeveloperRead role scope to include the Returns topic; every identity within that role then gains the correct access. Unlike ACLs, RBAC integrates with a centralized identity management system and scales much more simply for large organizations. From a compliance perspective, it’s safer and simpler to verify your RBAC roles than to attempt to confirm each individual ACL identity.
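The Milton example can be sketched in a few lines of Python. This mirrors the simplified picture in the text (a role with members and a scope) and is not Confluent Cloud's role-binding API; the dictionaries, the `can_read` helper, and the second user `User:samir` are assumptions made for illustration. Only the role name DeveloperRead comes from the text.

```python
# Which resources each role covers (the role's scope).
role_scopes = {"DeveloperRead": {"Topic:purchases"}}

# Which identities are assigned to each role.
role_members = {"DeveloperRead": {"User:milton", "User:samir"}}  # samir is hypothetical


def can_read(principal: str, resource: str) -> bool:
    """An identity can read a resource if any role it holds covers that resource."""
    return any(
        principal in role_members.get(role, set()) and resource in scope
        for role, scope in role_scopes.items()
    )


# Before the change, nobody in the role can read the Returns topic.
before = can_read("User:milton", "Topic:returns")

# One change to the role's scope, with no per-identity edits.
role_scopes["DeveloperRead"].add("Topic:returns")

# Every member of the role now has the new access.
after = all(can_read(user, "Topic:returns") for user in role_members["DeveloperRead"])
```

The design point is that access is audited and changed in one place, the role, rather than once per identity.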
Confluent Cloud has the following roles preconfigured. You may want to keep a copy of this table to reference as you begin to plan your access architecture, and consult the documentation for more in-depth details on each of these roles.
Also, there are a couple of things to keep in mind as you use RBAC in Confluent Cloud:
If you use OAuth for authentication you will be creating identity pools for your principals. There are two parts to every identity pool: who can use the pool, and what the pool can access. The “who” is a set of conditions that the identity needs to satisfy in order to use the pool. The “what” is defined by ACLs and RBAC roles.
You can use a mix of ACLs and RBAC. This may be helpful if you need to provide a small set of identities with access to a resource in your Confluent Cloud cluster. However, as an investment in the future, we recommend going with RBAC over ACLs.
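The two-part structure of an identity pool mentioned above (who can use it, and what it can access) can be pictured with a short Python sketch. It is a hedged illustration only: the `IdentityPool` class, the `aud` claim check, and the pool name are hypothetical, and a plain Python predicate stands in for whatever condition syntax Confluent Cloud actually uses.

```python
from dataclasses import dataclass, field
from typing import Callable, Mapping, Set, Tuple

Claims = Mapping[str, str]  # claims carried by an identity's OAuth token


@dataclass
class IdentityPool:
    name: str
    # The "who": conditions the identity must satisfy to use the pool.
    who: Callable[[Claims], bool]
    # The "what": permissions attached to the pool (standing in for ACLs and roles).
    what: Set[Tuple[str, str]] = field(default_factory=set)

    def authorize(self, claims: Claims, operation: str, resource: str) -> bool:
        """Allow only if the identity matches the pool and the pool holds the permission."""
        return self.who(claims) and (operation, resource) in self.what


# Hypothetical pool: any token issued for the "purchases-app" audience may read Purchases.
purchases_readers = IdentityPool(
    name="purchases-readers",
    who=lambda claims: claims.get("aud") == "purchases-app",
    what={("READ", "Topic:purchases")},
)
```

Permissions are attached to the pool rather than to each identity, so any identity whose token satisfies the condition inherits the same access.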
As both ACLs and RBAC provide authorization, there is an order of precedence in granting access:
Your identities, whether developers, producers, or consumers, need access to Confluent Cloud to create applications or to send and receive data that keeps your business and information flowing. Allowing them to have access to your cluster as easily as possible is important to how well the business runs. At the same time, your access model needs to be secured in such a way that you can easily add, remove, change, and verify permissions. While we've talked about authentication, getting your identities into the system, authorization is the process of making sure that everyone has the access they should, no more and no less. A good example of authorization is when you go to an airport to get on an airplane. In order to get past security, you have to have a valid plane ticket and form of identification. It doesn't particularly matter which ticket, other than it is one that flies out of that airport and within a reasonable timeframe. This is the authentication step we talked about earlier in the course. Once you get to the gate, you present your credentials, in this case, your ticket, and they let you board and sit in your assigned seat. You can't use your ticket to access other planes or other seats, only what has been assigned to you. This is the authorization stage. You are only authorized to board one plane and sit in one seat. This is important. As a passenger, you don't have access to the cockpit or other restricted areas of the plane. This is an important concept in computer security commonly referred to as the principle of least privilege: giving our identities minimal privileges based on their job necessities. Authorizing hundreds, thousands, or even tens of thousands of identities isn't unheard of, especially when talking about users and service accounts.
If you've ever used the open source implementation of Kafka, you are likely familiar with this issue: having to create an LDAP store to configure your group, role, and user hierarchies, then applying access control lists based on the group and role hierarchies, and finally implementing a custom authorizer that pairs the groups and users in LDAP with those in Kafka. None of that extra effort is necessary with Confluent Cloud. Instead, there are two methods for authorizing your identities: Access Control Lists, or ACLs, and Role-Based Access Control, or RBAC. ACLs are tables that list identities and what they can do or see within Confluent Cloud, or said another way, the resources they can access and the permissions they have. So, the user Milton can access the purchases topic to consume events. The important thing to remember is that the permissions are tied to each identity and linked to the access they have been given for each resource. Any time an identity changes teams or scope, you have to make sure to address these things at an individual level. If at any point even one of these changes is missed or configured incorrectly, you now have identities that have access to more than what they should. This problem is only made worse as the number of identities in your organization increases. When we talk about compliance, this can also create a headache, as verifying permissions can become quite a labor-intensive and time-consuming process, not to mention a potential security risk. It is important to remember that ACLs are specific to Kafka resources and don't extend to other Confluent Cloud concepts, such as environments and organizations.
Managing ACLs for a small number of identities likely isn't a big deal. However, if you're working with a large organization, or hundreds or thousands of identities, using ACLs doesn't scale. The overhead to manage ACLs on an individual basis can be painful and lead to mistakes being made, thus making your system less secure, which brings us to the second way of authorizing our principals: Role-Based Access Control, or RBAC. RBAC is based on configuring predefined roles within your organization. Identities are assigned to a role and then gain access to an organization, environment, cluster, or specific Kafka resources like topics, consumer groups, and transactional IDs. So, to use our example from earlier, the user Milton is part of a group of users that have access to read data from our purchases topic, so we add him to the DeveloperRead role for that topic. From a user's perspective, nothing has changed. Using ACLs, he had access to read the data, and using RBAC, he can still read that same data. However, let's say that we have a new topic, returns, that we would like everyone that has access to the purchases topic to also be able to read. Once we change the RBAC scope, every identity within that role gains the correct access. The same goes for when we need to remove access for a set of identities. RBAC, as opposed to ACLs, integrates with a centralized identity management system and allows much simpler scaling for large organizations. From a compliance perspective, it's much easier to verify your RBAC roles to prove compliance than trying to look at each individual ACL identity. Confluent Cloud has the following roles preconfigured. I recommend that you keep a copy of this table to reference as you begin to plan your access architecture. You'll also want to reference the documentation for the more in-depth details on each of these roles. There are a couple of things to keep in mind as you use RBAC in Confluent Cloud.
RBAC for Kafka is available only on Standard and Dedicated clusters. To create service accounts, you must be granted the OrganizationAdmin role. All the other cluster administration roles, EnvironmentAdmin, CloudClusterAdmin, and ResourceOwner, can grant or revoke permissions for existing service accounts on the resources that they control. Role bindings are limited like every other resource on Confluent Cloud, so you'll want to check the official documentation to see the current guidance. You must use version 2.8.1 or later of the Confluent CLI to manage RBAC roles. Some of you may be asking, "Can I use ACLs with RBAC?" Yes, there are times when you may need to provide a small set of identities with access to a resource in your Confluent Cloud cluster. However, as an investment in the future, we recommend going with RBAC over ACLs. As both RBAC and ACLs provide authorization, there is an order of precedence in granting access. This screen is taken directly from the Confluent Cloud documentation, and I recommend referencing it as you create your ACL deny and allow rules. One last consideration. If you're using OAuth for authentication, you will be creating identity pools for your principals. There are two parts to every identity pool: who can access the pool, and what the pool can access. The who is a set of conditions that the identity needs to satisfy in order to use the pool. The what is defined by ACLs and RBAC roles. Now let's take a look at how to set up and manage authorization using Confluent Cloud.