We're going to build a pipeline to create both a staging and a production Confluent Cloud data streaming platform.
In true GitOps fashion, instead of using the web UI or the CLI, we're going to use Terraform and GitHub Actions to build a CI/CD pipeline and automate the deployment.
You will need a Confluent Cloud account, a GitHub account, and an AWS account to store the Terraform state (though you can adapt the Terraform code to use a different backend). Make sure to use the promo code at the bottom of this page; it will give you enough credit for this exercise!
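If you want to check your toolbox before starting, the CLIs used throughout this exercise are quick to verify (a minimal sanity check, nothing more):
# Confirm the tools used in this exercise are installed
terraform version
aws --version
git --version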
First, create a directory for this exercise.
mkdir data-streaming-platform-pipeline && cd data-streaming-platform-pipeline
We need a Confluent Cloud API key, which Terraform will use to create and update all Confluent Cloud resources programmatically.
We're going to create this API key via the Confluent CLI; you can install it with brew install confluentinc/tap/cli or follow the instructions here.
If you prefer, you can also use the Confluent Cloud UI and head over to "Cloud API Keys" in the Administration section. Just make sure to select "Granular access" so that a service account gets created.
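Whichever route you choose, the CLI commands below assume you're authenticated against your Confluent Cloud organization. A quick sketch:
# Log in to Confluent Cloud and cache the credentials locally
confluent login --save
# Sanity check: list the environments you have access to
confluent environment list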
Let's create the service account with the CLI:
confluent iam service-account create terraform_runner --description "Service Account used by Terraform to create and update Confluent Cloud resources"
This is the output you should get:
+-------------+--------------------------------+
| ID          | sa-56q6kz                      |
| Name        | terraform_runner               |
| Description | Service Account used by        |
|             | Terraform to create and update |
|             | Confluent Cloud resources      |
+-------------+--------------------------------+
We must also grant this account the OrganizationAdmin role. Use the service account ID you just created in the command below:
confluent iam rbac role-binding create --principal User:sa-56q6kz --role OrganizationAdmin
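To double-check that the role binding took effect, you can list the bindings attached to the service account (using the example ID from above):
# List the role bindings for the terraform_runner service account
confluent iam rbac role-binding list --principal User:sa-56q6kz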
Next, let's create an API key; again, use your service account ID:
confluent api-key create --service-account sa-56q6kz --resource cloud --description "Confluent Cloud API key for terraform_runner service account"
You should get something along these lines (note that I deleted this key long before publishing this exercise). It may take a couple of minutes for the API key to be ready. Save the API key and secret now; the secret is not retrievable later.
+------------+------------------------------------------------------------------+
| API Key    | DIK7CITBPM5L6KC6                                                 |
| API Secret | RV+GNsKQT6xLjI82wo1ItakGzFeTNwYUzBVKaWvOME+Hh4vdO2ZZp98r6HDzkfmh |
+------------+------------------------------------------------------------------+
Great, we now have an API key that we can use to run Terraform from within the CI/CD pipeline. Keep it in a safe place; we're going to use it in the next steps.
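If you also want to experiment locally, one option is to expose the key and secret as TF_VAR_-prefixed environment variables, which Terraform picks up automatically and maps to the confluent_cloud_api_key and confluent_cloud_api_secret variables we'll declare below (values are placeholders):
# Terraform automatically reads TF_VAR_<variable_name> from the environment
export TF_VAR_confluent_cloud_api_key="<your-cloud-api-key>"
export TF_VAR_confluent_cloud_api_secret="<your-cloud-api-secret>"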
Now, create a subdirectory called staging:
mkdir staging && cd staging
In this directory, create a main.tf file with the following content:
# Configure the Confluent provider
terraform {
  required_providers {
    confluent = {
      source  = "confluentinc/confluent"
      version = "1.47.0"
    }
  }
}

# Configure the Terraform backend to store the Terraform state
terraform {
  backend "s3" {
    bucket  = "platform-engineering-terraform-state"
    key     = "terraform/all-state/data-streaming-platform.tfstate"
    region  = "us-east-1"
    encrypt = true
  }
}

variable "confluent_cloud_api_key" {
  description = "Confluent Cloud API Key (also referred to as Cloud API ID)"
  type        = string
  sensitive   = true
}

variable "confluent_cloud_api_secret" {
  description = "Confluent Cloud API Secret"
  type        = string
  sensitive   = true
}

variable "confluent_cloud_environment_name" {
  description = "Name of the Confluent Cloud environment to create"
  type        = string
  default     = "unnamed"
}

provider "confluent" {
  cloud_api_key    = var.confluent_cloud_api_key
  cloud_api_secret = var.confluent_cloud_api_secret
}

resource "confluent_environment" "env" {
  display_name = var.confluent_cloud_environment_name
}

resource "confluent_kafka_cluster" "standard" {
  display_name = "standard_kafka_cluster"
  availability = "SINGLE_ZONE"
  cloud        = "AWS"
  region       = "us-east-2"
  standard {}

  environment {
    id = confluent_environment.env.id
  }
}

# We use a dedicated service account, called 'platform-manager', per environment
resource "confluent_service_account" "platform-manager" {
  display_name = "platform-manager-${confluent_environment.env.display_name}"
  description  = "Service account to manage the platform"
}

# The 'platform-manager' account is a cloud cluster admin.
resource "confluent_role_binding" "platform-manager-kafka-cluster-admin" {
  principal   = "User:${confluent_service_account.platform-manager.id}"
  role_name   = "CloudClusterAdmin"
  crn_pattern = confluent_kafka_cluster.standard.rbac_crn
}

resource "confluent_api_key" "platform-manager-kafka-api-key" {
  display_name = "platform-manager-kafka-api-key"
  description  = "Kafka API Key that is owned by 'platform-manager' service account"

  owner {
    id          = confluent_service_account.platform-manager.id
    api_version = confluent_service_account.platform-manager.api_version
    kind        = confluent_service_account.platform-manager.kind
  }

  managed_resource {
    id          = confluent_kafka_cluster.standard.id
    api_version = confluent_kafka_cluster.standard.api_version
    kind        = confluent_kafka_cluster.standard.kind

    environment {
      id = confluent_environment.env.id
    }
  }

  # The goal is to ensure that confluent_role_binding.platform-manager-kafka-cluster-admin is created
  # before confluent_api_key.platform-manager-kafka-api-key is used to create instances of
  # confluent_kafka_topic and confluent_kafka_acl resources.
  # The 'depends_on' meta-argument is specified here to avoid duplicating this dependency in every
  # confluent_kafka_topic and confluent_kafka_acl resource.
  depends_on = [
    confluent_role_binding.platform-manager-kafka-cluster-admin
  ]
}
If we wanted to, we could install Terraform locally and run the Terraform code from there, but that's exactly what we want to avoid when doing GitOps. Instead, we want a visible, repeatable, audited mechanism to deploy our data streaming platform. It's time for automation!
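That said, a local terraform plan is a harmless way to sanity-check the configuration before wiring up the pipeline, since plan is read-only. A sketch, run from the staging directory, assuming your AWS credentials and the TF_VAR_ variables from earlier are set in your shell:
terraform init -backend-config="bucket=platform-engineering-terraform-state"
terraform validate
# Read-only preview; no resources are created
terraform plan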
First, create a .github/workflows directory in your repository:
cd .. && mkdir -p .github/workflows && cd .github/workflows
In this .github/workflows directory, we're going to create three GitHub Actions workflow files: ci.yml, cd.yml, and promote.yml.
The ci.yml file contains the CI workflow, which runs when you open or update a pull request. It runs a lint job and posts the Terraform plan as a PR comment:
name: CI
on:
  pull_request:
    types:
      - opened
      - synchronize
    paths-ignore:
      - '**/README.md'

env:
  # We're using AWS S3 as the Terraform backend
  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
  TF_BUCKET_STATE: ${{ vars.TF_BUCKET_STATE }}
  # Credentials for Confluent Cloud
  TF_VAR_CONFLUENT_CLOUD_API_KEY: ${{ secrets.TF_VAR_CONFLUENT_CLOUD_API_KEY }}
  TF_VAR_CONFLUENT_CLOUD_API_SECRET: ${{ secrets.TF_VAR_CONFLUENT_CLOUD_API_SECRET }}
  # Tell Terraform it's running in CI/CD
  TF_IN_AUTOMATION: true

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        name: Checkout source code

      - uses: actions/cache@v3
        name: Cache plugin dir
        with:
          path: ~/.tflint.d/plugins
          key: tflint-${{ hashFiles('.tflint.hcl') }}

      - uses: terraform-linters/setup-tflint@v3
        name: Setup TFLint
        with:
          tflint_version: v0.44.1

      - name: Show version
        run: tflint --version

      - name: Init TFLint
        run: tflint --init
        env:
          GITHUB_TOKEN: ${{ github.token }}

      - name: Run TFLint
        run: tflint -f compact

  terraform_plan:
    needs: [lint]
    name: "Terraform Plan"
    runs-on: ubuntu-latest
    defaults:
      run:
        shell: bash
        working-directory: ./staging
    steps:
      - name: Checkout the repository to the runner
        uses: actions/checkout@v3

      - name: Setup Terraform with specified version on the runner
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.5.4

      - name: Terraform init
        id: init
        run: terraform init -backend-config="bucket=$TF_BUCKET_STATE"

      - name: Terraform validate
        id: validate
        run: terraform validate

      - name: Terraform plan
        id: plan
        run: terraform plan -no-color -input=false -var="confluent_cloud_api_key=$TF_VAR_CONFLUENT_CLOUD_API_KEY" -var="confluent_cloud_api_secret=$TF_VAR_CONFLUENT_CLOUD_API_SECRET"
        continue-on-error: true

      - uses: actions/github-script@v6
        env:
          PLAN: "terraform\n${{ steps.plan.outputs.stdout }}"
        with:
          script: |
            const output = `#### Environment staging (CI)
            #### Terraform Initialization ⚙️ \`${{ steps.init.outcome }}\`
            #### Terraform Validation 🤖 \`${{ steps.validate.outcome }}\`
            #### Terraform Plan 📖 \`${{ steps.plan.outcome }}\`
            <details><summary>Show Plan</summary>

            \`\`\`\n
            ${process.env.PLAN}
            \`\`\`

            </details>

            *Pushed by: @${{ github.actor }}, Action: \`${{ github.event_name }}\`*`;
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: output
            })

      - name: Terraform Plan Status
        if: steps.plan.outcome == 'failure'
        run: exit 1
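You can mirror the lint job locally before opening a pull request; a sketch, run from the repository root and assuming you've installed tflint (for example with brew install tflint):
# Run the same TFLint checks the CI workflow runs, without changing your working directory
(cd staging && tflint --init && tflint -f compact)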
Next, copy the following GitHub Actions workflow into a cd.yml file. It contains the CD workflow, which runs when a pull request is merged and applies the Terraform plan to all environments.
name: CD
on:
  pull_request:
    types:
      - closed
    paths-ignore:
      - '**/README.md'

env:
  # We're using AWS S3 as the Terraform backend
  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
  TF_BUCKET_STATE: ${{ vars.TF_BUCKET_STATE }}
  # Credentials for Confluent Cloud
  TF_VAR_CONFLUENT_CLOUD_API_KEY: ${{ secrets.TF_VAR_CONFLUENT_CLOUD_API_KEY }}
  TF_VAR_CONFLUENT_CLOUD_API_SECRET: ${{ secrets.TF_VAR_CONFLUENT_CLOUD_API_SECRET }}
  # Tell Terraform it's running in CI/CD
  TF_IN_AUTOMATION: true

jobs:
  deploy_streaming_platform:
    name: "Deploy ${{ matrix.environment }} platform"
    if: github.event.pull_request.merged
    strategy:
      max-parallel: 1
      matrix:
        environment: [ staging, prod ]
    runs-on: ubuntu-latest
    steps:
      - name: Checkout the repository to the runner
        uses: actions/checkout@v3

      - name: Check terraform files existence
        id: check_files
        uses: andstor/file-existence-action@v2
        with:
          files: "${{ matrix.environment }}/*.tf"
          fail: true

      - name: Setup Terraform with specified version on the runner
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.5.4

      - name: Terraform init
        id: init
        run: terraform init -input=false -backend-config="bucket=$TF_BUCKET_STATE"
        working-directory: ${{ matrix.environment }}

      - name: Terraform validate
        id: validate
        run: terraform validate
        working-directory: ${{ matrix.environment }}

      - name: Terraform workspace
        id: workspace
        run: terraform workspace select -or-create ${{ matrix.environment }}
        working-directory: ${{ matrix.environment }}

      - name: Terraform plan
        id: plan
        run: >
          terraform plan -no-color -input=false
          -var="confluent_cloud_api_key=$TF_VAR_CONFLUENT_CLOUD_API_KEY"
          -var="confluent_cloud_api_secret=$TF_VAR_CONFLUENT_CLOUD_API_SECRET"
          -var="confluent_cloud_environment_name=${{ matrix.environment }}"
        continue-on-error: true
        working-directory: ${{ matrix.environment }}

      - uses: actions/github-script@v6
        env:
          PLAN: "terraform\n${{ steps.plan.outputs.stdout }}"
        with:
          script: |
            const output = `#### Environment ${{ matrix.environment }} (CD)
            #### Terraform Initialization ⚙️ \`${{ steps.init.outcome }}\`
            #### Terraform Validation 🤖 \`${{ steps.validate.outcome }}\`
            #### Terraform Plan 📖 \`${{ steps.plan.outcome }}\`
            <details><summary>Show Plan</summary>

            \`\`\`\n
            ${process.env.PLAN}
            \`\`\`

            </details>

            *Pushed by: @${{ github.actor }}, Action: \`${{ github.event_name }}\`*`;
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: output
            })

      - name: Terraform Apply
        if: github.event.pull_request.merged && github.event.pull_request.base.ref == 'main'
        run: >
          terraform apply -auto-approve -input=false
          -var="confluent_cloud_api_key=$TF_VAR_CONFLUENT_CLOUD_API_KEY"
          -var="confluent_cloud_api_secret=$TF_VAR_CONFLUENT_CLOUD_API_SECRET"
          -var="confluent_cloud_environment_name=${{ matrix.environment }}"
        working-directory: ${{ matrix.environment }}
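Note the workspace step: both environments share the same backend bucket and state key, and Terraform workspaces are what keep their state files separate. You can inspect this locally from the repository root once the backend has been initialized; a sketch:
# Each workspace stores its own state under the same S3 backend
(cd staging && terraform workspace list)
# Same command the CD workflow runs for each environment
(cd staging && terraform workspace select -or-create staging)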
Finally, here's the promote.yml file. This workflow runs whenever code is pushed to the main branch and copies the content of the staging directory to the prod directory. We assume that you'll only open pull requests that change the staging directory or the prod/specific directory.
name: Promote staging to prod
run-name: ${{ github.actor }} wants to promote Staging to Prod 🚀
on:
  push:
    branches:
      - main
    paths-ignore:
      - '**/README.md'

jobs:
  promote:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Compare and sync directories
        run: |
          # Exclude the 'specific' directory and compare staging and prod
          DIFF=$(rsync --stats --recursive --dry-run --exclude=specific ./staging/ ./prod/ | awk '/files transferred/{print $NF}')
          # A non-zero DIFF means the directories differ
          if [ "$DIFF" -ne 0 ]; then
            # Copy files from staging to prod, excluding the 'specific' directory
            rsync --checksum --recursive --exclude=specific ./staging/ ./prod/
          fi

      - name: Set current date as env variable
        run: echo "NOW=$(date +'%Y-%m-%d-%H%M%S')" >> $GITHUB_ENV

      - name: Create Pull Request
        uses: peter-evans/create-pull-request@v5
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          commit-message: Sync staging to prod
          title: "Sync staging to prod ${{ env.NOW }}"
          body: This PR is to sync the staging directory to the prod directory.
          branch: "sync-staging-to-prod-${{ env.NOW }}"
We also need a specific subdirectory in each environment directory for environment-specific code; anything inside it escapes the copying from staging to prod done by the promote workflow.
cd ../..
mkdir -p staging/specific && touch staging/specific/.gitkeep
mkdir -p prod/specific && touch prod/specific/.gitkeep
The last file we need is an AWS policy granting read/write access to the S3 bucket that will store the Terraform state. We'll use it later during the initial setup.
Create the following s3_bucket_full_access_policy.json file at the root of your repository:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::platform-engineering-terraform-state",
        "arn:aws:s3:::platform-engineering-terraform-state/*"
      ]
    }
  ]
}
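If you'd like to lint the policy document before creating it, the AWS CLI can run it through IAM Access Analyzer (an optional check; it reports errors and warnings without creating anything):
# Validate the policy document with IAM Access Analyzer
aws accessanalyzer validate-policy --policy-type IDENTITY_POLICY --policy-document file://s3_bucket_full_access_policy.json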
Now, initialize a Git repository and stage your changes:
git init
git add .github staging prod *.json
If you run git status, this is what you should get:
On branch main

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   .github/workflows/cd.yml
        new file:   .github/workflows/ci.yml
        new file:   .github/workflows/promote.yml
        new file:   prod/specific/.gitkeep
        new file:   s3_bucket_full_access_policy.json
        new file:   staging/main.tf
        new file:   staging/specific/.gitkeep
Before committing and pushing those changes, we must allow GitHub Actions to create pull requests. In your freshly created GitHub repository, head to "Settings -> Actions -> General -> Workflow permissions" and enable "Read and write permissions" along with "Allow GitHub Actions to create and approve pull requests".
With these two options enabled, our promote workflow will be able to create pull requests.
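If you prefer scripting over clicking, the same settings can be flipped with the GitHub CLI against the repository settings API (a sketch; replace your_user, and note this assumes the repository already exists on GitHub):
# Grant workflows write permissions and allow them to create pull requests
gh api --method PUT /repos/your_user/data-streaming-platform-pipeline/actions/permissions/workflow -f default_workflow_permissions=write -F can_approve_pull_request_reviews=true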
Run the following AWS CLI commands to create the S3 bucket that stores the Terraform state, plus an IAM user allowed to read and write it (S3 bucket names are globally unique, so you may have to pick a different name and update it everywhere it appears):
aws s3api create-bucket --bucket "platform-engineering-terraform-state"
aws iam create-user --user-name "terraform-user"
aws iam create-policy --policy-name S3FullAccessPolicy --policy-document file://s3_bucket_full_access_policy.json
# Use the policy ARN returned by the previous command, e.g.:
# arn:aws:iam::321751332100:policy/S3FullAccessPolicy
aws iam attach-user-policy --policy-arn arn:aws:iam::321751332100:policy/S3FullAccessPolicy --user-name terraform-user
# Create an access key for terraform-user; we'll store it as a GitHub secret next
aws iam create-access-key --user-name terraform-user
Add the following entries in your GitHub repo settings, under "Secrets and variables > Actions":
- Secrets: AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (the access key of terraform-user), plus TF_VAR_CONFLUENT_CLOUD_API_KEY and TF_VAR_CONFLUENT_CLOUD_API_SECRET (the Confluent Cloud API key and secret created earlier)
- Variable: TF_BUCKET_STATE, set to the name of the S3 bucket (platform-engineering-terraform-state)
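You can add these through the UI, or script them with the GitHub CLI; a sketch with placeholder values (run it inside the repository, or pass --repo your_user/data-streaming-platform-pipeline to each command):
# Secrets consumed by the CI and CD workflows
gh secret set AWS_ACCESS_KEY_ID --body "<access-key-id>"
gh secret set AWS_SECRET_ACCESS_KEY --body "<secret-access-key>"
gh secret set TF_VAR_CONFLUENT_CLOUD_API_KEY --body "<confluent-cloud-api-key>"
gh secret set TF_VAR_CONFLUENT_CLOUD_API_SECRET --body "<confluent-cloud-api-secret>"
# Repository variable holding the state bucket name
gh variable set TF_BUCKET_STATE --body "platform-engineering-terraform-state"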
Now, let's commit and push this code to GitHub (replace your_user with your own GitHub username):
git checkout -b automation
git commit -m "Automate the platform creation"
git remote add origin git@github.com:your_user/data-streaming-platform-pipeline.git
git push -u origin automation
Once you've pushed, open a Pull Request for the automation branch. Doing so will trigger the CI workflow.
The CI workflow lints the Terraform code and posts a preview of the plan directly in the PR as a comment.
When you merge this Pull Request, the promote workflow will trigger and create another Pull Request, which copies the whole content of the staging directory to the prod directory, except for files sitting in the specific subdirectory.
If you open this Pull Request, you can see that the proposed change is the addition of the main.tf file under the prod directory, as expected.
As soon as you merge that promotion Pull Request, the CD workflow will trigger a job that runs Terraform against each environment.
Now, if you head over to Confluent Cloud, you can see that both environments have been created with the exact same configuration.
Congratulations!