The Problem We All Know Too Well

At Collectors, our Kafka footprint grew faster than our ability to track it. Topics were created ad-hoc through the CLI, retention policies varied by whoever set them up, and when audit time came, we had no clear picture of what existed or why. We needed a better approach—one that gave us visibility, consistency, and version control without slowing down developers.

Enter GitOps for Kafka

GitOps has revolutionized how we manage Kubernetes resources, so why not apply the same principles to Kafka? The core idea is simple yet powerful: treat your Kafka topics, ACLs, users, and configurations as code that lives in Git alongside your application code. Every change goes through a pull request, every modification is tracked, and your Git repository becomes the single source of truth for your Kafka resources.

But here’s where it gets interesting. Instead of learning proprietary tooling or wrestling with Kafka’s native APIs, what if your developers could define Kafka resources using the same declarative YAML format they already know from Kubernetes? This is where Crossplane enters the picture.

Why This Matters for Platform Engineering

Platform engineering is fundamentally about removing friction for developers while maintaining operational excellence. When developers can manage their Kafka resources in the same repository as their application code, using familiar tools and workflows, several things happen:

  • Ownership and autonomy
  • Consistency across environments
  • Self-service without chaos

The Technical Architecture

Our GitOps approach leverages three key components:

  1. Crossplane and its Kafka provider – Crossplane is the control plane that extends Kubernetes to manage external resources; the Kafka provider, from the Upbound Marketplace, teaches it to manage Kafka
  2. ArgoCD – To continuously sync our desired state from Git to the cluster
  3. Helm – For templating and managing Kafka resources across multiple environments

This approach is cloud-agnostic. Because the Crossplane Kafka provider talks to brokers over the Kafka protocol rather than through cloud-specific APIs, it manages topics, ACLs, and configurations wherever Kafka runs: AWS MSK, Confluent Cloud, GCP, or your own on-premises clusters. The same GitOps workflow applies everywhere; only the connection and authentication details change per platform.
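To make the ArgoCD and Helm pieces concrete, here is a minimal sketch of an ArgoCD Application that renders a Helm chart of Kafka resources from Git and keeps the cluster in sync with it. The application name, repository URL, chart path, values file, and destination namespace are placeholders chosen for illustration, not values from our setup.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: orders-service-kafka          # hypothetical application name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/orders-service.git  # placeholder repository
    targetRevision: main
    path: deploy/kafka                # hypothetical location of the Helm chart in the repo
    helm:
      valueFiles:
        - values-production.yaml      # assumed convention: one values file per environment
  destination:
    server: https://kubernetes.default.svc
    namespace: crossplane-system      # assumed namespace
  syncPolicy:
    automated:
      prune: true                     # remove resources that were deleted from Git
      selfHeal: true                  # revert manual changes made directly in the cluster
```

With automated sync enabled, merging a pull request is the only step a developer takes; ArgoCD applies the rendered resources and Crossplane reconciles them against Kafka.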

Why Crossplane over Terraform?

While Terraform excels at infrastructure provisioning, Crossplane offers key advantages for ongoing resource management:

  • Continuous reconciliation and self-healing: Crossplane runs as a Kubernetes controller, constantly monitoring and auto-correcting drift without manual runs or CI/CD triggers.
  • Native Kubernetes integration: Your Kafka resources become custom resources following patterns your team already knows—no separate tooling or workflows to learn.
  • ArgoCD’s UI and observability: Engineers get real-time visibility into sync status, health checks, and drift detection through a visual interface—no CLI required.

Prerequisites and Setup

Before diving into the resource definitions, here’s what you need to configure:

MSK Cluster Configuration

Your Amazon MSK cluster needs IAM authentication enabled, which lets the Crossplane provider authenticate with an IAM role instead of a separately managed set of credentials. You’ll also need a cluster policy that grants that role access to manage cluster resources.
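As a rough sketch of the permissions involved, the fragment below grants a role the MSK IAM access-control actions needed to connect and manage topics. It is written as a CloudFormation identity policy attached to the role, which is an assumption made for illustration; the same actions would appear in a resource-based cluster policy with a Principal element. The account ID, region, cluster name, role name, and the `<cluster-uuid>` placeholder are all hypothetical.

```yaml
# Hypothetical CloudFormation fragment; every ARN, name, and ID below is a placeholder.
Resources:
  CrossplaneKafkaAccess:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      Roles:
        - crossplane-provider-kafka   # assumed name of the provider's IAM role
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: ConnectToCluster
            Effect: Allow
            Action:
              - kafka-cluster:Connect
              - kafka-cluster:DescribeCluster
            Resource: "arn:aws:kafka:us-east-1:111122223333:cluster/example-cluster/<cluster-uuid>"
          - Sid: ManageTopics
            Effect: Allow
            Action:
              - kafka-cluster:CreateTopic
              - kafka-cluster:DescribeTopic
              - kafka-cluster:AlterTopic
              - kafka-cluster:DeleteTopic
              - kafka-cluster:DescribeTopicDynamicConfiguration
              - kafka-cluster:AlterTopicDynamicConfiguration
            Resource: "arn:aws:kafka:us-east-1:111122223333:topic/example-cluster/<cluster-uuid>/*"
```

Scoping the topic ARN to a name prefix instead of `*` would narrow what the provider can touch, at the cost of updating the policy when new prefixes appear.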

Crossplane Kafka Provider Installation

You’ll need to install and configure the Crossplane Kafka provider in your Kubernetes cluster. The provider uses the dedicated IAM role to authenticate with your MSK cluster, handling all the heavy lifting of interacting with Kafka’s APIs.
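A minimal sketch of the installation might look like the following, assuming the community Kafka provider package and a Secret that holds the connection details. The package path, the version placeholder, and the ProviderConfig and Secret names are assumptions; the Secret's contents and the wiring of the IAM role to the provider pod are not shown.

```yaml
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-kafka
spec:
  # Assumed package path; replace <version> with the release you have tested.
  package: xpkg.upbound.io/crossplane-contrib/provider-kafka:<version>
---
apiVersion: kafka.crossplane.io/v1alpha1
kind: ProviderConfig
metadata:
  name: msk-production              # hypothetical name, referenced by managed resources
spec:
  credentials:
    source: Secret
    secretRef:
      namespace: crossplane-system
      name: kafka-connection        # hypothetical Secret with broker addresses and auth settings
      key: credentials
```

Keeping one ProviderConfig per cluster lets the same resource definitions target different environments by changing a single reference.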

Defining Your Kafka Resources

Now for the good part: a complete Kafka resource definition for an application lives in a single YAML file in your application repository and defines everything your service needs.
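As an illustration of what such a file could contain, here is a sketch of a Helm values file for a hypothetical Kafka resources chart. Every key, topic name, and user name is invented for this example; the actual schema depends on how your chart is written.

```yaml
# values.yaml for a hypothetical "kafka-resources" chart; all keys are illustrative.
providerConfigName: msk-production   # assumed ProviderConfig name

topics:
  - name: orders-created
    partitions: 6
    replicationFactor: 3
    config:
      retention.ms: "604800000"      # 7 days
      cleanup.policy: delete
  - name: orders-dead-letter
    partitions: 3
    replicationFactor: 3
    config:
      retention.ms: "1209600000"     # 14 days

users:
  - name: orders-producer
  - name: orders-consumer
```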

Notice how intuitive this is. You define your topics with their partition count, replication settings, and Kafka-specific configurations like retention periods. You declare the users (principals) your application needs. Everything is readable, everything is versioned.
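For reference, a single topic expressed directly as a Crossplane managed resource might look like the sketch below. It assumes the community Kafka provider's Topic kind; the topic name, sizing, and ProviderConfig name are hypothetical.

```yaml
apiVersion: topic.kafka.crossplane.io/v1alpha1
kind: Topic
metadata:
  name: orders-created              # hypothetical topic name
spec:
  forProvider:
    partitions: 6
    replicationFactor: 3
    config:
      retention.ms: "604800000"     # 7 days, passed through as a Kafka topic config
      cleanup.policy: delete
  providerConfigRef:
    name: msk-production            # assumed ProviderConfig name
```

Because the resource is reconciled continuously, a retention value changed by hand on the broker would be set back to what Git declares.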

Here’s where security and least privilege come into play. Instead of granting blanket permissions, we define explicit ACLs for each principal, so each component gets only the permissions it needs and nothing more.
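A least-privilege setup for one producer and one consumer could be sketched as follows, assuming the community Kafka provider's AccessControlList kind. The principals, topic, consumer group, and ProviderConfig name are hypothetical.

```yaml
# The producer may only write to its topic.
apiVersion: acl.kafka.crossplane.io/v1alpha1
kind: AccessControlList
metadata:
  name: orders-producer-write
spec:
  forProvider:
    resourceName: orders-created
    resourceType: Topic
    resourcePrincipal: "User:orders-producer"
    resourceHost: "*"
    resourceOperation: Write
    resourcePermissionType: Allow
    resourcePatternTypeFilter: Literal
  providerConfigRef:
    name: msk-production            # assumed ProviderConfig name
---
# The consumer may only read from the topic...
apiVersion: acl.kafka.crossplane.io/v1alpha1
kind: AccessControlList
metadata:
  name: orders-consumer-read-topic
spec:
  forProvider:
    resourceName: orders-created
    resourceType: Topic
    resourcePrincipal: "User:orders-consumer"
    resourceHost: "*"
    resourceOperation: Read
    resourcePermissionType: Allow
    resourcePatternTypeFilter: Literal
  providerConfigRef:
    name: msk-production
---
# ...and needs Read on its consumer group to commit offsets.
apiVersion: acl.kafka.crossplane.io/v1alpha1
kind: AccessControlList
metadata:
  name: orders-consumer-read-group
spec:
  forProvider:
    resourceName: orders-consumer-group   # hypothetical consumer group ID
    resourceType: Group
    resourcePrincipal: "User:orders-consumer"
    resourceHost: "*"
    resourceOperation: Read
    resourcePermissionType: Allow
    resourcePatternTypeFilter: Literal
  providerConfigRef:
    name: msk-production
```

One ACL per operation is verbose, but it makes each grant visible as its own line in a pull request diff.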

The GitOps Workflow in Action

One Pattern, Any Platform

Managing Kafka resources through GitOps isn’t just about adopting a trendy practice—it’s about solving real operational pain points. By combining Crossplane’s powerful resource management with familiar Kubernetes-native tooling and GitOps principles, we’ve created a system where Kafka infrastructure is as manageable and auditable as application code. Because Crossplane interacts directly with Kafka’s protocol rather than cloud-specific APIs, this approach works everywhere: AWS MSK, GCP-hosted Kafka, Confluent Cloud, or on-premises clusters. One pattern, any platform.

The YAML definitions are approachable for developers, the security model enforces least privilege, and the Git workflow provides the audit trail and collaboration features your team already knows. Most importantly, your Git repository becomes the definitive source of truth for your Kafka infrastructure, eliminating the drift and confusion that come with manual management.


Alonso Parasxidis

Alonso Parasxidis is an experienced professional specializing in cloud computing, DevOps, and Kubernetes. With a strong background in designing and implementing scalable, cloud-native solutions, he is passionate about empowering teams to adopt modern infrastructure and automation practices.

© 2022 Collectors Holdings, Inc. All rights reserved.