Where are all the people going
Round and round till we reach the end
One day leading to another
Get up, go out, do it again
Do It Again, The Kinks
Introduction
If you manage multiple Kubernetes or OpenShift clusters long enough, particularly ephemeral clusters which come and go, you’ve probably experienced that “Do It Again” feeling of monotonously repeating the same tasks over and over again to provision and set up a cluster. This is where GitOps comes into play, helping to automate those tasks in a reliable and consistent fashion.
First off, what is GitOps and why should you care about it? Simply put, GitOps is the process of continuously reconciling the state of a system with the state declared in a Git repository; at the end of the day that’s all it does. But buried in that simple statement, coupled with the declarative nature of Kubernetes, is what enables you to build, deliver, update and manage clusters at scale reliably and effectively, and that’s why you should care.
Essentially, in a GitOps approach the state of our cluster configuration is stored in git, and as changes in git occur a GitOps tool will automatically update the cluster state to match. Just as importantly, if someone changes the state of a cluster directly by modifying or deleting a resource via kubectl/oc or a GUI console, the GitOps tool can automatically bring the cluster back in line with the state declared in git.
This can be thought of as a reconciliation loop where the GitOps tool is constantly ensuring the state of the cluster matches the declared state in git. In organizations where configuration drift is a serious issue, this capability should not be under-estimated in terms of dramatically improving the reliability and consistency of cluster configuration and deployments. It also provides a strong audit trail of changes since every cluster change is represented by a git commit.
The concept of managing the state of a system in Git is not new; developers have been using source control for many years. On the operations side the concept of “Infrastructure as Code” has also existed for many years with middling success and adoption.
What’s different now is Kubernetes, which provides a declarative rather than imperative platform, and the benefit of being able to encapsulate the state of a system and have the system itself be responsible for matching this desired state is enormous. This almost (but not quite completely) eliminates the need for the complex and often brittle imperative scripts or playbooks to manage the state of the system that we often saw when organizations attempted “Infrastructure as Code”.
In a nutshell, Kubernetes provides its own reconciliation loop; it is constantly ensuring the state of the cluster matches the desired declared state. For example, when you deploy an application and change the number of replicas from 2 to 3 you are changing the desired state, and the Kubernetes controller is responsible for making that happen. At the end of the day GitOps is doing the same thing, just taking it one level higher.
This is why GitOps and Kubernetes are such a good fit that GitOps becomes the natural approach.
Tools of the Trade
Now that you have hopefully been sold on the benefits of adopting a GitOps approach to cluster configuration, let’s look at some of the tools that we will be using in this article.
Kustomize. When starting with GitOps many folks begin by storing raw yaml in their git repository. While this works, it quickly leads to a lot of duplication (i.e. copy and paste) of yaml as one needs to tweak and tailor the yaml for specific use cases, environments or clusters. Over time this becomes burdensome to maintain, leading folks to look at alternatives. There are two choices that folks typically gravitate towards: Helm or Kustomize.
Helm is a templating framework that provides package management of applications in a kubernetes cluster. Kustomize on the other hand is not a templating framework but rather a patching framework. Kustomize works by enabling developers to inherit, compose and aggregate yaml and make changes to this yaml using various patching strategies such as merging or JSON patching. Since it is a patching framework, it can feel quite different to those used to more conventional templating frameworks such as Helm, OpenShift Templates or Ansible Jinja.
Kustomize works on the concept of bases and overlays. Bases are essentially, as the name implies, the base raw yaml for a specific piece of functionality. For example, I could have a base to deploy a database into my cluster. Overlays on the other hand inherit from one or more bases and are where bases are patched for specific environments or clusters. So taking the previous example, I could have a database base for deploying MariaDB and an overlay that patches that base for an environment to use a specific password.
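As a hypothetical sketch of this pattern (the file names, paths and patch contents below are illustrative, not taken from a real repo), the MariaDB base and a dev overlay might look like the following:

# base/kustomization.yaml - the raw yaml for deploying MariaDB
resources:
- deployment.yaml
- service.yaml
- secret.yaml

# overlays/dev/kustomization.yaml - inherits the base and patches it for dev
bases:
- ../../base
patchesStrategicMerge:
- mariadb-secret-patch.yaml

# overlays/dev/mariadb-secret-patch.yaml - merge patch supplying the dev password
apiVersion: v1
kind: Secret
metadata:
  name: mariadb
stringData:
  database-password: not-a-real-password

Running kustomize build overlays/dev then renders the base yaml with the patch merged in, without ever copying the base.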
My strong personal preference is to use kustomize for gitops in enterprise teams where the team owns the yaml. One recommendation I would have when using kustomize is to come up with an organizational standard for the folder structure of bases and overlays in order to provide consistency and readability across repos and teams. My personal standard, which we will be using in this article, is located in my standards repository. By no means am I saying this standard is the one true way, however regardless of what standard you put in place, having a standard is critical.
ArgoCD. While kustomize helps you organize and manage your yaml in git repos, we need a tool that can manage the GitOps integration with the cluster and provide the reconciliation loop we are looking for. In this article we will focus on ArgoCD, however there are a plethora of tools in this space including Flux, Advanced Cluster Management (ACM) and more.
I’m using ArgoCD for a few reasons. First, I like it. Second, it will be supported as part of OpenShift as an operator called OpenShift GitOps. For OpenShift customers with larger scale needs I would recommend checking out ACM in conjunction with ArgoCD and the additional capabilities it brings to the table.
ArgoCD
Some key concepts to be aware of with ArgoCD include:
- Applications. ArgoCD uses the concept of an Application to represent an item (git repo + context path) in git that is deployed to the cluster. While the term Application is used, it does not necessarily correspond 1:1 to an application. The deployment of a set of Roles and RoleBindings to the cluster could be an Application, an operator subscription could be an Application, a three tier app could be a single Application, etc. Basically don’t get hung up on the term Application, it’s really just the level of encapsulation.
In short, at the end of the day an Application is really just a reference to a git repository as per the example below:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: config-groups-and-membership
spec:
  destination:
    namespace: argocd
    server: https://kubernetes.default.svc
  project: cluster-config
  source:
    path: manifests/configs/groups-and-membership/overlays/default
    repoURL: https://github.com/gnunn-gitops/cluster-config.git
    targetRevision: master
- Projects. As per the ArgoCD website, “Projects provide a logical grouping of applications” which can be useful when organizing applications deployed into ArgoCD. It is also where you can apply RBAC and restrictions around applications in terms of the namespaces where applications can be deployed, what k8s APIs they can use, etc (a sketch of a project appears after this list). In general I primarily use projects as an organizational tool and prefer the model of deploying separate namespace scoped instances of ArgoCD on a per team level (not per app!) to provide isolation.
- App of Apps. The “App of Apps” pattern refers to using an ArgoCD application to declaratively deploy other ArgoCD applications. Essentially you have an ArgoCD application that points to a git repository with other ArgoCD applications in it. The benefit of this approach is that it enables you to deploy a single application to deliver a wide swath of functionality without having to deploy each application individually. That’s correct, it’s turtles all the way down. Note though that at some point in the future the App of Apps pattern will likely be replaced by ApplicationSets.
- Sync Waves. In Kubernetes there is often a need to handle dependencies, i.e. to deploy one thing before another. In ArgoCD this capability is provided by sync waves, which enable you to annotate an application with the wave number it is part of. This is particularly powerful with the “App of Apps” pattern where we can use it to deploy our applications in a particular order, which we will see when we do the cluster configuration (I’m getting there, I promise!).
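Coming back to the Projects item above, here is a hypothetical sketch of what the cluster-config project used later in this article could look like; the restrictions shown are illustrative rather than the actual definition in the repo:

apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: cluster-config
  namespace: argocd
spec:
  description: Cluster configuration applications
  # only allow applications sourced from the cluster configuration repo
  sourceRepos:
  - https://github.com/gnunn-gitops/cluster-config.git
  # allow deployment to any namespace in the local cluster
  destinations:
  - namespace: '*'
    server: https://kubernetes.default.svc
  # cluster configuration needs to create cluster scoped resources
  clusterResourceWhitelist:
  - group: '*'
    kind: '*'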
Sealed Secrets. When you first start with GitOps the first reaction is typically “Awesome, we are storing all our stuff in git”, shortly followed by “Crap, we are storing all of our stuff in git including secrets”. To use GitOps effectively you need a way to either manage your secrets externally from git or encrypt them in git. There are a huge number of tools available for this; in Red Hat Canada we’ve settled on Sealed Secrets as it provides a straightforward way to encrypt/decrypt secrets in git and is easy to manage for our demos. Having said that, we don’t have a hard and fast recommendation here; if your organization has an existing secret management solution (e.g. HashiCorp Vault) I would highly recommend looking at using that as a first step.
Sealed Secrets runs as a controller in the cluster that will automatically decrypt a SealedSecret CR into a corresponding Secret. Secrets are encrypted with a key pair that is most commonly tied to a specific cluster, i.e. the production cluster would have a different key than the development cluster. Secrets are also tied to a namespace and can only be decrypted in the namespace for which they are intended. Finally, a CLI called kubeseal allows users to quickly create a new SealedSecret for a particular cluster and namespace.
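As a minimal sketch of the kubeseal workflow (the secret name, namespace and value below are hypothetical), a secret can be generated locally and sealed without ever applying the plain version to the cluster:

# create a plain Secret locally with --dry-run (it is never applied to the cluster)
# and pipe it through kubeseal, which encrypts it using the controller's public
# certificate (fetched from the cluster, or supplied via --cert)
oc create secret generic database-credentials \
  --namespace=my-app \
  --from-literal=password=changeme \
  --dry-run=client -o yaml | \
  kubeseal --format yaml > database-credentials-sealedsecret.yaml

The resulting SealedSecret yaml is safe to commit to git; only the controller in the target cluster can decrypt it.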
Bringing it all Together
With the background out of the way, let’s talk about bringing it all together to manage cluster configuration with GitOps. Assuming you have a freshly installed cluster all shiny and gleaming, the first step is to deploy ArgoCD into the cluster. There’s always a bit of a chicken and egg problem here in that you need to get the GitOps tool deployed before you can actually start GitOps’ing. For simplicity we will deploy ArgoCD manually here using kustomize, however a more enterprise solution would be to use something like ACM which can push out Argo to clusters on its own.
The Red Hat Canada GitOps organization has a repo with a standardized deployment and configuration of ArgoCD that we share in our team thanks to the hard work of Andrew Pitt. Our ArgoCD configuration includes resource customizations and exclusions that we have found make sense in our OpenShift environments. These changes help ArgoCD work better with certain resources to determine if an application is in or out of sync.
To deploy ArgoCD to a cluster, you can simply clone the repo and use the included setup.sh script, which deploys the operator followed by the ArgoCD instance in the argocd namespace.
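A rough sketch of that bootstrap, assuming the repo mentioned above has been cloned locally (the placeholder path and the verification step are mine; only the setup.sh script comes from the repo):

# run the included setup script from the cloned repo
cd <path-to-cloned-argocd-repo>
./setup.sh
# verify the operator and the ArgoCD instance came up
oc get pods -n argocd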
Once you have ArgoCD deployed and ready to go you can start creating a cluster configuration repository. My cluster configuration is located in github at https://github.com/gnunn-gitops/cluster-config; my recommendation would be to start from scratch with your own repo rather than forking mine, and slowly build it up to meet your needs. Having said that, let’s walk through how my repo is set up as an example.
The first thing you will notice is the structure with three key folders at the root level: clusters, environments and manifests. I cover these extensively in my standards document but here is a quick recap:
- manifests. A base set of kustomize manifests and yaml for applications, operators, configuration and ArgoCD app/project definitions. Everything is inherited from here.
- environments. Environment specific aggregation and patching is found here. Unlike app environments (prod/test/qa), these are meant to be environments that share the same configuration (production vs non-production, aws versus azure, etc). It aggregates the ArgoCD applications you wish to deploy and is consumed by the next level in the hierarchy, clusters, using an app of apps pattern.
- clusters. Cluster specific configuration; it does not directly aggregate the environments but instead employs an app-of-apps pattern to define one or more applications that point to the environment set of applications. It also includes anything that needs to be directly bootstrapped, e.g. a specific sealed-secrets key. A sketch of the resulting layout follows this list.
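Here is an abbreviated sketch of how these folders hang together in my repo; only the entries discussed in this article are shown, not the full tree:

cluster-config/
├── manifests/
│   ├── argocd/apps/...        # ArgoCD Application bases (sealed-secrets-operator, oauth, etc)
│   └── configs/...            # raw configuration bases referenced by those applications
├── environments/
│   └── overlays/
│       ├── bootstrap/         # items applied directly, outside of GitOps
│       ├── local/             # on-prem environment aggregation
│       └── cloud/             # cloud environment aggregation
└── clusters/
    └── overlays/
        ├── home/              # homelab cluster: patches plus the -manager app
        └── ocplab/            # ephemeral AWS cluster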
The relationship between these folders is shown in the diagram above. The clusters folder can consume kustomize bases/overlays from both environments and manifests while environments can only consume from manifests, never clusters. This organizational rule helps keep things sane and logical.
Let’s look in a bit more detail at how things are organized. If you look at my environments folder you will see three overlays are present: bootstrap, local and cloud. Local and cloud represent my on-prem and cloud based environments, but what’s bootstrap and why does it exist?
Regardless of the cluster you are configuring, there is a need to bootstrap some things directly in the cluster outside of a GitOps context. If you look at the kustomization file you will see there are two items in particular that get bootstrapped directly:
- ArgoCD Project. We need to add an ArgoCD project to act as a logical grouping for our cluster configuration. In my case the project is called cluster-config.
- Sealed Secret Key. I like to provision a known key for decrypting my SealedSecret objects in the cluster so that I have a known state to work from, rather than having Sealed Secrets generate a new key on install. This also makes it possible to restore a cluster from scratch without having to re-encrypt all the secrets in git. Note that the kustomization in bootstrap references a file sealed-secrets-secret.yaml which is not in git; this is the private key and is essentially the keys to the kingdom. I include this file in my .gitignore so it never gets accidentally committed to git.
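For illustration, a sealed-secrets-secret.yaml file is roughly a TLS secret carrying the key pair, labelled as active so the controller picks it up; the namespace and data values below are placeholders, not the real file:

apiVersion: v1
kind: Secret
metadata:
  name: sealed-secrets-key
  namespace: sealed-secrets        # wherever the controller runs (kube-system by default upstream)
  labels:
    sealedsecrets.bitnami.com/sealed-secrets-key: active
type: kubernetes.io/tls
data:
  tls.crt: <base64 encoded certificate>
  tls.key: <base64 encoded private key>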
Next if you examine the local environment kustomize file, notice that it is importing all of the ArgoCD applications that will be included in this environment along with any specific environment patching required.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: argocd
bases:
- ../../../manifests/argocd/apps/sealed-secrets-operator/base
- ../../../manifests/argocd/apps/letsencrypt-certs/base
- ../../../manifests/argocd/apps/storage/base
- ../../../manifests/argocd/apps/alertmanager/base
- ../../../manifests/argocd/apps/prometheus-user-app/base
- ../../../manifests/argocd/apps/console-links/base
- ../../../manifests/argocd/apps/helm-repos/base
- ../../../manifests/argocd/apps/oauth/base
- ../../../manifests/argocd/apps/container-security-operator/base
- ../../../manifests/argocd/apps/compliance-operator/base
- ../../../manifests/argocd/apps/pipelines-operator/base
- ../../../manifests/argocd/apps/web-terminal-operator/base
- ../../../manifests/argocd/apps/groups-and-membership/base
- ../../../manifests/argocd/apps/namespace-configuration-operator/base
patches:
- target:
    group: argoproj.io
    version: v1alpha1
    kind: Application
  path: patch-application.yaml
- target:
    group: argoproj.io
    version: v1alpha1
    kind: Application
    name: config-authentication
  path: patch-authentication-application.yaml
Now if we move up to the clusters folder you will see two folders at the time of this writing, ocplab and home, which are the two clusters I typically manage. The ocplab cluster is an ephemeral cluster that is installed and removed periodically in AWS, while the home cluster is the one sitting in my homelab. Drilling into the clusters/overlays/home folder you will see sub-folders for apps, argocd and configs.
The apps and configs folders mirror the same folders in manifests; these are apps and configs that are specific to a cluster, or ones that need to be patched for a specific cluster. If you look at the argocd folder and drill into the cluster-config/clusters/overlays/home/argocd/apps/kustomization.yaml file you will see the following kustomization:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../../../../../environments/overlays/local
resources:
- ../../../../../manifests/argocd/apps/cost-management-operator/base
patches:
# Patch console links for cluster routes
- target:
    group: argoproj.io
    version: v1alpha1
    kind: Application
    name: config-console-links
  path: patch-console-link-app.yaml
# Patch so compliance scan only runs on masters and doesn't get double-run
- target:
    group: argoproj.io
    version: v1alpha1
    kind: Application
    name: config-compliance-security
  path: patch-compliance-operator-app.yaml
# Patch cost management to use Home source
- target:
    group: argoproj.io
    version: v1alpha1
    kind: Application
    name: config-cost-management
  path: patch-cost-management-operator-app.yaml
Notice this is inheriting the local environment as its base, so it’s pulling in all of the ArgoCD applications from there and applying cluster specific patching as needed. Remember way back when we talked about the App of Apps pattern? Let’s look at that next.
Bringing up the /clusters/overlays/home/argocd/manager/cluster-config-manager-app.yaml file, this is the App of Apps; I typically suffix its name with “-manager” since it manages the other applications. This file appears as follows:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cluster-config-manager
  labels:
    gitops.ownedBy: cluster-config
spec:
  destination:
    namespace: argocd
    server: https://kubernetes.default.svc
  project: cluster-config
  source:
    path: clusters/overlays/home/argocd/apps
    repoURL: https://github.com/gnunn-gitops/cluster-config.git
    targetRevision: master
  syncPolicy:
    automated:
      prune: false
      selfHeal: true
Note that the path is telling ArgoCD to deploy what we looked at earlier, i.e. where all of the cluster applications are defined by referencing the local environment. Thus deploying this manager application pulls in all of the other applications and deploys them as well, so running this single command:
kustomize build clusters/overlays/home/argocd/manager | oc apply -f -
Results in the full set of cluster configuration applications being created in ArgoCD and synced to the cluster.
Now as mentioned, all of the cluster configuration is deployed in a specific order using ArgoCD sync waves. In this repository the following order is used:
Wave | Item
1    | Sealed Secrets
2    | Lets Encrypt for wildcard routes
3    | Storage (iscsi storageclass and PVs)
11   | Cluster Configuration (Authentication, AlertManager, etc)
21   | Operators (Pipelines, CSO, Compliance, Namespace Operator, etc)
You can see these waves defined as annotations in the various ArgoCD applications; for example, the sealed-secrets application has the following:
annotations:
  argocd.argoproj.io/sync-wave: "1"
Conclusion
Well, that brings this entry to a close. GitOps is a game changing way to manage your clusters and deploy applications. While there is some work and learning involved in getting everything set up, once you do you’ll never want to go back to manual processes again.
If you are making changes in a GUI console you are doing it wrong
Me
Acknowledgements
I want to thank my cohort in GitOps, Andrew Pitt. A lot of the stuff I talked about here comes from Andrew; he did all the initial work with ArgoCD in our group and was responsible for evangelizing it. I started with Kustomize, Andrew started with ArgoCD and we ended up meeting in the middle. Perfect team!