Leveraging GitOps principles with ArgoCD to boost team efficiency

by Đorđe Mijailović 03/10/2022

We build it – we run it! Easier said than done, especially if you are in a very small team designing, developing and managing multiple applications in several Kubernetes clusters. With only five people covering data pipelines, backends, frontends, testing, infrastructure and every other aspect of a software product, we had to organise and automate as much as possible to save time. In this post, I will share our approach to managing all those apps in several Kubernetes clusters and across multiple environments, using GitOps principles with ArgoCD as the tool.

This article is not a tutorial on how to install or configure ArgoCD. This is just me sharing thoughts about this tool after using it in a real-world project for a year.

What problems have we faced?

In our project, applications have a lot of dependencies. For example, there are several Spring Boot services, several Nginx instances serving frontends, a few instances of Keycloak, RabbitMQ, Apache Airflow with about 20 data processing pipelines, Traefik ingress controllers, cert-manager, multiple databases, Vault, a logging and monitoring stack, complex network policies, etc.

Besides the regular work on features of our own apps, all of the dependencies listed above needed frequent updates and reconfiguration. Initially, we did all deployments manually with “helm install/upgrade”. It was time consuming and sometimes even exhausting to run all those commands and scripts and to keep track of what was changed and where. Later we started using custom Jenkins pipelines to deploy Helm charts imperatively. It helped, but we still caught ourselves asking the following questions more often than we wanted:

  1. OK, we have infrastructure as code in the form of charts and value files stored in Git, but are those really applied to clusters?
  2. What is the current version of application X and chart Y in environment Z?
  3. Do we really want to write Jenkins files to imperatively push changes for all applications as an alternative to manual helming?
  4. We made a mistake. How to easily undo changes and make sure that we undid all of them?
  5. Who did that change manually and why?
  6. I need someone to review my deployments before applying them to production. How to achieve this?
  7. We need to migrate to a new cluster. OMG, just do not make me do 40+ helm installs with all those vars and secrets! I’ll quit!
  8. We have a new team member. How to explain all those Jenkins files or manual deployment procedures? It takes ages!
  9. I’m experimenting with deployment of a new app (e.g. setting up cert-manager with Let’s Encrypt and Traefik: stay tuned for that post!). How to track different combinations of params that I have tried?
  10. How to audit cluster changes in easily reviewable form?
  11. How to easily spawn and clean up entire environments for testing or preview purposes?
  12. The list goes on and on.

What is GitOps?

My definition: GitOps, for my team, is the solution for all problems listed above!

If we go for a longer, more official definition, you might say that GitOps is all about declaring infrastructure in a git repository using all available git mechanisms (e.g. branching, pull requests, reviews) and not worrying about how those changes will be applied to your cluster.

Git is the source of truth! A Kubernetes operator (e.g. ArgoCD) applies the declared state from Git to the clusters automatically.

The usual (slightly simplified) flow of infrastructure changes in our project looks like this: developers interact only with Git, and from that point on ArgoCD takes care of deploying the declared state to a cluster.

Example, please!

To deploy an application to a cluster using ArgoCD, you just define a simple Application manifest (YAML) with source and destination sections. The source defines what is being deployed. In our example, it’s the Vault Helm chart with some parameters, stored in a git repository in a vault folder (the path). The destination defines where the Kubernetes resources will be applied: which cluster and which namespace. There are also sync options that set the level of automation.
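As a rough illustration of those three sections, here is a minimal sketch in Python that builds such an Application manifest as a plain dictionary and prints it as JSON, which Kubernetes accepts just like YAML. The repository URL, the Helm parameter, the "argocd" namespace and the helper name build_application are placeholders invented for this sketch, not values from our repository; the field names follow ArgoCD's Application resource.

```python
import json


def build_application(name, repo_url, path, namespace,
                      server="https://kubernetes.default.svc",
                      revision="master", helm_parameters=None):
    """Return an ArgoCD Application manifest as a plain dict (illustrative helper)."""
    # Source: WHAT is deployed (repository, revision and folder holding the chart).
    source = {"repoURL": repo_url, "targetRevision": revision, "path": path}
    if helm_parameters:
        # Helm parameters are passed as a list of name/value string pairs.
        source["helm"] = {
            "parameters": [
                {"name": key, "value": str(value)}
                for key, value in helm_parameters.items()
            ]
        }
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumption: ArgoCD itself runs in a namespace called "argocd".
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": source,
            # Destination: WHERE the resources are applied (cluster and namespace).
            "destination": {"server": server, "namespace": namespace},
            # Sync options: the level of automation.
            "syncPolicy": {
                "automated": {"prune": True, "selfHeal": True},
                "syncOptions": ["CreateNamespace=true"],
            },
        },
    }


vault_app = build_application(
    name="vault",
    repo_url="https://git.example.com/platform/infrastructure.git",  # placeholder URL
    path="vault",
    namespace="vault",
    helm_parameters={"server.ha.enabled": "true"},  # illustrative parameter only
)
print(json.dumps(vault_app, indent=2))
```

Leaving out the automated block would keep syncing as a manual step, which can be a gentler way to start with a production cluster.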

After this is committed to git and merged to master (possibly via a pull request reviewed by your colleagues), a webhook tells ArgoCD that the repository has changed. ArgoCD then automatically fetches the repository and makes sure that the desired state is deployed to the cluster. The application magically appears in the apps list in the UI, without a page refresh.

Then, when you click on the application card, you can see the tree of deployed resources and inspect details (e.g. events or logs from pods).

If everything is OK and the applications are ready, a pleasant green heart icon will let you know. If something goes wrong, you can easily roll back to the previous state by reverting the git commit and waiting a few seconds for Argo to synchronise the cluster.

The history of changes can be reviewed in the git log, and the current state of the cluster will always be in sync with the git repository.

So, the Vault application is deployed without a single “kubectl” or “helm” command being executed. No need to distribute sensitive cluster credentials, no need to install a CLI, no need to struggle with VPNs if your cluster sits somewhere on an intranet. To delete the app, just delete its manifest from Git and it will be removed fully.

Let’s answer the questions

Now that this simple example has shown how the GitOps idea works in the ArgoCD world, let’s try to answer our own questions from the beginning of this article.

  1. OK, we have infrastructure as code in the form of charts and value files stored in Git, but are those really applied to clusters?

They are, by design: ArgoCD keeps comparing the cluster with Git and flags anything that is out of sync. If there is an error, you will know about it! ArgoCD provides a useful Grafana dashboard on top of which alerts can be built. If you do not have Grafana, ArgoCD also has its own notification solution.

  2. What is the current version of application X and chart Y in environment Z?

It’s in the Application manifest file in Git. Just check the repo.

  3. Do we really want to write Jenkins files to imperatively push changes for all applications as an alternative to manual helming?

No more imperative approach. Just declare what you want. ArgoCD knows how to make it happen.

  4. We made a mistake. How to easily undo changes and make sure that we undid all of them?

Just revert the git commit and wait for a few seconds.

  5. Who did that change manually and why?

Manual changes to apps are (optionally) healed by Argo, so the cluster stays in sync with the repo even if someone changes things manually using kubectl.

  6. I need someone to review my deployments before applying them to production. How to achieve this?

This is achieved simply by creating pull requests in a repository where ArgoCD apps are defined. So, again, everything is just plain old Git.

  7. We need to migrate to a new cluster. OMG, just do not make me do 40+ helm installs with all those vars and secrets! I’ll quit!

It’s simple. Just define a new “destination” in your app manifests and it’s done! Migrating application data, if needed, is out of scope for ArgoCD, but ArgoCD hooks may be worth exploring for that.
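To make the destination change concrete, here is a small sketch in Python that takes an Application manifest (as a dictionary) and returns a copy pointing at another cluster. Both cluster URLs, the repository URL and the helper name retarget are made up for illustration, and the sketch assumes the new cluster has already been registered with ArgoCD.

```python
import copy


def retarget(app, new_server, new_namespace=None):
    """Return a copy of an Application manifest that points at another cluster."""
    moved = copy.deepcopy(app)  # leave the original manifest untouched
    destination = moved["spec"]["destination"]
    destination["server"] = new_server
    if new_namespace is not None:
        destination["namespace"] = new_namespace
    return moved


old_app = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "Application",
    "metadata": {"name": "vault", "namespace": "argocd"},
    "spec": {
        "project": "default",
        "source": {
            "repoURL": "https://git.example.com/platform/infrastructure.git",  # placeholder
            "targetRevision": "master",
            "path": "vault",
        },
        "destination": {
            "server": "https://old-cluster.example.com:6443",  # placeholder
            "namespace": "vault",
        },
    },
}

new_app = retarget(old_app, "https://new-cluster.example.com:6443")
print(new_app["spec"]["destination"])
```

Only the destination changes; the source section, and with it the chart and all its parameters, is carried over as it is.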

  8. We have a new team member. How to explain all those Jenkins files or manual deployment procedures? It takes ages!

Not anymore! New team members can get a very good overview of what apps are in a cluster and how they are configured just by checking git or the ArgoCD UI. To deploy a new app, they just commit a new application manifest. No need to adopt a custom solution such as hand-written pipelines or cumbersome manual deployments to the cluster.

  9. I’m experimenting with deployment of a new app. How to track different combinations of params that I have tried?
  10. How to audit cluster changes in easily reviewable form?

The answer to both questions is: git log!

  11. How to easily spawn and clean up entire environments for testing or preview purposes?

In our project, we have several permanent environments (dev, test, prod). They are mirror images of each other and reside in different folders in git. Spawning another environment is a matter of duplicating the dev environment’s folder and adjusting the namespaces in the app manifests. This can also be automated. Deleting such an environment is done by simply deleting its folder in git.
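The automation mentioned above could look roughly like the following Python sketch. It is not our actual tooling. It assumes, purely for illustration, that each environment folder holds Application manifests as JSON files and that names and namespaces end in the environment name (e.g. vault-dev). Besides the namespace, it also renames each Application, since two Application objects in the same ArgoCD namespace cannot share a name.

```python
import json
import shutil
import tempfile
from pathlib import Path


def swap_env(value, old_env, new_env):
    """Replace a trailing "-<old_env>" with "-<new_env>", or append the new suffix."""
    suffix = "-" + old_env
    base = value[: -len(suffix)] if value.endswith(suffix) else value
    return base + "-" + new_env


def clone_environment(repo_root, source_env, new_env):
    """Copy an environment folder and retarget every Application manifest in it."""
    target = Path(repo_root) / new_env
    shutil.copytree(Path(repo_root) / source_env, target)  # raises if target exists
    for manifest_path in sorted(target.rglob("*.json")):
        app = json.loads(manifest_path.read_text())
        if not isinstance(app, dict) or app.get("kind") != "Application":
            continue  # leave any other file exactly as copied
        app["metadata"]["name"] = swap_env(app["metadata"]["name"], source_env, new_env)
        destination = app["spec"]["destination"]
        destination["namespace"] = swap_env(destination["namespace"], source_env, new_env)
        manifest_path.write_text(json.dumps(app, indent=2) + "\n")
    return target


# Demo on a throwaway folder that stands in for the git working copy.
with tempfile.TemporaryDirectory() as repo_root:
    dev_dir = Path(repo_root) / "dev"
    dev_dir.mkdir()
    (dev_dir / "vault.json").write_text(json.dumps({
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": "vault-dev", "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": "https://git.example.com/platform/infrastructure.git",  # placeholder
                "targetRevision": "master",
                "path": "vault",
            },
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": "vault-dev",
            },
        },
    }))
    preview_dir = clone_environment(repo_root, "dev", "preview")
    preview_app = json.loads((preview_dir / "vault.json").read_text())

print(preview_app["metadata"]["name"], preview_app["spec"]["destination"]["namespace"])
```

Environment-specific value files or paths would need the same treatment, and the new folder still has to be committed to git before ArgoCD picks it up.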

The following features are, in my opinion, what also make this tool so great and worth investigating:

  • The UI is clean and minimalistic. It’s very easy to follow changes as they happen automatically in the cluster when the state in git changes
  • Enables a fast and easy overview of Kubernetes events and application logs
  • Supports static yaml files, Helm, Kustomize, Jsonnet, Ksonnet
  • Supports the App-of-Apps pattern, so it’s possible to manually define only one ArgoCD Application, which then deploys all other applications defined in some folder in the git repository. This makes deployments even more automated, so I would definitely recommend this approach to anybody using ArgoCD
  • Gives you a great overview of cluster state with intuitive filtering options
  • Displays a familiar diff view between the state in git and the state in the cluster
  • Heals any changes made externally (e.g. using kubectl)
  • Does not store state in an external DB or in volumes. State is entirely defined in Kubernetes objects stored in etcd
  • Easily integrates with OIDC providers (Keycloak among others), including Keycloak roles that can be mapped to ArgoCD roles (admin and read only)
  • Has an alerting mechanism and very usable Prometheus metrics
  • Can manage more than one cluster at the same time
  • Has a concept of projects to define strict rules about what can be deployed to which cluster
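The App-of-Apps pattern from the list above can be sketched as a single “root” Application whose source is a folder of other Application manifests. The Python below builds such a manifest as a dictionary. The repository URL, the folder name apps/dev, the name root and the "argocd" namespace are assumptions made for this sketch.

```python
import json


def build_root_application(repo_url, apps_path, revision="master"):
    """Return a root Application that deploys every Application manifest under apps_path."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": "root", "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "targetRevision": revision,
                "path": apps_path,
                # Also pick up manifests in sub-folders of apps_path.
                "directory": {"recurse": True},
            },
            # The child Application objects are created in ArgoCD's own namespace
            # (assumed to be "argocd") on the cluster where ArgoCD runs.
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": "argocd",
            },
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


root_app = build_root_application(
    repo_url="https://git.example.com/platform/infrastructure.git",  # placeholder URL
    apps_path="apps/dev",  # hypothetical folder holding the child manifests
)
print(json.dumps(root_app, indent=2))
```

With pruning enabled, removing a child manifest from that folder also removes the corresponding application, so the root Application is the only thing that ever needs to be created by hand.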

ArgoCD and Helm

As mentioned, Helm was used in our project before ArgoCD. When we started using ArgoCD, we wanted to keep using Helm charts so as not to change too much. Instead of deploying charts manually with helm install or helm upgrade commands, we used ArgoCD’s ability to generate static YAML manifests with Helm, treating Helm purely as a template engine, before deploying them to the cluster. ArgoCD even has its own hooks and dynamically turns Helm hooks into ArgoCD hooks, which works great, so the charts did not need any modification to be run by ArgoCD. We also plan to give Jsonnet a chance as an alternative to Helm to simplify things further.

What we didn’t like about ArgoCD

There wasn’t much that we didn’t like, but a few things are worth pointing out.

Recently a major security issue was discovered. Fortunately, it did not have any impact on us since we only run trusted charts, but it’s always good to consider potential security issues when using an application such as ArgoCD, which can access sensitive information in your repository and cluster.

Under certain conditions, ArgoCD might issue many calls to the Kubernetes API. This was optimised in recent versions, but it’s still something to bear in mind.

There are several ways of managing secrets in ArgoCD. For our use case we chose HashiCorp Vault with argocd-vault-plugin. In the ArgoCD and Vault plugin versions that we are using, passwords appear to be cached: if a password is changed in Vault, ArgoCD does not detect the change unless a fresh pull is forced by changing something else in the application manifest (e.g. a name, a label or any other value). Even a missing password is cached, so the error persists if you add the password to Vault later. This might be fixed in later versions, but we have yet to try them.

Conclusion

After using it every day for almost a year, I cannot imagine going back to the manual process of deploying apps to Kubernetes clusters. The amount of time and energy saved is precious to us! It rarely happens that a single tool solves so many problems at the same time, and does it through ubiquitous tools like Git, which means that the learning curve is very gentle. It was adopted very quickly by all team members. We came to trust ArgoCD very fast, and now we do not even need to watch the UI when applying changes. We just commit the desired state to the Git repo and forget about it! If there is any problem, Argo will let us know via a Slack notification. I strongly recommend ArgoCD to any team dealing with problems similar to ours. For us, it worked miracles. Maybe it can solve some of your problems too.
