Just watch the first 2.5 minutes of the following video on YouTube and see if you can relate to the story Yevgeniy Brikman from Gruntwork presents.
For me, it felt like he was telling my story. My daily work revolves around automation and CI/CD. Because the IT landscape is ever-changing, we advise our clients and implement incremental improvements to their infrastructure and code base. Just recently I worked with one of our clients on their journey towards better CI/CD, and I would love to share that story from my CI/CD perspective with you.
Let’s start with the infrastructure we had at the start of our journey:
- a few VMs to deploy our application to
In addition to my role in the area of CI/CD, we develop part of the Java EE application for our client. Since the code base is covered by our software developers, we created a Maven project to generate our artifact, a WAR file. Next, we implemented a first iteration of our GitLab CI pipeline to build the artifacts and upload them to our artifact repository, in our case a Nexus instance. Every now and then we downloaded a stable version of the application and deployed it to our VMs running the JBoss middleware. As manual deployments tend to be error-prone and take up time that could be used more efficiently, we created a new iteration of our deployment process and automated all manual steps with Ansible. With faster deployment times and less potential for mishaps, we had time to move forward with the client’s strategy: containers!
In another iteration of our CI pipeline we added the build of ready-to-deploy container images with the application already baked in. The Ansible playbook was updated to manage Docker containers and everything was fine, right? No. After a while we noticed that plain Docker is not that fun to manage and decided to introduce a container orchestrator. Together with the Ops Team we installed a Rancher Kubernetes cluster. Once again we modified our pipeline to handle the task at hand: this time we added a step to create Helm charts, which hold our infrastructure code, and upload them to ChartMuseum running on our Rancher cluster. Sounds easy, right? Well… let’s say there were a few bumps in the road.
Our GitLab CI pipeline consists of 5 stages with 7 jobs. Each job belongs to a single stage. All jobs in a single stage run in parallel. Each job uses its own container image. When a job is executed, GitLab starts a container based on the defined image and injects the source code into it. Different jobs use different images to fit the requirements at hand – e.g. which tools we need. One of the benefits of this approach is that the Ops Team does not need to install or maintain the tools for us like they did before when we deployed to VMs.
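The execution model described above (stages run one after another, jobs inside a stage run in parallel) can be sketched in a few lines of Python. This is only an illustration of the ordering, not how GitLab is implemented. The stage and job names and their grouping are hypothetical; the only facts taken from our pipeline are the totals of 5 stages and 7 jobs.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stage and job names; only the totals (5 stages, 7 jobs)
# come from the pipeline described in this post.
PIPELINE = {
    "build": ["build-base-image", "build-war"],
    "store-for-test": ["build-app-image"],
    "pre-integration-test": ["deploy-to-test"],
    "integration-test": ["run-integration-tests"],
    "store-for-development": ["promote-image", "upload-chart"],
}


def run_job(stage, job):
    # Stand-in for "start a container from the job's image and run its script".
    return f"{stage}/{job}"


def run_pipeline(pipeline):
    """Run stages in order; jobs inside one stage run concurrently."""
    log = []
    for stage, jobs in pipeline.items():
        with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
            # Leaving the "with" block waits for every job of this stage,
            # so the next stage only starts once this one is complete.
            results = list(pool.map(lambda job: run_job(stage, job), jobs))
        log.append(results)
    return log


if __name__ == "__main__":
    for stage_results in run_pipeline(PIPELINE):
        print(stage_results)
```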
To save some time, and a lot of disk space, we created a base image containing JBoss and some other binaries that we need for all our applications. The job that builds this base image only runs if the Dockerfile for the image is modified in Git. To build our images we use Buildah bud (build-using-dockerfile). The credentials we use to upload our images to Nexus are stored in GitLab, and only privileged people can view them. This reduces access to the credentials while still allowing a larger number of employees to improve our images. Both IT Security and Application Development benefit: IT Security can breathe a little easier knowing that the secret is only used by the pipeline, with little to no access from employees, while employees working with the base image can easily provide patches to it.
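The "only run when the Dockerfile changed" condition boils down to a check against the list of files touched by a commit. GitLab CI offers a `changes` keyword for this kind of condition; the Python sketch below only illustrates the decision itself. The path `base-image/Dockerfile` is a hypothetical location, not the one from our repository.

```python
from fnmatch import fnmatch


def should_build_base_image(changed_files, patterns=("base-image/Dockerfile",)):
    """Return True if any changed file matches one of the watched patterns.

    The default pattern is a hypothetical path used for illustration.
    """
    return any(
        fnmatch(path, pattern)
        for path in changed_files
        for pattern in patterns
    )


if __name__ == "__main__":
    # In a real setup the list could come from `git diff --name-only`.
    print(should_build_base_image(["base-image/Dockerfile", "README.md"]))
    print(should_build_base_image(["src/main/java/App.java"]))
```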
We create our WAR file with Maven and run the unit tests. However, we do not upload the artifacts to Nexus. Instead we save them in the GitLab job itself. GitLab has a neat little feature that allows subsequent jobs to use the artifacts generated by a previous job.
Since our GitLab runners run inside our Kubernetes cluster, we had some difficulty caching the Maven dependencies between pipeline runs. By default Maven downloads all dependencies from the internet every time we do a build, since each build runs in a new container. The easiest way to mitigate the issue is to use our existing Nexus as a proxy and cache, which reduces build time.
Stage Store for Test
We create our images with Buildah. We start from the previously created base image and copy our previously built WAR file into the new image. Since we need to configure JBoss, we use jboss-cli to execute a CLI file, which starts an embedded server and adds the configuration. The image then gets uploaded to Nexus with the tag “to-test”.
Stage Pre Integration Test
Because we have integration tests and want to see whether everything works as expected in Kubernetes, we deploy the image using Helm. All required Kubernetes resources get created and the image tagged “to-test” is deployed. We use JaCoCo to measure code coverage, so we start JBoss with the JaCoCo Java agent enabled. To dump the report we need to open a random NodePort. In addition, our integration tests need a RabbitMQ installation, which is also installed via Helm charts.
Stage Integration Test
The integration tests are executed with Maven. Once all tests have run, we dump the coverage report and trigger a SonarQube analysis.
Stage Store for Development
Tagging is a great feature of container technology. With a tool called Skopeo we can copy an image from one container registry to another. We can also use it to add a tag to an image in a registry without transferring the image over the network first.
In this section of our pipeline we benefit from those features: we can promote an image and add two tags without putting additional traffic on our network. The first tag we add is “develop-latest”; the second tag is the hash of the Git commit. Why do we add the commit hash, you may ask? To release our application, we merge the develop branch into our master branch. Now the latest commit in master has a reference to an image in the registry, and we can just add another tag to that image, such as latest or a version number, for better readability.
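To make the tagging scheme concrete, here is a small Python sketch that assembles the two `skopeo copy` invocations for one promotion. It is a sketch under assumptions rather than our actual pipeline script: the registry host, image name and commit hash are hypothetical, and credential options are left out.

```python
import shlex


def promotion_commands(registry, image, commit_sha, source_tag="to-test"):
    """Build the skopeo commands that add the two promotion tags.

    Source and destination live in the same registry, so each command
    only adds a tag to the already tested image.
    """
    source = f"docker://{registry}/{image}:{source_tag}"
    target_tags = ["develop-latest", commit_sha]
    return [
        ["skopeo", "copy", source, f"docker://{registry}/{image}:{tag}"]
        for tag in target_tags
    ]


if __name__ == "__main__":
    # Hypothetical registry, image name and commit hash.
    for command in promotion_commands("nexus.example.com", "my-app", "3f2a9c1"):
        print(shlex.join(command))
```

Returning the commands as argument lists keeps the sketch testable; a pipeline job could pass each list to `subprocess.run` or simply call Skopeo directly from its script.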
As previously mentioned, we use ChartMuseum to store our Helm charts. Before we upload the Helm chart to ChartMuseum, the image tag is set as the appVersion in the Helm chart. At installation time Kubernetes knows which image and which tag to pull because we reference it in the Helm chart (Chart.yaml -> appVersion).
Let’s see how the different tools play together:
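Setting the appVersion before the upload is a one-line change in Chart.yaml. The Python sketch below shows one way to do it with a plain text substitution. It is an illustration rather than the step from our pipeline, and the chart content and tag in the usage example are hypothetical. Helm’s `helm package` command also has an `--app-version` option that serves the same purpose.

```python
import re

APP_VERSION_LINE = re.compile(r"^appVersion:.*$", flags=re.MULTILINE)


def set_app_version(chart_yaml, image_tag):
    """Return the Chart.yaml text with the top-level appVersion set to image_tag."""
    new_line = f'appVersion: "{image_tag}"'
    if APP_VERSION_LINE.search(chart_yaml):
        # A function is used as replacement so the tag is inserted literally.
        return APP_VERSION_LINE.sub(lambda _match: new_line, chart_yaml, count=1)
    # No appVersion yet: append it as a new top-level key.
    return chart_yaml.rstrip("\n") + "\n" + new_line + "\n"


if __name__ == "__main__":
    # Hypothetical chart metadata and image tag.
    chart = 'apiVersion: v1\nname: my-app\nversion: 0.1.0\nappVersion: "0.0.0"\n'
    print(set_app_version(chart, "3f2a9c1"))
```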
Deployment in Test or Production
In Rancher you can add a catalog (a Helm chart repository) and deploy the applications inside that repository. We add ChartMuseum as a catalog in Rancher, and Rancher automatically starts synchronizing the repository to get the newest charts. The first time our chart is installed there is a bit more work to do because we have to define all the values. Rancher periodically synchronizes with ChartMuseum and tells us if an upgrade is available. In the simplest and most common scenario, we can upgrade to the new app version with three clicks.
- Build only once
  - Use the same binaries from test to production
- Install and test in a production-like environment
  - The same tools are used to install the application from CI to production
  - We automatically test the application in Kubernetes
- Only tested images can be deployed in production
Here be dragons
After the tested images and the Helm charts are uploaded, we manually deploy the application using Rancher Apps, which is fine for now. Personally, I would prefer a GitOps approach: if we wanted to change something in the test or production environment, we would commit the change to a Git repository, and that commit would automatically trigger a deployment to the environment. Tools like ArgoCD or FluxCD could help us implement GitOps. But what about secrets? Do you want to store secrets in Git?! To address this issue there are even more tools, like Sealed Secrets or Vault. All those tools are fun to play with, but it can get overwhelming quite fast.
What I had to learn, and still need to practice, is that you first need a deep understanding of your current processes and requirements. Only after that should you choose your tools. And the most important advice: keep it as simple as possible. 😀