How to - not - DevOps on OpenShift

// 18-12-2019

Intro

Agility, for example, was once the dominant approach to “project management”. Castles and cathedrals were built without any detailed plan and evolved over the lifetime of the stakeholders, i.e. builders, inhabitants and sponsors. There was very little delay between planning and construction. Progress was delivered continuously, something we would nowadays call DevOps.

During industrialisation, however, a mechanistic attitude took root. Work was centrally planned by a management hierarchy and the workflow was compartmentalised into functionally independent units. This might have worked in the traditional manufacturing industry, where knowledge could be centrally managed and production delegated to workers who only had the partial know-how necessary for their individual operations. But in the knowledge industry, this approach breaks down.

In software development, we have the traditional separation between development and operations. The software developer writes the code, compiles it, runs some tests and then passes the binary package to the operator, who deploys the application on the server. More often than not, some fatal errors occur and a game of blame ping-pong unfolds between development and operations over who is responsible for the failure. Is it because of buggy code? Or because of wrong configurations?

In response to the problems caused by the compartmentalisation of work, DevOps was proposed to rejoin development and operations, which had been unjustly decoupled by a management tradition overly eager to impose hierarchies on a tightly integrated workflow.

So how do you do DevOps? The common approach is to fully automate operations. We encapsulate the application in a containerised environment (Docker), isolate its behaviour from the underlying operating system, and control it through configuration files, mostly key-value dictionaries in formats like YAML. Everything becomes code and falls within the domain of development.


In the ideal case, like on OpenShift (https://www.openshift.com/), the developer pushes their changes to the Git repository and within a couple of minutes a new version of the application is rolled out in the test environment (and into production after some checks).

OpenShift is Kubernetes with useful bells and whistles that make it the natural choice as a DevOps platform: it pulls source code from Git, automates the build process, which you can customise through a powerful mechanism called source-to-image (https://github.com/openshift/source-to-image) or with Jenkins pipelines, manages deployment, and integrates with external tools through webhooks.

It is, however, not without pitfalls.

Pitfall: Tooling

DevOps evolves in a dynamic landscape, where many contending technologies like Helm (https://helm.sh/), Ansible (https://www.ansible.com/) and Terraform (https://www.terraform.io/) have surfaced to cater to a fast-growing community. Each of them has its own specialty, its weaknesses and strengths. All of them present a steep learning curve if you want to master them. How and what to choose?

Do you as an organisation use only one technology and adapt it to all tasks, or do you allow the proliferation of tools each of them solving a specific problem? I recommend standardising the tooling, keeping the number of tools low to reduce maintenance costs like upgrades and training.

Consider the preferences of the developers and, ideally, have them decide on the technology stack. The more developers prefer a technology, the more likely they are to nurture their know-how (in their spare time). And obviously, a higher number of committed developers increases the bus factor.

But regardless of what fancy software you intend to employ, there is one tool you cannot dismiss: Bash.

Bash is your build and deployment pipeline from the first moment you set up your project. (Unless you use some tools integrated into your IDE, in which case, … to be filled with some appropriate curse.)

Example: Management of OpenShift resources

Imagine you want to deploy your application to OpenShift and thus set up the appropriate resources like BuildConfigs, DeploymentConfigs, ConfigMaps etc. Being an enthusiastic DevOps engineer, you write a simple Bash script to create and update the configurations from YAML templates.
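
To make that starting point concrete, here is a minimal sketch of what such a script boils down to, written in Python rather than Bash. It only assembles the commands and does not execute anything. The template path and the parameter name APP_NAME are made up for illustration; the sketch assumes the usual `oc process -f … -p KEY=VALUE` and `oc apply -f -` invocations of the OpenShift CLI.

```python
import shlex


def render_and_apply_commands(template_path, params):
    """Build the two oc commands that render a template and apply the result.

    Nothing is executed here; the caller decides how to run the commands.
    """
    process_cmd = ["oc", "process", "-f", template_path]
    # Sort the parameters so the generated command line is reproducible.
    for key in sorted(params):
        process_cmd += ["-p", f"{key}={params[key]}"]
    apply_cmd = ["oc", "apply", "-f", "-"]
    return process_cmd, apply_cmd


def as_shell_pipeline(process_cmd, apply_cmd):
    """Join both commands into the one-line pipeline a Bash script would contain."""
    return f"{shlex.join(process_cmd)} | {shlex.join(apply_cmd)}"


if __name__ == "__main__":
    # Hypothetical template path and parameter, for illustration only.
    process_cmd, apply_cmd = render_and_apply_commands(
        "openshift/deployment.yaml", {"APP_NAME": "my-app"}
    )
    print(as_shell_pipeline(process_cmd, apply_cmd))
```

One template, one parameter, one line of shell: at this stage the approach looks perfectly reasonable.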

Because Jenkins comes prepackaged with OpenShift, you also set up two simple pipelines, one for build and one for deployment. And since you prefer the ability to quickly hack the Groovy scripts in Jenkins’ web UI instead of making commits and pushes to Git, you insert them directly into the template definition instead of pulling the Jenkinsfile from the repository.

Soon, you realise the necessity of a test environment. Unfortunately, being an early adopter of OpenShift v3.5, you can’t parametrise your Jenkins pipeline to manage two different environments. That feature only becomes available in OpenShift v3.6 (https://docs.openshift.com/container-platform/3.6/dev_guide/builds/build_strategies.html#jenkins-pipeline-strategy-environment). Thus you expand your old-fashioned Bash script to insert external properties into your templates. 
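
Inserting external properties into a template is, at its core, plain text substitution. The sketch below shows the idea in Python with the standard library's string.Template. The property names (APP_NAME, ENV, REPLICAS), the KEY=VALUE file format and the template fragment are assumptions for illustration, not the original setup.

```python
from string import Template


def parse_properties(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def render(template_text, props):
    """Fill ${KEY} placeholders; raises KeyError if a property is missing."""
    return Template(template_text).substitute(props)


# Hypothetical template fragment and properties for one environment.
TEMPLATE = "name: ${APP_NAME}-${ENV}\nreplicas: ${REPLICAS}\n"
TEST_PROPS = "# test environment\nAPP_NAME=my-app\nENV=test\nREPLICAS=1\n"

if __name__ == "__main__":
    print(render(TEMPLATE, parse_properties(TEST_PROPS)))
```

Using substitute rather than safe_substitute is deliberate: a missing property fails loudly instead of leaving a stray placeholder in a deployed configuration.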

A couple of months and four additional environments later, you have a setup of 6 separate OpenShift configurations and 8 Jenkins pipelines: one deployment pipeline for each environment and two build pipelines (one for the master branch and one for the release branch). Your initially straightforward Bash script has grown into a behemoth of over 150 lines. Even though v3.6 has become available in the meantime, you have invested so much in your proprietary workflow that you postpone updating the Jenkins pipelines.

Every time you want to make a change to the configurations, you have to apply your script several times to propagate the new state to all environments. And if you are an avid fan of microservices, you replicate the above setup for half a dozen applications.

On the one hand, you become really good with Bash, out of necessity. On the other hand, it’s a living nightmare.

Another critical problem you have not addressed is that the consistency of the configurations is not guaranteed if you make any modification in OpenShift’s web UI without a corresponding update in the templates. To solve this issue, you add a function to your Jenkins pipeline that calls your Bash script and updates the OpenShift resources from Git before the actual build or deployment. But how do you guarantee that your Jenkins pipeline is consistent with the version in the Git repository?

Easy! You do that by writing a new pipeline which runs nightly and calls the Bash script that updates the Jenkins pipelines. So you end up with a Jenkins pipeline that uses a Bash script to update other Jenkins pipelines, which use a Bash script to update OpenShift resources. Now you can pat yourself on the back for creating a monstrosity akin to nested Russian dolls.

You may think of that as a hypothetical absurdity that I conjured up to make some unrealistic point. But I can sadly assure you that the above description is a simplified version of an over-engineered “solution” (to which I contributed).

Pitfall: Maintenance

So you wonder: what would best practice be in the above example?

Simple: two parametrised Jenkins pipelines for each application, sourced from the Git repository, one for the build and one for the deployment. Branch- or environment-specific properties are configured through Jenkins parameters. And replace Bash with functionality provided by Jenkins plugins.
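
The idea of one parametrised definition instead of one pipeline per environment can be sketched as follows. In Jenkins this logic would live in a Jenkinsfile; the Python below only illustrates the principle, and all names in it (DeployParams, ENVIRONMENTS, the project names and replica counts) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeployParams:
    """The parameters a user would choose when triggering the pipeline."""
    app: str
    environment: str
    image_tag: str = "latest"


# Hypothetical environment-specific properties, kept in one place under version control.
ENVIRONMENTS = {
    "test": {"project": "my-app-test", "replicas": 1},
    "prod": {"project": "my-app-prod", "replicas": 3},
}


def plan_deployment(params):
    """Resolve the single generic deployment definition for one environment."""
    if params.environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {params.environment}")
    settings = ENVIRONMENTS[params.environment]
    return {
        "project": settings["project"],
        "image": f"{params.app}:{params.image_tag}",
        "replicas": settings["replicas"],
    }


if __name__ == "__main__":
    print(plan_deployment(DeployParams(app="my-app", environment="test")))
```

Adding a further environment then means adding one entry to the table, not copying another pipeline.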

This obviously runs into new problems:

  • Parametrisation with OpenShift templates only generates text fields (no checkboxes or selection fields). The order of the parameters is random and can be outright confusing if you have more than a dozen configurable properties.
  • Jenkins is a famous plugin hell, which puts the OpenShift integration between a rock and a hard place. If you upgrade the plugins, you risk breaking the synchronisation, e.g. build logs going missing for no apparent reason. If you don’t update, you face endless security warnings (and risks).
  • There are two plugins for OpenShift, confusingly named jenkins-plugin (https://github.com/openshift/jenkins-plugin) and jenkins-client-plugin (https://github.com/openshift/jenkins-client-plugin). If you are an early adopter and implemented everything in the former, which was then the only option, you are required to rewrite everything in the latter, which will become the only option with v3.11.
  • The next generation of OpenShift will soon be available with the production-ready release v4.2, where everything will be slightly or totally different. You have the choice of being stuck on v3.11 forever or dedicating a couple of weeks (or, more realistically, months) to updating your know-how and the system.

My tips:

  • Upgrade regularly. Even if you have to overhaul major portions of your code, it is better to start early than to let the technical debt grow beyond repair.
  • Optimise for adaptability and maintainability, not performance! Continuous delivery also means continuous adjustment of your workflow. The time developers have to invest in modifying the workflow thus greatly outweighs any performance gains.

Pitfall: Process (OpenShift ≠ DevOps)

Regardless of how great your OpenShift architecture is, it alone does not make you DevOps, which is inherently a process, not tooling.

Many organisations try to implement DevOps by creating so-called DevOps teams that manage the infrastructure and the automated workflow, i.e. the build and deployment pipelines. This paradoxical approach often simply results in renaming the operations team without significant changes in the processes, let alone being DevOps.

You still have developers who have no clue about the additional files, e.g. Dockerfile and Jenkinsfile, that pop up in the repository. So if a manual build is required, they often turn to a DevOps engineer to trigger the pipeline, even though they have the privileges to do so themselves. Understandably, developers are not used to messing with operations. Any action that could directly cause the system to grind to a halt is daunting.

If you want to chip away at this psychological obstacle, then make the pipelines fault-tolerant. If developers know that their action causes no harm to the system or is easily reversible, they are more likely to act.
One simple way is to make the default configuration of the pipeline harmless, e.g. no “DROP TABLE”. Thus any accidental click on the “Build” button can be shrugged off.
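
A harmless default configuration could look roughly like this. The sketch is in Python and not tied to any particular pipeline tool. The parameter names (target, dry_run, reset_database) and the step descriptions are invented for the example.

```python
def run_pipeline(target="test", dry_run=True, reset_database=False):
    """Return the steps the pipeline would perform.

    Every default is the harmless choice: the target is the test environment,
    nothing is actually executed (dry run), and the database is left alone.
    Anything destructive has to be switched on explicitly.
    """
    steps = ["build image"]
    if reset_database:
        steps.append("reset database")
    steps.append(f"deploy to {target}")
    if dry_run:
        # Only describe what would happen instead of doing it.
        return [f"(dry run) {step}" for step in steps]
    return steps


if __name__ == "__main__":
    # An accidental click with all defaults changes nothing.
    print(run_pipeline())
    # A real production rollout requires deliberate choices.
    print(run_pipeline(target="prod", dry_run=False))
```

The point is the asymmetry: doing nothing special yields a safe run, while every risky step needs a conscious decision.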

Conclusion

Obviously, OpenShift is not trivial. Get accustomed to it now by playing around with OKD (https://www.okd.io/). Running it in a production environment, however, will require the help of expert hands. If you are considering the DevOps approach and running your applications on OpenShift, then start as early as possible, before old habits become entrenched and the migration becomes prohibitively expensive.

You can also watch my talk at Technologieplauscherl on the topic (https://www.youtube.com/watch?v=cyzxOtycNH0).

// Author

Boyang