Building a Simple Up/Down Status Dashboard for OpenShift

OpenShift provides a wealth of monitoring and alerting capabilities, however sometimes it can be handy to surface a simple up/down signal for an OpenShift cluster that can be easily interpreted by tools like UptimeRobot. This enables you to provide an operational or business-level dashboard of the status of your cluster to users and application owners who may not necessarily be familiar with all of the nuances of OpenShift’s or Kubernetes’ internals.

UptimeRobot Dashboard

The health status of an OpenShift cluster depends on many things, such as etcd, operators, nodes, the API, etc, so how do we aggregate all of this information? While you could certainly run your own code to do it, fortunately a cool utility called Cerberus already provides this capability. Cerberus was born out of Red Hat’s Performance and Scaling group and was designed to be used with Kraken, a chaos engineering tool. A chaos engineering tool isn’t very useful if you can’t determine the status of the cluster, and thus Cerberus was born.

A number of blog posts have already been written about Kraken and Cerberus from a chaos engineering point of view, which you can view here and here. Here we are going to focus on the basics of using it for simple health checking.

One thing to note about Cerberus is that it is aggressive about returning an unhealthy state even if the cluster is operational. For example, if you set it to monitor a namespace, any pod failure in a multi-pod deployment in that namespace will trigger an unhealthy flag even if the other pods in the deployment are running and still servicing requests. As a result, some tuning of Cerberus or the use of custom checks is required if you want a more SLA-focused view.

To get started with Cerberus simply clone the git repo into an appropriate directory on the system where you want to run it. While you can run it inside of OpenShift, it is highly recommended to run it outside the cluster since your cluster monitoring tool should not be dependent on the cluster itself. To clone the repo, just run the following:

git clone https://github.com/cloud-bulldozer/cerberus

In order for Cerberus to run, it requires access to a kubeconfig file where a user has already been authenticated. For security purposes I would highly recommend using a serviceaccount with the cluster-reader role rather than a user with cluster-admin. The commands below will create a serviceaccount in the openshift-monitoring namespace, bind it to the cluster-reader role and generate a kubeconfig that Cerberus can use to authenticate to the cluster.

oc create sa cerberus -n openshift-monitoring
oc adm policy add-cluster-role-to-user cluster-reader -z cerberus -n openshift-monitoring
oc serviceaccounts create-kubeconfig cerberus -n openshift-monitoring > config/kubeconfig

Cerberus can automatically create a token for the prometheus-k8s service account so that it can access Prometheus to pull metrics. To enable this we need to define a role that gives Cerberus the necessary permissions and bind it to the cerberus service account. Create a file with the following content:

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cerberus
  namespace: openshift-monitoring
rules:
  - apiGroups:
      - ""
    resources:
      - serviceaccounts
      - secrets
    verbs:
      - get
      - list
      - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cerberus-service-account-token
  namespace: openshift-monitoring  
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cerberus
subjects:
  - kind: ServiceAccount
    name: cerberus
    namespace: openshift-monitoring

Then apply it to the cluster with oc apply.
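
For example, assuming you saved the content above as cerberus-prometheus-rbac.yaml (the file name is arbitrary):

oc apply -f cerberus-prometheus-rbac.yaml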

To configure Cerberus you can edit the existing config.yaml file in the repo or create a new one. Creating a new one is highly recommended so that a git pull doesn’t clobber your changes:

cp config/config.yaml config/my-config.yaml

Once you have the config file, you can go through the options and set what you need. Here is an example of my config file which is really just the example config with the kubeconfig parameter tweaked.

cerberus:
    distribution: openshift                              # Distribution can be kubernetes or openshift
    kubeconfig_path: /opt/cerberus/config/kubeconfig     # Path to kubeconfig
    port: 8080                                           # http server port where cerberus status is published
    watch_nodes: True                                    # Set to True for the cerberus to monitor the cluster nodes
    watch_cluster_operators: True                        # Set to True for cerberus to monitor cluster operators
    watch_url_routes:                                    # Route url's you want to monitor, this is a double array with the url and optional authorization parameter
    watch_namespaces:                                    # List of namespaces to be monitored
        -    openshift-etcd
        -    openshift-apiserver
        -    openshift-kube-apiserver
        -    openshift-monitoring
        -    openshift-kube-controller-manager
        -    openshift-machine-api
        -    openshift-kube-scheduler
        -    openshift-ingress
        -    openshift-sdn                               # When enabled, it will check for the cluster sdn and monitor that namespace
    cerberus_publish_status: True                        # When enabled, cerberus starts a light weight http server and publishes the status
    inspect_components: False                            # Enable it only when OpenShift client is supported to run
                                                         # When enabled, cerberus collects logs, events and metrics of failed components

    prometheus_url:                                      # The prometheus url/route is automatically obtained in case of OpenShift, please set it when the distribution is Kubernetes.
    prometheus_bearer_token:                             # The bearer token is automatically obtained in case of OpenShift, please set it when the distribution is Kubernetes. This is needed to authenticate with prometheus.
                                                         # This enables Cerberus to query prometheus and alert on observing high Kube API Server latencies. 

    slack_integration: False                             # When enabled, cerberus reports the failed iterations in the slack channel
                                                         # The following env vars needs to be set: SLACK_API_TOKEN ( Bot User OAuth Access Token ) and SLACK_CHANNEL ( channel to send notifications in case of failures )
                                                         # When slack_integration is enabled, a watcher can be assigned for each day. The watcher of the day is tagged while reporting failures in the slack channel. Values are slack member ID's.
    watcher_slack_ID:                                        # (NOTE: Defining the watcher id's is optional and when the watcher slack id's are not defined, the slack_team_alias tag is used if it is set else no tag is used while reporting failures in the slack channel.)
        Monday:
        Tuesday:
        Wednesday:
        Thursday:
        Friday:
        Saturday:
        Sunday:
    slack_team_alias:                                    # The slack team alias to be tagged while reporting failures in the slack channel when no watcher is assigned

    custom_checks:                                       # Relative paths of files containing additional user defined checks

tunings:
    iterations: 5                                        # Iterations to loop before stopping the watch, it will be replaced with infinity when the daemon mode is enabled
    sleep_time: 60                                       # Sleep duration between each iteration
    kube_api_request_chunk_size: 250                     # Large requests will be broken into the specified chunk size to reduce the load on API server and improve responsiveness.
    daemon_mode: True                                    # Iterations are set to infinity which means that the cerberus will monitor the resources forever
    cores_usage_percentage: 0.5                          # Set the fraction of cores to be used for multiprocessing

database:
    database_path: /tmp/cerberus.db                      # Path where cerberus database needs to be stored
    reuse_database: False                                # When enabled, the database is reused to store the failures

At this point you can run Cerberus manually to test it out as follows:

$ sudo python3 /opt/cerberus/start_cerberus.py --config /opt/cerberus/config/config-home.yaml
               _                         
  ___ ___ _ __| |__   ___ _ __ _   _ ___ 
 / __/ _ \ '__| '_ \ / _ \ '__| | | / __|
| (_|  __/ |  | |_) |  __/ |  | |_| \__ \
 \___\___|_|  |_.__/ \___|_|   \__,_|___/
                                         

2021-01-29 12:01:01,030 [INFO] Starting ceberus
2021-01-29 12:01:01,037 [INFO] Initializing client to talk to the Kubernetes cluster
2021-01-29 12:01:01,144 [INFO] Fetching cluster info
2021-01-29 12:01:01,260 [INFO] 
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.12    True        False         3d20h   Cluster version is 4.6.12

2021-01-29 12:01:01,365 [INFO] Kubernetes master is running at https://api.home.ocplab.com:6443

2021-01-29 12:01:01,365 [INFO] Publishing cerberus status at http://0.0.0.0:8080
2021-01-29 12:01:01,381 [INFO] Starting http server at http://0.0.0.0:8080

2021-01-29 12:01:01,623 [INFO] Daemon mode enabled, cerberus will monitor forever
2021-01-29 12:01:01,623 [INFO] Ignoring the iterations set

2021-01-29 12:01:01,955 [INFO] Iteration 1: Node status: True
2021-01-29 12:01:02,244 [INFO] Iteration 1: Cluster Operator status: True
2021-01-29 12:01:02,380 [INFO] Iteration 1: openshift-ingress: True
2021-01-29 12:01:02,392 [INFO] Iteration 1: openshift-apiserver: True
2021-01-29 12:01:02,396 [INFO] Iteration 1: openshift-sdn: True
2021-01-29 12:01:02,399 [INFO] Iteration 1: openshift-kube-scheduler: True
2021-01-29 12:01:02,400 [INFO] Iteration 1: openshift-machine-api: True
2021-01-29 12:01:02,406 [INFO] Iteration 1: openshift-kube-controller-manager: True
2021-01-29 12:01:02,425 [INFO] Iteration 1: openshift-etcd: True
2021-01-29 12:01:02,443 [INFO] Iteration 1: openshift-monitoring: True
2021-01-29 12:01:02,445 [INFO] Iteration 1: openshift-kube-apiserver: True
2021-01-29 12:01:02,446 [INFO] HTTP requests served: 0 

2021-01-29 12:01:02,446 [WARNING] Iteration 1: Masters without NoSchedule taint: ['home-jcn2d-master-0', 'home-jcn2d-master-1', 'home-jcn2d-master-2']

2021-01-29 12:01:02,592 [INFO] []

2021-01-29 12:01:02,592 [INFO] Sleeping for the specified duration: 60

Great, Cerberus is up and running, but wouldn’t it be great if it ran automatically as a service? Let’s go ahead and set that up by creating a systemd service. First let’s set up a bash script called start.sh in the root of our cerberus directory as follows:

#!/bin/bash
 
echo "Starting Cerberus..."
 
python3 /opt/cerberus/start_cerberus.py --config /opt/cerberus/config/my-config.yaml

Next, create a systemd service at /etc/systemd/system/cerberus.service and add the following to it:

[Unit]
Description=Cerberus OpenShift Health Check

Wants=network.target
After=syslog.target network-online.target

[Service]
Type=simple
ExecStart=/bin/bash /opt/cerberus/start.sh
Restart=on-failure
RestartSec=10
KillMode=control-group

[Install]
WantedBy=multi-user.target

To have the service run Cerberus, use the following commands:

systemctl daemon-reload
systemctl enable cerberus.service
systemctl start cerberus.service

Check the status of the service after starting it; if the service failed you may need to delete the cerberus files in /tmp that were created when it was run manually earlier. You can also check the endpoint at http://localhost:8080 to see the result it returns, which is a simple text string with either “True” or “False”.
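
A quick way to verify the endpoint is with curl; as noted above, a healthy cluster returns True:

curl http://localhost:8080
True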

At this point we can then add our monitor to UptimeRobot assuming the Cerberus port is exposed to the internet. Below is an image of my monitor configuration:

And there you have it, you should start seeing the results in your status page as per the screenshot at the top of the page.

API Testing in OpenShift Pipelines with Newman

If you are writing REST-based API applications you probably have some familiarity with Postman, which allows you to test your APIs via an interactive GUI. However, did you know that there is a CLI equivalent of Postman called Newman that works with your existing Postman collections? Newman enables you to re-use your existing collections to integrate API testing into automated processes where a GUI would not be appropriate. While we will not go into the details of Postman or Newman here, if you are new to the tools you can check out this blog which provides a good overview of both.
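
As a quick example, running a collection against an environment from the command line looks like this (the file names here are placeholders; --bail stops the run on the first failure):

newman run my-collection.json -e my-environment.json --bail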

Integrating Newman into OpenShift Pipelines, aka Tekton, is very easy and straightforward. In this blog we are going to look at how I am using it in my product catalog demo to test the back-end API, built in Quarkus, as part of the CI/CD process powered by OpenShift Pipelines. This CI/CD process is shown in the diagram below (click for a bigger version); note the two tasks where we do our API testing in the Development and Test environments, dev-test and test-test (an unfortunate name) respectively. These tests are run after the new image is built and deployed in each environment and are thus considered integration tests rather than unit tests.

Product Catalog Server CICD

One of the things I love about Tekton, and thus OpenShift Pipelines, is its extensibility: it’s very easy to extend by creating custom tasks using either an existing container image or an image that you have created yourself. If you are not familiar with OpenShift Pipelines or Tekton I would highly recommend checking out the concepts documentation which provides a good overview.

The first step to using Newman in OpenShift Pipelines is to create a custom task for it. Tasks in Tekton represent a sequence of steps to accomplish a specific goal or, as the name implies, task. Each step uses a specified container image to perform its function. Fortunately in our case there is an existing container image for Newman that we can leverage without having to create our own, docker.io/postman/newman. Our task definition for the newman task appears below:

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: newman
spec:
  params:
  - name: COLLECTION
    description: The collection to run, typically a remote URL pointing to the collection
    type: string
  - name: ENVIRONMENT
    description: The environment file to use from the newman-env configmap
    default: "newman-env.json"
  steps:
    - name: collections-test
      image: docker.io/postman/newman:latest
      command:
        - newman
      args:
        - run
        - $(inputs.params.COLLECTION)
        - -e
        - /config/$(inputs.params.ENVIRONMENT)
        - --bail
      volumeMounts:
        - name: newman-env
          mountPath: /config
  volumes:
    - name: newman-env
      configMap:
        name: newman-env

There are two parameters declared as part of this task, COLLECTION and ENVIRONMENT. The COLLECTION parameter references a URL to the test suite that you want to run; the suite is typically created using the Postman GUI and exported as a JSON file. For the pipeline in the product catalog we use this product-catalog-server-tests.json. Each test in the collection represents a request/response to the API along with some simple tests to ensure conformance with the desired results.

For example, when requesting a list of products, we test that the response code was 200 and that 12 products were returned, as per the picture below:

Postman

Postman
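
For reference, here is roughly what that pair of checks looks like inside the exported collection JSON (a sketch; the assertion names match the test output further below, but the exact script in my collection may differ):

{
    "name": "Get Products",
    "event": [
        {
            "listen": "test",
            "script": {
                "type": "text/javascript",
                "exec": [
                    "pm.test('response is ok', function () {",
                    "    pm.response.to.have.status(200);",
                    "});",
                    "pm.test('data valid', function () {",
                    "    pm.expect(pm.response.json().length).to.eql(12);",
                    "});"
                ]
            }
        }
    ]
}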

The ENVIRONMENT parameter selects a file from a configmap holding the customizations the test suite requires for the specific environment being tested. For example, the APIs in the development and test environments have different URLs, so we need to parameterize them so we can re-use the same test suite across all environments. You can see the environments for dev and test in my github repo. The task is designed so that one configmap, newman-env, contains all of the environments as separate files, as per the example here:

apiVersion: v1
data:
  newman-dev-env.json: "{\n\t\"id\": \"30c331d4-e961-4606-aecb-5a60e8e15213\",\n\t\"name\": \"product-catalog-dev-service\",\n\t\"values\": [\n\t\t{\n\t\t\t\"key\": \"host\",\n\t\t\t\"value\": \"server.product-catalog-dev:8080\",\n\t\t\t\"enabled\": true\n\t\t},\n\t\t{\n\t\t\t\"key\": \"scheme\",\n\t\t\t\"value\": \"http\",\n\t\t\t\"enabled\": true\n\t\t}\n\t],\n\t\"_postman_variable_scope\": \"environment\"\n}"
  newman-test-env.json: "{\n\t\"id\": \"30c331d4-e961-4606-aecb-5a60e8e15213\",\n\t\"name\": \"product-catalog-dev-service\",\n\t\"values\": [\n\t\t{\n\t\t\t\"key\": \"host\",\n\t\t\t\"value\": \"server.product-catalog-test:8080\",\n\t\t\t\"enabled\": true\n\t\t},\n\t\t{\n\t\t\t\"key\": \"scheme\",\n\t\t\t\"value\": \"http\",\n\t\t\t\"enabled\": true\n\t\t}\n\t],\n\t\"_postman_variable_scope\": \"environment\"\n}"
kind: ConfigMap
metadata:
  name: newman-env
  namespace: product-catalog-cicd

In the raw configmap the environments are hard to read due to the formatting; below is what newman-dev-env.json looks like when formatted properly. Notice the host points to the service in the product-catalog-dev namespace.

{
	"id": "30c331d4-e961-4606-aecb-5a60e8e15213",
	"name": "product-catalog-dev-service",
	"values": [
		{
			"key": "host",
			"value": "server.product-catalog-dev:8080",
			"enabled": true
		},
		{
			"key": "scheme",
			"value": "http",
			"enabled": true
		}
	],
	"_postman_variable_scope": "environment"
}

So now that we have our task, our test suite and our environments we need to add the task to the pipeline to test an environment. You can see the complete pipeline here, an excerpt showing the pipeline testing the dev environment appears below:

    - name: dev-test
      taskRef:
        name: newman
      runAfter:
        - deploy-dev
      params:
        - name: COLLECTION
          value: https://raw.githubusercontent.com/gnunn-gitops/product-catalog-server/master/tests/product-catalog-server-tests.json
        - name: ENVIRONMENT
          value: newman-dev-env.json

When you run the task, Newman logs the results of the tests, and if any test fails it returns an error code which is propagated up to the pipeline and causes the pipeline itself to fail. Here is the result from testing the Dev environment:

newman
Quarkus Product Catalog
→ Get Products
GET http://server.product-catalog-dev:8080/api/product [200 OK, 3.63KB, 442ms]
✓ response is ok
✓ data valid
→ Get Existing Product
GET http://server.product-catalog-dev:8080/api/product/1 [200 OK, 388B, 14ms]
✓ response is ok
✓ Data is correct
→ Get Missing Product
GET http://server.product-catalog-dev:8080/api/product/99 [404 Not Found, 115B, 18ms]
✓ response is missing
→ Login
POST http://server.product-catalog-dev:8080/api/auth [200 OK, 165B, 145ms]
→ Get Missing User
GET http://server.product-catalog-dev:8080/api/user/8 [404 Not Found, 111B, 12ms]
✓ Is status code 404
→ Get Existing User
GET http://server.product-catalog-dev:8080/api/user/1 [200 OK, 238B, 20ms]
✓ response is ok
✓ Data is correct
→ Get Categories
GET http://server.product-catalog-dev:8080/api/category [200 OK, 458B, 16ms]
✓ response is ok
✓ data valid
→ Get Existing Category
GET http://server.product-catalog-dev:8080/api/category/1 [200 OK, 192B, 9ms]
✓ response is ok
✓ Data is correct
→ Get Missing Category
GET http://server.product-catalog-dev:8080/api/category/99 [404 Not Found, 116B, 9ms]
✓ response is missing
┌─────────────────────────┬───────────────────┬───────────────────┐
│                         │          executed │            failed │
├─────────────────────────┼───────────────────┼───────────────────┤
│              iterations │                 1 │                 0 │
├─────────────────────────┼───────────────────┼───────────────────┤
│                requests │                 9 │                 0 │
├─────────────────────────┼───────────────────┼───────────────────┤
│            test-scripts │                 8 │                 0 │
├─────────────────────────┼───────────────────┼───────────────────┤
│      prerequest-scripts │                 0 │                 0 │
├─────────────────────────┼───────────────────┼───────────────────┤
│              assertions │                13 │                 0 │
├─────────────────────────┴───────────────────┴───────────────────┤
│ total run duration: 883ms                                       │
├─────────────────────────────────────────────────────────────────┤
│ total data received: 4.72KB (approx)                            │
├─────────────────────────────────────────────────────────────────┤
│ average response time: 76ms [min: 9ms, max: 442ms, s.d.: 135ms] │
└─────────────────────────────────────────────────────────────────┘

So to summarize, integrating API testing with OpenShift Pipelines is very quick and easy. While in this example we showed the process using Newman, other API testing tools can be integrated following a similar process.

OpenShift User Application Monitoring and Grafana the GitOps way!

Update: All of the work outlined in this article is now available as a kustomize overlay in the Red Hat Canada GitOps repo here.

Traditionally in OpenShift, the monitoring stack that was provided out-of-the-box (OOTB) was only available for cluster monitoring. Administrators could not configure it to support their own application workloads, necessitating the deployment of a separate monitoring stack (typically community Prometheus and Grafana). However this has changed in OpenShift 4.6, as the cluster monitoring operator now supports deploying a separate Prometheus instance for application workloads.

One great capability provided by the OpenShift cluster monitoring is that it deploys Thanos to aggregate metrics from both the cluster and application monitoring stacks thus providing a central point for queries. At this point in time you still need to deploy your own Grafana stack for visualizations but I expect a future version of OpenShift will support custom dashboards right in the console alongside the default ones. The monitoring stack architecture for OpenShift 4.6 is shown in the diagram (click for architecture documentation) below:

Monitoring Architecture

In this blog entry we cover deploying the user application monitoring feature (super easy) as well as a Grafana instance (not super easy) using GitOps, specifically in this case with ArgoCD. This blog post is going to assume some familiarity with Prometheus and Grafana and will concentrate on the more challenging aspects of using GitOps to deploy everything.

The first thing we need to do is deploy the user application monitoring in OpenShift, this would typically be done as part of your cluster configuration. To do this, as per the docs, we simply need to deploy the following configmap in the openshift-monitoring namespace:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true

You can see this in my GitOps cluster-config here. Once deployed you should see the user monitoring components deployed in the openshift-user-workload-monitoring project as per below:
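
If you prefer the CLI to the console, a quick way to confirm the components are up is below; the exact pod names and counts will vary by version:

oc get pods -n openshift-user-workload-monitoring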

Now that the user monitoring is up and running we can configure the monitoring of our applications by adding a ServiceMonitor object to define the monitoring targets. This is typically done as part of the application deployment by application teams; it is a separate activity from the deployment of the user monitoring itself, which is done as part of the cluster configuration by cluster administrators. Here is an example that I have for my product-catalog demo that monitors my quarkus back-end:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: server
  namespace: product-catalog-dev
spec:
  endpoints:
  - path: /metrics
    port: http
    scheme: http
  selector:
    matchLabels:
      quarkus-prometheus: "true"

The ServiceMonitor above specifies that any kubernetes service in the same namespace as the ServiceMonitor which has the label quarkus-prometheus set to true will have its metrics collected on the port named ‘http’ using the path ‘/metrics’. Of course, your application needs to expose Prometheus metrics, and most modern frameworks like Quarkus make this easy. From a GitOps perspective, deploying the ServiceMonitor is just another yaml to deploy along with the application, as you can see in my product-catalog manifests here.
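
For instance, with Quarkus it is typically just a matter of adding a metrics extension; here is a sketch assuming a Maven project and the Micrometer Prometheus registry (check the Quarkus documentation for your version, which also determines whether metrics are served at /metrics or /q/metrics):

./mvnw quarkus:add-extension -Dextensions="micrometer-registry-prometheus"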

As an aside, please note that the user monitoring in OpenShift does not support the namespace selector in ServiceMonitor for security reasons; as a result the ServiceMonitor must be deployed in the same namespace as the targets it defines. Thus if you have the same application in three different namespaces (say dev, test and prod) you will need to deploy the ServiceMonitor in each of those namespaces independently.

Now if I were to stop here it would hardly merit a blog post; however for most folks, once they deploy the user monitoring, the next step is deploying something to visualize the metrics, and in this example that will be Grafana. Deploying the Grafana operator via GitOps in OpenShift is somewhat involved since we will use the Operator Lifecycle Manager (OLM) to do it, and OLM is asynchronous. Specifically, with OLM you push the Subscription and OperatorGroup, and asynchronously OLM installs and deploys the operator. From a GitOps perspective, managing the deployment of the operator and its Custom Resources (CRs) becomes tricky since the CRs cannot be created until the operator’s Custom Resource Definitions (CRDs) are installed.
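
For context, deploying an operator through OLM in a GitOps repo means committing resources along these lines (a sketch of a Subscription; the channel and catalog source for the Grafana operator are assumptions, so verify them against your cluster’s operator catalog, and an OperatorGroup is needed in the namespace as well):

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: grafana-operator
  namespace: product-catalog-monitor
spec:
  channel: alpha
  name: grafana-operator
  source: community-operators
  sourceNamespace: openshift-marketplace

OLM reacts to this asynchronously, which is exactly why the Grafana CRDs may not exist yet at the moment ArgoCD tries to sync the Grafana CRs.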

Fortunately ArgoCD has a number of features available to work around this; specifically, adding the `argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true` annotation to our resources will instruct ArgoCD not to error out if some resources cannot be added initially. You can also combine this with retries in your ArgoCD application for more complex operators that take significant time to initialize, though for Grafana the annotation seems to be sufficient. In my product-catalog example, I am adding this annotation across all resources using kustomize:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: product-catalog-monitor

commonAnnotations:
    argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true

bases:
- https://github.com/redhat-canada-gitops/catalog/grafana-operator/overlays/aggregate?ref=grafana
- ../../../manifests/app/monitor/base

resources:
- namespace.yaml
- operator-group.yaml
- cluster-monitor-view-rb.yaml

patchesJson6902:
- target:
    version: v1
    group: rbac.authorization.k8s.io
    kind: ClusterRoleBinding
    name: grafana-proxy
  path: patch-proxy-namespace.yaml
- target:
    version: v1alpha1
    group: integreatly.org
    kind: Grafana
    name: grafana
  path: patch-grafana-sar.yaml

Now it’s beyond the scope of this blog to go into a detailed description of kustomize, but in a nutshell it’s a patching framework that enables you to aggregate resources from local or remote bases as well as add new resources. In the kustomize file above, we are using the Red Hat Canada standard deployment of Grafana, which includes OpenShift OAuth integration, and combining it with my application-specific Grafana resources such as datasources and dashboards, which is what we will look at next.

Continuing along, we need to set up the plumbing to connect Grafana to the cluster monitoring Thanos instance in the openshift-monitoring namespace. This blog article, Custom Grafana dashboards for Red Hat OpenShift Container Platform 4, does a great job of walking you through the process and I am not going to repeat it here; however, please do read that article before carrying on.

The first thing we need to do is define a GrafanaDataSource object:

apiVersion: integreatly.org/v1alpha1
kind: GrafanaDataSource
metadata:
  name: prometheus
spec:
  datasources:
    - access: proxy
      editable: true
      isDefault: true
      jsonData:
        httpHeaderName1: 'Authorization'
        timeInterval: 5s
        tlsSkipVerify: true
      name: Prometheus
      secureJsonData:
        httpHeaderValue1: 'Bearer ${BEARER_TOKEN}'
      type: prometheus
      url: 'https://thanos-querier.openshift-monitoring.svc.cluster.local:9091'
  name: prometheus.yaml

Notice in httpHeaderValue1 we are expected to provide a bearer token. This token comes from the grafana-serviceaccount and can only be determined at runtime, which makes it a bit of a challenge from a GitOps perspective. To manage this, we deploy a kubernetes job as an ArgoCD PostSync hook in order to patch the GrafanaDataSource with the appropriate token:


apiVersion: batch/v1
kind: Job
metadata:
  name: patch-grafana-ds
  annotations:
    argocd.argoproj.io/hook: PostSync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
spec:
  template:
    spec:
      containers:
        - image: registry.redhat.io/openshift4/ose-cli:v4.6
          command:
            - /bin/bash
            - -c
            - |
              set -e
              echo "Patching grafana datasource with token for authentication to prometheus"
              TOKEN=`oc serviceaccounts get-token grafana-serviceaccount -n product-catalog-monitor`
              oc patch grafanadatasource prometheus --type='json' -p='[{"op":"add","path":"/spec/datasources/0/secureJsonData/httpHeaderValue1","value":"Bearer '${TOKEN}'"}]'
          imagePullPolicy: Always
          name: patch-grafana-ds
      dnsPolicy: ClusterFirst
      restartPolicy: OnFailure
      serviceAccount: patch-grafana-ds-job
      serviceAccountName: patch-grafana-ds-job
      terminationGracePeriodSeconds: 30

This job runs using a special ServiceAccount which gives the job just enough access to retrieve the token and patch the datasource; once that’s done the job is deleted by ArgoCD.
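
For completeness, the role bound to that ServiceAccount needs roughly the following permissions (a sketch; the names are illustrative and the real definitions live in my repo):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: patch-grafana-ds-job
  namespace: product-catalog-monitor
rules:
  - apiGroups:
      - ""
    resources:
      - serviceaccounts
      - secrets
    verbs:
      - get
      - list
  - apiGroups:
      - integreatly.org
    resources:
      - grafanadatasources
    verbs:
      - get
      - patch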

The other thing we want to do is control access to Grafana; basically we want to grant access to any OpenShift user who has view access on the Grafana route in the namespace. The grafana operator uses the OpenShift OAuth Proxy to integrate with OpenShift. This proxy enables the definition of a Subject Access Review (SAR) to determine who is authorized to use Grafana; the SAR is simply a check on a particular object that acts as a way to determine access. For example, to only allow cluster administrators to have access to the Grafana instance we can specify that the user must have access to get namespaces:

-openshift-sar={"resource": "namespaces", "verb": "get"}

In our case we want anyone who has view access to the grafana route in the namespace where grafana is hosted, product-catalog-monitor, to have access. So our SAR would appear as follows:

-openshift-sar={"namespace":"product-catalog-monitor","resource":"routes","name":"grafana-route","verb":"get"}

To make this easy for kustomize to patch, the Red Hat Canada grafana implementation passes the SAR as an environment variable. To patch the value we can include a kustomize patch as follows:

- op: replace
  path: /spec/containers/0/env/0/value
  value: '-openshift-sar={"namespace":"product-catalog-monitor","resource":"routes","name":"grafana-route","verb":"get"}'

You can see this patch being applied at the environment level in my product-catalog example here. In my GitOps standards, environments is where the namespace is created and thus it makes sense that any namespace patching that is required is done at this level.

After this it is simply a matter of including the other resources, such as the cluster-monitoring-view rolebinding for the grafana-serviceaccount, so that grafana is authorized to retrieve the metrics.
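
A sketch of that rolebinding, assuming grafana is deployed in product-catalog-monitor:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: grafana-cluster-monitoring-view
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-monitoring-view
subjects:
  - kind: ServiceAccount
    name: grafana-serviceaccount
    namespace: product-catalog-monitor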

If everything has gone well to this point you should be able to create a dashboard to view your application metrics.

Initializing Databases in OpenShift Deployment

When deploying a database in OpenShift there is typically a need to initialize it with a schema and perhaps an initial dataset or some reference data. This can be done in a variety of ways, such as having the application initialize it, using a kubernetes job, etc. In OpenShift 3, with DeploymentConfig, one technique that was quite common was to leverage the DeploymentConfig lifecycle post hook to initialize it, for example:

apiVersion: v1
kind: DeploymentConfig
metadata:
  name: my-database
spec:
  strategy:
    recreateParams:
      post:
        execNewPod:
          command:
            - /bin/sh
            - '-c'
            - >-
              curl -o ~/php-react.sql
              https://raw.githubusercontent.com/gnunn1/openshift-basic-pipeline/master/react-crud/database/php-react.sql
              && /opt/rh/rh-mysql57/root/usr/bin/mysql -h $MYSQL_SERVICE_HOST -u
              $MYSQL_USER -D $MYSQL_DATABASE -p$MYSQL_PASSWORD -P 3306 <
              ~/php-react.sql
          containerName: ${DATABASE_SERVICE_NAME}
        failurePolicy: abort
...

While not necessarily a production grade technique, this was particularly useful for me when creating self-contained demos where I did not want to require someone to manually set up a bunch of infrastructure or provision datasets.

In OpenShift 4 there is a trend towards using the standard Deployment versus the OpenShift-specific DeploymentConfig, with most cases in the console and the cli defaulting to Deployments. While Deployments and DeploymentConfigs are very similar, there are some key differences in capabilities between the two, as outlined in the documentation. I won’t re-hash them all here, but one feature Deployments lack compared to DeploymentConfigs is the lifecycle hook, so how do we accomplish the above using a Deployment?

For me, one technique I’ve found that works well is to leverage the s2i (source-to-image) capabilities of Red Hat’s database containers; however, instead of building a custom container we can have s2i do our initialization at runtime with the generic image. This works because, if you look at the assemble script the database containers use for s2i, all it is doing is copying files from one location to another. The script doesn’t actually do any initialization at build time, which means we can simply mount our initialization files directly in the image at the right location without building a custom container first.

You can see the technique in action in my product-catalog demo repository. In this repo it deploys, using kustomize, a MariaDB database which is then initialized with a schema and an initial dataset. To do this I have a configmap that contains a script, a DDL sql file to define the schema and a DML sql file to insert an initial dataset. Here is an abridged example:

kind: ConfigMap
apiVersion: v1
metadata:
  name: productdb-init
data:
  90-init-database.sh: |
    init_database() {
        local thisdir
        local init_data_file
        thisdir=$(dirname ${BASH_SOURCE[0]})
 
        init_data_file=$(readlink -f ${thisdir}/../mysql-data/schema.sql)
        log_info "Initializing the database schema from file ${init_data_file}..."
        mysql $mysql_flags ${MYSQL_DATABASE} < ${init_data_file}
 
        init_data_file=$(readlink -f ${thisdir}/../mysql-data/import.sql)
        log_info "Initializing the database data from file ${init_data_file}..."
        mysql $mysql_flags ${MYSQL_DATABASE} < ${init_data_file}
    }
 
    if ! [ -v MYSQL_RUNNING_AS_SLAVE ] && $MYSQL_DATADIR_FIRST_INIT ; then
        init_database
    fi
  import.sql: >
    INSERT INTO `categories` (`id`, `name`, `description`, `created`,
    `modified`) VALUES
    (1, 'Smartphone', 'Not a stupid phone', '2015-08-02 23:56:46', '2016-12-20
    06:51:25'),
    (2, 'Tablet', 'A small smartphone-laptop mix', '2015-08-02 23:56:46',
    '2016-12-20 06:51:42'),
    (3, 'Ultrabook', 'Ultra portable and powerful laptop', '2016-12-20
    13:51:15', '2016-12-20 06:51:52');
 
    INSERT INTO `products` (`id`, `name`, `description`, `price`, `category_id`,
    `created`, `modified`) VALUES
    (1, 'ASUS Zenbook 3', 'The most powerful and ultraportable Zenbook ever',
    1799, 3, '2016-12-20 13:53:00', '2016-12-20 06:53:00'),
    (2, 'Dell XPS 13', 'Super powerful and portable ultrabook with ultra thin
    bezel infinity display', 2199, 3, '2016-12-20 13:53:34', '2016-12-20
    06:53:34');

  schema.sql: >-
    DROP TABLE IF EXISTS `categories`;
 
    CREATE TABLE `categories` (
        `id` int(11) NOT NULL AUTO_INCREMENT,
        `created` date DEFAULT NULL,
        `description` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
        `modified` datetime(6) DEFAULT NULL,
        `name` varchar(128) COLLATE utf8mb4_unicode_ci NOT NULL,
        PRIMARY KEY (`id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8mb4
    COLLATE=utf8mb4_unicode_ci;
 
    --
    -- Table structure for table `products`
    --
 
    DROP TABLE IF EXISTS `products`;
 
    CREATE TABLE `products` (
        `id` int(11) NOT NULL AUTO_INCREMENT,
        `created` date DEFAULT NULL,
        `description` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
        `modified` datetime(6) DEFAULT NULL,
        `name` varchar(128) COLLATE utf8mb4_unicode_ci NOT NULL,
        `price` double NOT NULL,
        `category_id` int(11) NOT NULL,
        PRIMARY KEY (`id`),
        KEY `FKog2rp4qthbtt2lfyhfo32lsw9` (`category_id`),
        CONSTRAINT `FKog2rp4qthbtt2lfyhfo32lsw9` FOREIGN KEY (`category_id`) REFERENCES `categories` (`id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=13 DEFAULT CHARSET=utf8mb4
    COLLATE=utf8mb4_unicode_ci;
 
    --
    -- Table structure for table `users`
    --
 
    DROP TABLE IF EXISTS `users`;
 
    CREATE TABLE `users` (
        `id` int(11) NOT NULL AUTO_INCREMENT,
        `created_at` datetime(6) DEFAULT NULL,
        `email` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
        `iteration_count` int(11) DEFAULT NULL,
        `password_hash` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
        `salt` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
        PRIMARY KEY (`id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4
    COLLATE=utf8mb4_unicode_ci;

Now there are definitely some improvements that could be made here; for example, the size of a configmap is limited, so it could be better to load the DDL and DML files from git or another location rather than inlining them in the configmap.
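
As a sketch of that improvement, the init function could fetch the files at startup instead of reading them from the configmap mount; the URL below is a placeholder and this assumes curl is available in the image:

init_database() {
    local init_data_file=/tmp/schema.sql
    # Fetch the DDL from git instead of relying on the configmap, which has a size limit
    curl -sSf -o ${init_data_file} https://raw.githubusercontent.com/example/repo/main/database/schema.sql
    log_info "Initializing the database schema from file ${init_data_file}..."
    mysql $mysql_flags ${MYSQL_DATABASE} < ${init_data_file}
}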

Once you have the configmap then it’s simply a matter of mounting it at the appropriate location:


apiVersion: apps/v1
kind: Deployment
metadata:
  name: productdb
spec:
  template:
    spec:
      containers:
        - name: productdb
          image: registry.redhat.io/rhel8/mariadb-103:1
          ...
          volumeMounts:
          - mountPath: /var/lib/mysql/data
            name: productdb-data
          - mountPath: /opt/app-root/src/mysql-init/90-init-database.sh
            name: productdb-init
            subPath: 90-init-database.sh
          - mountPath: /opt/app-root/src/mysql-data/import.sql
            name: productdb-init
            subPath: import.sql
          - mountPath: /opt/app-root/src/mysql-data/schema.sql
            name: productdb-init
            subPath: schema.sql
      volumes:
      - configMap:
          name: productdb-init
        name: productdb-init
     ...

The complete Deployment example can be viewed here.

So that’s basically it. While I’ve only tested this with MariaDB, I would expect the same technique to work with the MySQL and PostgreSQL database images as well. As mentioned previously, I would not consider this a production-ready technique but it is a useful tool when putting together examples or demos.

Updated GitOps Standards

I maintain a small document in Github outlining the GitOps standards I use in my own repositories. I find with kustomize it’s very important to have a standardized folder structure across an organization, or else it becomes challenging for everyone to understand what kustomize is doing. A common frame of reference makes all the difference.

I’ve recently tweaked my standards, feel free to check them out at https://github.com/gnunn-gitops/standards. Comments always welcome as I’m very interested in learning what other folks are doing.

OpenShift Home Lab and Block Storage

I have a single server (Ryzen 3900x with 128 GB of RAM) homelab environment that I use to run OpenShift (plus it doubles as my gaming PC). The host is running Fedora 32 at the time of this writing and I run OpenShift on libvirt using a playbook created by Luis Javier Arizmendi Alonso that sets everything up including NFS storage. The NFS server runs on the host machine and the OpenShift nodes running in VMs access the NFS server via the host IP to provision PVs. Luis’s playbook sets up a dynamic NFS provisioner in OpenShift and it all works wonderfully.

However there are times when you do need block storage. While NFS is capable of handling some loads that would traditionally require block, small databases for example, I was having issues with some more intensive workloads like Kafka. Fortunately I had a spare 500 GB SSD lying around from my retired gaming computer, and I figured I could drop it into my homelab server and use it as block storage. Hence began my journey of learning way more about iscsi than I ever wanted to know as a developer…

Here are the steps I used to get static block storage going. I’m definitely interested if there are better ways to do it, particularly if someone has dynamic block storage going in libvirt, so drop me a line! Note these instructions were written for Fedora 32, which is what my host is using.

The first step is we need to partition the SSD using LVM into chunks that we can eventually serve up as PVs in OpenShift. This process is pretty straightforward, first we need to create a physical volume and a volume group called ‘iscsi’. Note my SSD is on ‘/dev/sda’, your mileage will vary so replace the ‘/dev/sdX’ below with whatever device you are using. Be careful not to overwrite something that is in use.

pvcreate /dev/sdX
vgcreate iscsi /dev/sdX

Next we create a logical volume, I’ve opted to create a thin pool which means that storage doesn’t get allocated until it’s actually used. This allows you to over-provision storage if you need to though obviously some care is required. To create the thin pool run the following:

lvcreate -l 100%FREE -T iscsi/thin_pool

Once we have our pool created we then need to create the actual volumes that will be available as PVs. I’ve chosen to create a mix of PV sizes as per below; feel free to vary depending on your use case. Having said that, note the naming convention I am using, which will flow up into our iscsi and PV configuration; I highly recommend you use a similar convention for consistency.

lvcreate -V 100G -T iscsi/thin_pool -n block0_100
lvcreate -V 100G -T iscsi/thin_pool -n block1_100
lvcreate -V 50G -T iscsi/thin_pool -n block2_50
lvcreate -V 50G -T iscsi/thin_pool -n block3_50
lvcreate -V 10G -T iscsi/thin_pool -n block4_10
lvcreate -V 10G -T iscsi/thin_pool -n block5_10
lvcreate -V 10G -T iscsi/thin_pool -n block6_10
lvcreate -V 10G -T iscsi/thin_pool -n block7_10
lvcreate -V 10G -T iscsi/thin_pool -n block8_10
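
To confirm the volumes were created in the thin pool, you can list the contents of the volume group:

lvs iscsi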

Note if you make a mistake and want to remove a volume, you can do so by running the following command:

lvremove iscsi/block5_50

Next we need to install some iscsi packages in order to configure and run the iscsi daemon on the host.

dnf install iscsi-initiator-utils targetcli

I’ve opted to use targetcli to configure iscsi rather than hand-editing a bunch of files; it provides a nice cli over the process which I, being an iscsi newbie, greatly appreciated. When you run targetcli it will drop you into a prompt as follows:

[gnunn@lab-server ~]$ sudo targetcli
[sudo] password for gnunn: 
targetcli shell version 2.1.53
Copyright 2011-2013 by Datera, Inc and others.
For help on commands, type 'help'.

/> 

The prompt basically follows standard linux file system conventions and you can use commands like ‘cd’ and ‘ls’ to navigate it. The first thing we are going to do is create our block devices, which map to our LVM logical volumes. In the targetcli prompt this is done with the following commands; note the naming convention being used, which ties these devices to our volumes:

cd backstores/block
create dev=/dev/mapper/iscsi-block0_100 name=disk0-100
create dev=/dev/mapper/iscsi-block1_100 name=disk1-100
create dev=/dev/mapper/iscsi-block2_50 name=disk2-50
create dev=/dev/mapper/iscsi-block3_50 name=disk3-50
create dev=/dev/mapper/iscsi-block4_10 name=disk4-10
create dev=/dev/mapper/iscsi-block5_10 name=disk5-10
create dev=/dev/mapper/iscsi-block6_10 name=disk6-10
create dev=/dev/mapper/iscsi-block7_10 name=disk7-10
create dev=/dev/mapper/iscsi-block8_10 name=disk8-10

Next we create the target in iscsi; note that my host name is lab-server so I used that in the name below, feel free to modify as you prefer. I’ll admit I’m still a little fuzzy on iscsi naming conventions, so suggestions are welcome from those of you with more experience.

cd /iscsi
create iqn.2003-01.org.linux-iscsi.lab-server:openshift

Next we create the luns; they map to our block devices and represent the storage that will be made available:

cd /iscsi/iqn.2003-01.org.linux-iscsi.lab-server:openshift/tpg1/luns
create storage_object=/backstores/block/disk0-100
create storage_object=/backstores/block/disk1-100
create storage_object=/backstores/block/disk2-50
create storage_object=/backstores/block/disk3-50
create storage_object=/backstores/block/disk4-10
create storage_object=/backstores/block/disk5-10
create storage_object=/backstores/block/disk6-10
create storage_object=/backstores/block/disk7-10
create storage_object=/backstores/block/disk8-10

Next we create the acls which control access to the luns. Note that in my case my lab server is running on a private network behind a firewall, so I have not bothered with any sort of authentication. If this is not the case for you, I would definitely recommend spending some time looking into adding it.

cd /iscsi/iqn.2003-01.org.linux-iscsi.lab-server:openshift/tpg1/acls
create iqn.2003-01.org.linux-iscsi.lab-server:client
create iqn.2003-01.org.linux-iscsi.lab-server:openshift-client

Note I’ve created two acls, one as a generic client and one specific for my openshift cluster.

Finally, the last step is the portal. A default portal is created that binds to all interfaces on 0.0.0.0; my preference is to remove it and bind to a specific IP address on the host. My host has two ethernet ports, so here I am binding it to the 2.5 gigabit port, which has a static IP address; your IP address will obviously vary.

cd /iscsi/iqn.2003-01.org.linux-iscsi.lab-server:openshift/tpg1/portal
delete 0.0.0.0 ip_port=3260
create 192.168.1.83

Once you have done all this, you should have a result that looks similar to the following when you run ‘ls /’ in targetcli:

o- / ......................................................................................................................... [...]
  o- backstores .............................................................................................................. [...]
  | o- block .................................................................................................. [Storage Objects: 9]
  | | o- disk0-100 .................................................. [/dev/mapper/iscsi-block0_100 (100.0GiB) write-thru activated]
  | | | o- alua ................................................................................................... [ALUA Groups: 1]
  | | |   o- default_tg_pt_gp ....................................................................... [ALUA state: Active/optimized]
  | | o- disk1-100 .................................................. [/dev/mapper/iscsi-block1_100 (100.0GiB) write-thru activated]
  | | | o- alua ................................................................................................... [ALUA Groups: 1]
  | | |   o- default_tg_pt_gp ....................................................................... [ALUA state: Active/optimized]
  | | o- disk2-50 ..................................................... [/dev/mapper/iscsi-block2_50 (50.0GiB) write-thru activated]
  | | | o- alua ................................................................................................... [ALUA Groups: 1]
  | | |   o- default_tg_pt_gp ....................................................................... [ALUA state: Active/optimized]
  | | o- disk3-50 ..................................................... [/dev/mapper/iscsi-block3_50 (50.0GiB) write-thru activated]
  | | | o- alua ................................................................................................... [ALUA Groups: 1]
  | | |   o- default_tg_pt_gp ....................................................................... [ALUA state: Active/optimized]
  | | o- disk4-10 ..................................................... [/dev/mapper/iscsi-block4_10 (10.0GiB) write-thru activated]
  | | | o- alua ................................................................................................... [ALUA Groups: 1]
  | | |   o- default_tg_pt_gp ....................................................................... [ALUA state: Active/optimized]
  | | o- disk5-10 ..................................................... [/dev/mapper/iscsi-block5_10 (10.0GiB) write-thru activated]
  | | | o- alua ................................................................................................... [ALUA Groups: 1]
  | | |   o- default_tg_pt_gp ....................................................................... [ALUA state: Active/optimized]
  | | o- disk6-10 ..................................................... [/dev/mapper/iscsi-block6_10 (10.0GiB) write-thru activated]
  | | | o- alua ................................................................................................... [ALUA Groups: 1]
  | | |   o- default_tg_pt_gp ....................................................................... [ALUA state: Active/optimized]
  | | o- disk7-10 ..................................................... [/dev/mapper/iscsi-block7_10 (10.0GiB) write-thru activated]
  | | | o- alua ................................................................................................... [ALUA Groups: 1]
  | | |   o- default_tg_pt_gp ....................................................................... [ALUA state: Active/optimized]
  | | o- disk8-10 ..................................................... [/dev/mapper/iscsi-block8_10 (10.0GiB) write-thru activated]
  | |   o- alua ................................................................................................... [ALUA Groups: 1]
  | |     o- default_tg_pt_gp ....................................................................... [ALUA state: Active/optimized]
  | o- fileio ................................................................................................. [Storage Objects: 0]
  | o- pscsi .................................................................................................. [Storage Objects: 0]
  | o- ramdisk ................................................................................................ [Storage Objects: 0]
  o- iscsi ............................................................................................................ [Targets: 1]
  | o- iqn.2003-01.org.linux-iscsi.lab-server:openshift .................................................................. [TPGs: 1]
  |   o- tpg1 ............................................................................................... [no-gen-acls, no-auth]
  |     o- acls .......................................................................................................... [ACLs: 2]
  |     | o- iqn.2003-01.org.linux-iscsi.lab-server:client ........................................................ [Mapped LUNs: 9]
  |     | | o- mapped_lun0 ............................................................................. [lun0 block/disk0-100 (rw)]
  |     | | o- mapped_lun1 ............................................................................. [lun1 block/disk1-100 (rw)]
  |     | | o- mapped_lun2 .............................................................................. [lun2 block/disk2-50 (rw)]
  |     | | o- mapped_lun3 .............................................................................. [lun3 block/disk3-50 (rw)]
  |     | | o- mapped_lun4 .............................................................................. [lun4 block/disk4-10 (rw)]
  |     | | o- mapped_lun5 .............................................................................. [lun5 block/disk5-10 (rw)]
  |     | | o- mapped_lun6 .............................................................................. [lun6 block/disk6-10 (rw)]
  |     | | o- mapped_lun7 .............................................................................. [lun7 block/disk7-10 (rw)]
  |     | | o- mapped_lun8 .............................................................................. [lun8 block/disk8-10 (rw)]
  |     | o- iqn.2003-01.org.linux-iscsi.lab-server:openshift-client .............................................. [Mapped LUNs: 9]
  |     |   o- mapped_lun0 ............................................................................. [lun0 block/disk0-100 (rw)]
  |     |   o- mapped_lun1 ............................................................................. [lun1 block/disk1-100 (rw)]
  |     |   o- mapped_lun2 .............................................................................. [lun2 block/disk2-50 (rw)]
  |     |   o- mapped_lun3 .............................................................................. [lun3 block/disk3-50 (rw)]
  |     |   o- mapped_lun4 .............................................................................. [lun4 block/disk4-10 (rw)]
  |     |   o- mapped_lun5 .............................................................................. [lun5 block/disk5-10 (rw)]
  |     |   o- mapped_lun6 .............................................................................. [lun6 block/disk6-10 (rw)]
  |     |   o- mapped_lun7 .............................................................................. [lun7 block/disk7-10 (rw)]
  |     |   o- mapped_lun8 .............................................................................. [lun8 block/disk8-10 (rw)]
  |     o- luns .......................................................................................................... [LUNs: 9]
  |     | o- lun0 .............................................. [block/disk0-100 (/dev/mapper/iscsi-block0_100) (default_tg_pt_gp)]
  |     | o- lun1 .............................................. [block/disk1-100 (/dev/mapper/iscsi-block1_100) (default_tg_pt_gp)]
  |     | o- lun2 ................................................ [block/disk2-50 (/dev/mapper/iscsi-block2_50) (default_tg_pt_gp)]
  |     | o- lun3 ................................................ [block/disk3-50 (/dev/mapper/iscsi-block3_50) (default_tg_pt_gp)]
  |     | o- lun4 ................................................ [block/disk4-10 (/dev/mapper/iscsi-block4_10) (default_tg_pt_gp)]
  |     | o- lun5 ................................................ [block/disk5-10 (/dev/mapper/iscsi-block5_10) (default_tg_pt_gp)]
  |     | o- lun6 ................................................ [block/disk6-10 (/dev/mapper/iscsi-block6_10) (default_tg_pt_gp)]
  |     | o- lun7 ................................................ [block/disk7-10 (/dev/mapper/iscsi-block7_10) (default_tg_pt_gp)]
  |     | o- lun8 ................................................ [block/disk8-10 (/dev/mapper/iscsi-block8_10) (default_tg_pt_gp)]
  |     o- portals .................................................................................................... [Portals: 1]
  |       o- 192.168.1.83:3260 ................................................................................................ [OK]
  o- loopback ......................................................................................................... [Targets: 0]
  o- vhost ............................................................................................................ [Targets: 0]

At this point you can exit targetcli by typing ‘exit’ at the prompt. Next we need to expose the iscsi port in firewalld and enable the services:

firewall-cmd --add-service=iscsi-target --permanent
firewall-cmd --reload
systemctl enable iscsid
systemctl start iscsid
systemctl enable target
systemctl start target

Note that the target service ensures the configuration you created in targetcli is restored whenever the host is restarted. If you do not enable and start the target service, you will find an empty configuration the next time the host boots.

Now that the target is configured we can go ahead and create the static PVs for OpenShift as well as the non-provisioning storage class. You can view the full set of PVs I’m using in git here; I won’t paste them all into the blog since it’s a long file, but a single PV is sketched further below. We wrap these PVs in a non-provisioning storage class so we can request them easily on demand from our applications. The storage class looks as follows:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: iscsi
provisioner: no-provisioning
parameters:
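
And here is a minimal sketch of what one of the static PVs looks like. The portal matches the targetcli output above, but the PV name, IQN, LUN and fsType below are illustrative placeholders; use the values from your own target configuration:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: iscsi-block0-100
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: iscsi
  persistentVolumeReclaimPolicy: Retain
  iscsi:
    # Portal from the targetcli output above
    targetPortal: 192.168.1.83:3260
    # Illustrative IQN and LUN, substitute your own target's values
    iqn: iqn.2003-01.org.linux-iscsi.lab-server:openshift
    lun: 0
    fsType: ext4
    readOnly: false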

To test out the PVs, here is an example PVC:

apiVersion: "v1"
kind: "PersistentVolumeClaim"
metadata:
  name: "block"
spec:
  accessModes:
    - "ReadWriteOnce"
  resources:
    requests:
      storage: "100Gi"
  storageClassName: "iscsi"
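
Saving the claim above as pvc.yaml (a file name I’m using just for illustration), you can create it and check that it binds to one of the 100Gi PVs:

oc create -f pvc.yaml
oc get pvc block

The claim should show a STATUS of Bound once it has matched a PV from the storage class.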

And that’s it, now you have access to block storage in your homelab environment. I’ve used this quite a bit with Kafka and it works great. I’m looking into doing some benchmarking of this versus AWS EBS to see how the performance compares and will follow up in another blog.

Sending Alerts to Slack in OpenShift

In OpenShift 4 it’s pretty easy to configure sending alerts to a variety of destinations including Slack. While my work tends to be more focused on the development side of the house than operations, my goal for my homelab cluster is to be as production-like as possible, hence the need to configure receivers.

To send messages to Slack the first thing you will need to do is set up a Slack organization if you do not already have one and then create channels to receive the alerts. In my case I opted to create three channels: alerts-critical, alerts-default and alerts-watchdog. These channels mirror the default filtering used in OpenShift 4; of course you may want to adjust as necessary based on whatever routes and filtering rules you have in place.

Once you have your channels in place for receiving alerts, you can add incoming webhooks by following the documentation here. Each webhook you create in Slack will have a URL in the following format:

https://hooks.slack.com/services/XXXXX/XXXXX/XXXXX

The XXXXX will of course be replaced by something specific to your channel. Please note that Slack webhooks do not require authentication so you should not expose these URLs in public git repositories as it may lead to your channels getting spammed.
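
Before wiring a webhook into Alertmanager, you can sanity check it with a quick curl (substituting your actual webhook URL); a test message should appear in the channel:

curl -X POST -H 'Content-type: application/json' \
    --data '{"text":"Test alert from the homelab cluster"}' \
    https://hooks.slack.com/services/XXXXX/XXXXX/XXXXX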

Once you have that, you need to configure the default alertmanager-main secret in the openshift-monitoring namespace. The OpenShift documentation does a good job of explaining the process from both a GUI and a yaml perspective; in my case I prefer using yaml since I am using a GitOps approach to manage it.
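
For reference, a rough sketch of the yaml workflow is to extract the current configuration from the secret, edit it, then replace the secret. The file name alertmanager.yaml here is just a local scratch file:

# Extract and decode the current Alertmanager configuration
oc -n openshift-monitoring get secret alertmanager-main \
    -o jsonpath='{.data.alertmanager\.yaml}' | base64 -d > alertmanager.yaml

# After editing alertmanager.yaml to add the receivers, replace the secret
oc -n openshift-monitoring create secret generic alertmanager-main \
    --from-file=alertmanager.yaml=alertmanager.yaml \
    --dry-run=client -o yaml | oc -n openshift-monitoring replace -f -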

An example of my Slack receivers configuration is below with the complete example on github:

  - name: Critical
    slack_configs:
    - send_resolved: false
      api_url: https://hooks.slack.com/services/XXXXXX/XXXXXX/XXXXXX
      channel: alerts-critical
      username: '{{ template "slack.default.username" . }}'
      color: '{{ if eq .Status "firing" }}danger{{ else }}good{{ end }}'
      title: '{{ template "slack.default.title" . }}'
      title_link: '{{ template "slack.default.titlelink" . }}'
      pretext: '{{ .CommonAnnotations.summary }}'
      text: |-
        {{ range .Alerts }}
          *Alert:* {{ .Labels.alertname }} - `{{ .Labels.severity }}`
          *Description:* {{ .Annotations.message }}
          *Started:* {{ .StartsAt }}
          *Details:*
          {{ range .Labels.SortedPairs }} • *{{ .Name }}:* `{{ .Value }}`
          {{ end }}
        {{ end }}
      fallback: '{{ template "slack.default.fallback" . }}'
      icon_emoji: '{{ template "slack.default.iconemoji" . }}'
      icon_url: '{{ template "slack.default.iconurl" . }}'

This configuration is largely adapted from this excellent blog post by Hart Hoover called Pretty AlertManager Alerts in Slack. With this configuration, the alerts appear as follows in Slack:

As mentioned previously, the Slack webhook should be treated as sensitive, and as a result I’m using Sealed Secrets to encrypt the secret in my git repo, which is then applied by ArgoCD as part of my GitOps process to configure the cluster. As a security measure, in order for Sealed Secrets to overwrite the existing alertmanager-main secret you need to prove you own the secret by placing an annotation on it. I’m using a pre-sync hook in ArgoCD to do that via a job:


apiVersion: batch/v1
kind: Job
metadata:
  name: annotate-secret-job
  namespace: openshift-monitoring
  annotations:
    argocd.argoproj.io/hook: PreSync
spec:
  template:
    spec:
      containers:
        - image: registry.redhat.io/openshift4/ose-cli:v4.6
          command:
            - /bin/bash
            - -c
            - |
              oc annotate secret alertmanager-main sealedsecrets.bitnami.com/managed="true" --overwrite=true
          imagePullPolicy: Always
          name: annotate-secret
      dnsPolicy: ClusterFirst
      restartPolicy: OnFailure
      serviceAccount: annotate-secret-job
      serviceAccountName: annotate-secret-job
      terminationGracePeriodSeconds: 30

The complete alertmanager implementation is part of my cluster-config repo which I will be covering more in a subsequent blog post.

Empowering Developers in the OpenShift Console

For customers coming from OpenShift 3, one thing that gets noticed right away is the change in consoles. While the Administrator perspective is analogous to the cluster console in 3.11, what happened to the default console which was the bread and butter experience for developers?

The good news is that in OpenShift 4 there is a new Developer perspective which provides an alternative experience tailored specifically for developers out of a unified console. The Developer perspective includes a topology view giving an at-a-glance overview of the applications in a namespace, as well as the ability to quickly add new applications from a variety of sources such as git, container images, templates, Helm charts, operators and more.

In this blog we will examine some of these new features and discuss how you can get the most out of the capabilities available in the Developer perspective. While the OpenShift documentation does cover many of these and I will link to the docs when needed, I think it’s worthwhile to review them in a concise form in order to understand the art of the possible with respect to empowering Developers in the OpenShift console.

Many of these features can be accessed by regular users, however some do require cluster-admin rights and are intended for a cluster administrator to provision on behalf of their developer community. Cluster administrators can choose the features that make sense for their developers, providing an optimal experience based on their organization’s requirements.

Labels and Annotations in the Topology View

The topology view provides an overview of the application. It enables users to understand the composition of the application at a glance by depicting each component’s resource object (Deployment, DeploymentConfig, StatefulSet, etc), component health, the runtime used, relationships to other resources and more.

The OpenShift documentation on topology goes into great detail on this view, however it focuses on using it from a GUI perspective and only mentions at the end how it is powered. Thus I would like to cover this in more detail, since in many cases our manifests are stored and managed in git repos rather than in the console itself.

In short, how the topology view is rendered is determined by the labels and annotations on your resource objects. The labels and annotations that are available are defined in this git repo here. These annotations and labels, which are applied to resources such as Deployments, DeploymentConfigs, etc, are a mix of recommended kubernetes labels (https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels) as well as newer OpenShift recommended labels and annotations to drive the topology view.

An example of these labels and annotations can be seen in the following diagram:

Using this image as an example, we can see the client component is using the Node.js runtime and makes calls to the server component. The metadata for the client Deployment that causes it to be rendered this way is as follows:

metadata:
  name: client
  annotations:
    app.openshift.io/connects-to: server
    app.openshift.io/vcs-ref: master
    app.openshift.io/vcs-uri: 'https://github.com/gnunn-gitops/product-catalog-client'
  labels:
    app: client
    app.kubernetes.io/name: client
    app.kubernetes.io/component: frontend
    app.kubernetes.io/instance: client
    app.openshift.io/runtime: nodejs
    app.kubernetes.io/part-of: product-catalog

The key labels and annotations that are being used in this example are as follows:

  • Label app.kubernetes.io/part-of: The overall application that this component is part of. In the image above, this is the ‘product-catalog’ bounding box which encapsulates the database, client and server components.
  • Label app.kubernetes.io/name: The name of the component; in the image above it corresponds to “database”, “server” and “client”.
  • Label app.kubernetes.io/component: The role of the component, i.e. frontend, backend, database, etc.
  • Label app.kubernetes.io/instance: The instance of the component. In the simple example above I have set the instance to be the same as the name, but that’s not required. The instance label is used by the connects-to annotation to render the arrows depicting the relationships between components.
  • Label app.openshift.io/runtime: The runtime used by the component, i.e. Java, NodeJS, Quarkus, etc. The topology view uses this to render the icon. A list of icons available in OpenShift can be found in github in the OpenShift console repo in the catalog-item-icon.tsx file; note that you should select the branch that matches your OpenShift version, i.e. the “release-4.5” branch for OCP 4.5.
  • Annotation app.openshift.io/connects-to: Renders the directional line showing the relationship between components. This is set to the instance label of the component for which you want to show the relationship.
  • Annotation app.openshift.io/vcs-uri: The git repo where the source code for the application is located. By default this adds a link to the component that can be clicked to navigate to the git repo. However, if CodeReady Workspaces is installed (included for free in OpenShift), this instead creates a link that opens the code in a CRW workspace; if the git repo has a devfile.yaml in the root of the repository, the devfile will be used to create the workspace. The example image above shows the link to CRW in the bottom right corner.
  • Annotation app.openshift.io/vcs-ref: The reference to the version of source code used for the component. It can be a branch, tag or commit SHA.

A complete list of all of the labels and annotations can be found here.

Pinning Common Searches

In the Developer perspective the view is deliberately simplified from the Administrator perspective to focus specifically on the needs of the developer. However, predicting those needs is always difficult, and as a result it’s not uncommon for users to need to find additional resources.

To enable this, the Developer perspective provides the Search capability which enables you to find any resource in OpenShift quickly and easily. As per the image below, highlighted in red, there is a feature tucked away in the upper right side called “Add to Navigation”; if you click it, your search gets added to the menu bar on the left.

This is great for items you commonly look for; instead of having to repeat the search over and over, you can bookmark it in the UI. Once you click that button, the search, in this case for Persistent Volume Claims, will appear in the bar on the left as per below.

CodeReady Workspaces

CodeReady Workspaces (CRW) is included in OpenShift and provides an IDE in a browser; I typically describe it as “Visual Studio Code on the web”. While it’s easy to install, the installation is done via an operator so it does require a cluster administrator to make it available.

The real power of CRW in my opinion is the ability to have the complete stack with all of the tools and technologies needed to work effectively on the application. No longer does the developer need to spend days setting up a laptop; instead, simply create a devfile.yaml in the root of your git repository and it will configure CRW to use the stack appropriate for the application. Clicking the CodeReady Workspaces icon in the Developer topology view will open up a workspace with everything ready to go based on the devfile.yaml in the repo.
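
As a rough illustration of the idea, a minimal devfile.yaml for a Node.js component could look something like the sketch below; the image, names and memory limit are placeholders rather than what the actual repo uses:

apiVersion: 1.0.0
metadata:
  name: product-catalog-client
components:
  # Tooling container providing the runtime and build tools for the stack;
  # the image here is illustrative, pick one appropriate to your application
  - type: dockerimage
    alias: nodejs-tools
    image: quay.io/eclipse/che-nodejs10-ubi:latest
    memoryLimit: 512Mi
    mountSources: true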

In short, one click takes you from this:

To this:

In my consulting days, setting up my workstation for the application I was working on was often the bane of my existence, involving following hand-written and often outdated instructions. This would have made my life so much easier.

Now it should be noted that running an IDE in OpenShift does require additional compute resources on the cluster; however, personally I feel the benefits of this tool make it a worthwhile trade-off.

The OpenShift documentation does a great job of covering this feature so have a look there for detailed information.

Adding your own Helm Charts

In OpenShift 4.6 a new feature has been added which permits you to add your organization’s Helm Charts to the developer console through the use of the HelmChartRepository object. This enables developers using the platform to access the Helm Chart through the Developer Console and quickly instantiate the chart using a GUI driven approach.

Unfortunately, unlike OpenShift templates which can be added to the cluster globally or to specific namespaces, the HelmChartRepository object is cluster scoped only and does require a cluster administrator to use. As a result this feature is currently intended to be used by the cluster administrators to provide a curated set of helm charts for the platform user base as a whole.

An example HelmChartRepository is shown below:

apiVersion: helm.openshift.io/v1beta1
kind: HelmChartRepository
metadata:
  name: demo-helm-charts
spec:
  connectionConfig:
    url: 'https://gnunn-gitops.github.io/helm-charts'
  name: Demo Helm Charts

When this is added to an OpenShift cluster, the single chart in that repo, Product Catalog, appears in the Developer Console as per below and can be instantiated by developers as needed. The console will automatically display the latest version of that chart.

If you add a json schema (values.schema.json) to your Helm chart, as per this example, the OpenShift console can render a form in the GUI for users to fill out without having to directly deal with yaml.
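
A trivial schema along these lines (the property shown is just for illustration) would be enough for the console to render a simple form:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "replicaCount": {
      "type": "integer",
      "title": "Replica Count",
      "description": "Number of application replicas to deploy",
      "default": 1
    }
  }
}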

If you are looking for a tutorial on how to create a Helm repo, I found the one here, “Create a public Helm chart repository with GitHub Pages”, quite good.

Adding Links to the Console

In many organizations it’s quite common to have a broad eco-system surrounding your own OpenShift cluster such as wikis, enterprise registries, third-party tools, etc to support your platform users. The OpenShift console enables a cluster administrator to add additional links to the console in various parts of the user interface to make it easy for your users to discover and navigate to the additional information and tools.

The available locations for ConsoleLink include:

  • ApplicationMenu – Places the item in the application menu as per the image below. In this image we have custom ConsoleLink items for ArgoCD (GitOps tool) and Quay (Enterprise Registry).

  • HelpMenu – Places the item in the OpenShift help menu (aka the question mark). In the image below we have a ConsoleLink that takes us to the ArgoCD documentation.

  • UserMenu – Inserts the link into the User menu which is in the top right hand side of the OpenShift console.
  • NamespaceDashboard – Inserts the link into the Project dashboard for the selected namespaces.
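
As an example, a ConsoleLink CR that adds an ArgoCD entry to the application menu looks roughly like the following; the URLs and section name are placeholders for your own environment:

apiVersion: console.openshift.io/v1
kind: ConsoleLink
metadata:
  name: argocd
spec:
  href: 'https://argocd.example.com'
  location: ApplicationMenu
  text: ArgoCD
  applicationMenu:
    section: Third Party Tools
    imageURL: 'https://argocd.example.com/favicon.ico'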

A great blog entry that covered console links, as well as other console customizations, can be found on the OpenShift Blog.

Web Terminal

This one is a little more bleeding edge as it is currently in Technology Preview: a newer feature in OpenShift is the ability to integrate a web terminal into the OpenShift console. This enables developers to bring up a CLI whenever they need it without having to have the oc binary on hand. The terminal is automatically logged into OpenShift as the same user that is logged into the console.

The Web Terminal installs as an operator into the OpenShift cluster, so again a cluster admin is required to install it. Once the operator is installed, creating an instance of the web terminal is easily done through the use of a CR:

apiVersion: workspace.devfile.io/v1alpha1
kind: DevWorkspace
metadata:
  name: web-terminal
  labels:
    console.openshift.io/terminal: 'true'
  annotations:
    controller.devfile.io/restricted-access: 'true'
  namespace: openshift-operators
spec:
  routingClass: web-terminal
  started: true
  template:
    components:
      - plugin:
          id: redhat-developer/web-terminal/4.5.0
          name: web-terminal

While this is Technology Preview and not recommended for production usage, it is something to keep an eye on as it moves towards GA.

Building Scala and SBT Applications on OpenShift

In this article we will look at how to build Scala applications in OpenShift. While Scala is a Java Virtual Machine (JVM) language, applications written in Scala typically use a build tool called SBT (https://www.scala-sbt.org) which is not currently supported out of the box by OpenShift. However, the great thing about OpenShift is that it is extensible, and adding support for Scala applications is very straightforward as will be shown in this article.

As a quick level set, when building and deploying applications for OpenShift there are typically three options available as follows:

  • Source-To-Image. This is a capability available in OpenShift via builder images that enables developers to simply point a builder at a git repository and have the platform compile the application and create an image automatically.
  • Jenkins Pipeline. In this scenario we leverage Jenkins to create a pipeline for our application and use an appropriate build agent (i.e. slave) for our technology. Out of the box OpenShift includes agents for Maven and Nodejs but not Scala/SBT.
  • Build the container image outside of OpenShift and push the resulting image into the internal OpenShift registry. This is essentially treating the platform as a Container-as-a-Service (CaaS) rather than leveraging the Platform-as-a-Service (PaaS) capabilities available in OpenShift.

All of the above approaches are perfectly valid, and the choices enterprises make in this regard are typically driven by a variety of factors including the technology used, toolset and organizational requirements.

For the purposes of this article I am going to focus on building Scala applications using the second option, Jenkins pipelines, which is typically my preferred approach for production applications. Note that this option is not exclusive of the first: you can certainly use S2I in a Jenkins pipeline to build your application from source. However, I prefer using a build agent simply because there are a variety of activities (code scanning, pushing to a repository, pre-processing, etc) that need to be performed as part of the build process that can often be application specific, and a build agent gives us more flexibility in this regard.

Additionally, I typically recommend that my customers use Red Hat’s JDK image whenever possible in order to benefit from the support that is provided as part of an OpenShift (or RHEL) subscription. With a build agent this is a natural activity as part of the pipeline; with S2I you would need to use build chaining which isn’t quite as straightforward. Having said that, if you are interested in the S2I approach an example of creating an S2I enabled image is available here and it works well.

Creating the build agent

In OpenShift the provided Jenkins image includes the Jenkins kubernetes-plugin. This plugin enables Jenkins to spin up build agents in the Kubernetes environment as needed to support builds. Separate and distinct build agents are typically used to support different technologies and build tools. As mentioned previously, OpenShift includes two build agents, Maven and Node.js, but does not provide a build agent for Scala or SBT.

Fortunately creating our own build agent is quite easy to do since OpenShift provides a base image to inherit from. If using the open source version of OpenShift, OKD, you can use the base image openshift/jenkins-slave-base-centos7 whereas if using the enterprise version of OpenShift I would recommend using the rhel7 version openshift/jenkins-slave-base-rhel7.

In this example we will use the centos7 version of the image. To create the agent image we need a Dockerfile defining the image; our example uses the Dockerfile below, which is also available in the github repo sbt-slave.

FROM openshift/jenkins-slave-base-centos7
MAINTAINER Gerald Nunn <gnunn@redhat.com>
 
ENV SBT_VERSION 1.2.6
ENV SCALA_VERSION 2.12.7
ENV IVY_DIR=/var/cache/.ivy2
ENV SBT_DIR=/var/cache/.sbt
 
USER root
 
RUN INSTALL_PKGS="sbt-$SBT_VERSION" \
 && curl -s https://bintray.com/sbt/rpm/rpm > bintray-sbt-rpm.repo \
 && mv bintray-sbt-rpm.repo /etc/yum.repos.d/ \
 && yum install -y --enablerepo=centosplus $INSTALL_PKGS \
 && rpm -V $INSTALL_PKGS \
 && yum install -y https://downloads.lightbend.com/scala/$SCALA_VERSION/scala-$SCALA_VERSION.rpm \
 && yum clean all -y
 
USER 1001

Notice that this image defines environment variables for the Scala and SBT versions. While the image is built with specific versions pre-loaded, SBT will automatically use the version specified in your build if it differs.
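
SBT determines that version from project/build.properties in the application repository, for example:

# project/build.properties
sbt.version=1.2.6

To build the image, use the following command: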

docker build . -t jenkins-slave-sbt-centos7

After building the image, you can view the image as follows:

$ docker images | grep jenkins-slave-sbt-centos7
 
jenkins-slave-sbt-centos7                                          latest              d2a60298e18d        6 weeks ago         1.23GB

Next we need to make the image available to OpenShift. One option is to directly push it into your OpenShift registry; another option is to push it into an external registry, which is what we will do here by pushing the image into Docker Hub. Note you will need an account on Docker Hub in order to do this. The first step is to tag the image with the repository. My repository in Docker Hub is gnunn so that is what I am using here; your repository will be different so please change gnunn to the name of your repository.

docker tag jenkins-slave-sbt-centos7:latest gnunn/jenkins-slave-sbt-centos7:latest

After that we can now push the image, again change gnunn to the name of your repository:

docker push gnunn/jenkins-slave-sbt-centos7:latest

Building the Application in OpenShift

At this point we are almost ready to start building our Scala application. The first thing we need to do is create a project in OpenShift for the example; we can do that with the command below, calling the project scala-example.

oc new-project scala-example

Next we need to configure Jenkins to be aware of our new agent image. This can be done in a couple of different ways; the easiest but most manual way is to simply log in to Jenkins and add the new agent image to the Kubernetes plugin via the Jenkins console.

However, I’m not a fan of this approach since it is manual and requires the Jenkins configuration to be updated every time a new instance of Jenkins is deployed. A better approach is to update the configuration via a configmap which the Kubernetes plugin in Jenkins will use to set its configuration. An example configmap is shown below:

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: cicd-pipeline
    role: jenkins-slave
  name: jenkins-slaves
data:
  sbt-template: |-
    <org.csanchez.jenkins.plugins.kubernetes.PodTemplate>
      <inheritFrom></inheritFrom>
      <name>sbt</name>
      <privileged>false</privileged>
      <alwaysPullImage>false</alwaysPullImage>
      <instanceCap>2147483647</instanceCap>
      <idleMinutes>0</idleMinutes>
      <label>sbt</label>
      <serviceAccount>jenkins</serviceAccount>
      <nodeSelector></nodeSelector>
      <customWorkspaceVolumeEnabled>false</customWorkspaceVolumeEnabled>
      <workspaceVolume class="org.csanchez.jenkins.plugins.kubernetes.volumes.workspace.EmptyDirWorkspaceVolume">
        <memory>false</memory>
      </workspaceVolume>
      <volumes />
      <containers>
        <org.csanchez.jenkins.plugins.kubernetes.ContainerTemplate>
          <name>jnlp</name>
          <image>docker.io/gnunn/jenkins-slave-sbt-centos7</image>
          <privileged>false</privileged>
          <alwaysPullImage>false</alwaysPullImage>
          <workingDir>/tmp</workingDir>
          <command></command>
          <args>${computer.jnlpmac} ${computer.name}</args>
          <ttyEnabled>false</ttyEnabled>
          <resourceRequestCpu>200m</resourceRequestCpu>
          <resourceRequestMemory>512Mi</resourceRequestMemory>
          <resourceLimitCpu>2</resourceLimitCpu>
          <resourceLimitMemory>4Gi</resourceLimitMemory>
          <envVars/>
        </org.csanchez.jenkins.plugins.kubernetes.ContainerTemplate>
      </containers>
      <envVars/>
      <annotations/>
      <imagePullSecrets/>
    </org.csanchez.jenkins.plugins.kubernetes.PodTemplate>

Please note this line in the configuration:

<image>docker.io/gnunn/jenkins-slave-sbt-centos7</image>

The reference here will need to be updated for the registry and repository where you pushed the image. If you are using Docker Hub, again change gnunn to your repository. Once you have updated that line, save it into a file called jenkins-slaves.yaml. We can then add the configmap to our project in OpenShift using the following command:

oc create -f jenkins-slaves.yaml

In order to test the build process we are going to need an application. I’ve forked a simple Scala microservice application to the github repo akka-http-microservice. I’ve made some minor modifications to it, notably updating the versions of Scala and SBT as well as adding the assembly plugin to create a fat jar.

First we will create a new build config that the pipeline will use to feed the generated Scala artifact into the Red Hat JDK image. This build config will still use source-to-image but will do so using a binary build, since we will already have built the Scala JAR file using our build agent.

If you are using Red Hat’s OpenShift Container Platform, you can create the build using the existing Java S2I image that Red Hat provides:

oc new-build redhat-openjdk18-openshift:1.2 --name=scala-example --binary=true

If you are using the open source OpenShift OKD where the above image is not available, you can use the Fabric8 open source Java S2I image:

oc create -f https://raw.githubusercontent.com/gnunn1/openshift-notes/master/fabric8-s2i-java/imagestream.yaml
oc new-build fabric8-s2i-java:3.0-java8 --name=scala-example --binary=true

Next we create a new application around the build, expose it to the outside world via a route, and disable triggers since we want the pipeline to control the deployment:

oc new-app scala-example --allow-missing-imagestream-tags
oc expose dc scala-example --port=9000
oc expose svc scala-example
oc set triggers dc scala-example --containers='scala-example' --from-image='scala-example:latest' --manual=true

Note we specify --allow-missing-imagestream-tags because no images have been created at this point and thus the imagestream has no tags associated with it.

Now we need to create an OpenShift pipeline with a corresponding Jenkinsfile. Below is the pipeline BuildConfig this example will be using:

---
apiVersion: v1
kind: BuildConfig
metadata:
  annotations:
    pipeline.alpha.openshift.io/uses: '[{"name": "jenkins", "namespace": "", "kind": "DeploymentConfig"}]'
  labels:
    app: scala-example
    name: scala-example
  name: scala-example-pipeline
spec:
  triggers:
    - type: GitHub
      github:
        secret: secret101
    - type: Generic
      generic:
        secret: secret101
  runPolicy: Serial
  source:
    type: Git
    git:
      uri: 'https://github.com/gnunn1/akka-http-microservice'
      ref: master
  strategy:
    jenkinsPipelineStrategy:
      jenkinsfilePath: Jenkinsfile
    type: JenkinsPipeline

Save the pipeline as pipeline.yaml and then create it in OpenShift using the following command, a Jenkins instance will be deployed automatically when you create the pipeline:

oc create -f pipeline.yaml

Note that the pipeline above references a Jenkinsfile that is included in the source repository in git, you can view it here. This Jenkinsfile contains a simple pipeline that compiles the application, creates the image and then deploys it. For simplicity in this example we will do it all in the same project where the pipeline lives, but typically you would deploy the application into separate projects representing individual environments (DEV, QA, PROD, etc) and not in the same project as Jenkins and the pipeline.
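
The real Jenkinsfile is in the repo, but the general shape such a pipeline takes is sketched below; the stage names, jar path and commands here are illustrative rather than copied from the repo:

pipeline {
    // Run on the sbt pod template registered in the jenkins-slaves configmap
    agent { label 'sbt' }
    stages {
        stage('Compile') {
            steps {
                // Build the fat jar using the assembly plugin
                sh 'sbt assembly'
            }
        }
        stage('Build Image') {
            steps {
                // Feed the jar into the binary build created earlier; the
                // path depends on your project name and Scala version
                sh 'oc start-build scala-example --from-file=target/scala-2.12/app-assembly.jar --follow'
            }
        }
        stage('Deploy') {
            steps {
                // Roll out the new image since automatic triggers are disabled
                sh 'oc rollout latest dc/scala-example'
            }
        }
    }
}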

At this point you can now build the application: go to the Builds > Pipelines menu on the left, select scala-example-pipeline and click the Start Pipeline button.

Scala Pipeline

At this point the pipeline will start, though it may take a couple of minutes to get going and perform all of its tasks, so be patient. Once the pipeline is complete you will see the application running.

Scala Application Running

Once the pipeline is complete the application should be running and can be tested; to do so, click on the route. You will see an error because the application doesn’t expose a service at the context root; add /ip/8.8.8.8 to the end of the URL and you should see a response as per below.

Scala Application Output

Running WebLogic on OpenShift

openshift-weblogic

From time to time I have customers expressing interest in running WebLogic on OpenShift. Having worked extensively with WebLogic in my previous career as an Oracle consultant, it’s immediately obvious that there are a number of complexities in getting the WebLogic domain structure working in a dynamic environment like OpenShift or Kubernetes. To Oracle’s credit, they are working on this and have written an article “WebLogic on Kubernetes, Try It!” with an example based on the Oracle docker images.

This post will discuss the basic steps needed to get this working in OpenShift. While OpenShift is based on Kubernetes, there are some differences particularly around security that require some tweaks. Additionally, the article and associated sample has some issues associated with it that I wanted to cover.

The instructions in the article are clear for the most part; here are the items that require tweaking to work in OpenShift:

a. Before you create the docker image in the wls-12213-domain directory, you will need to edit the file container-scripts/provision-domain.py. This file creates the managed servers ahead of time to support the statefulset, however it creates the listen address as ms-X.wls-subdomain.default.svc.cluster.local. The issue here is that it hard-codes the default namespace, which means you would need to deploy this sample in the default project in OpenShift. In theory, changing it to ms-X.wls-subdomain should work, however it did not for me and I need to investigate further. For now, I changed default to weblogic and deployed the sample into a weblogic project.
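
A quick way to make that change, assuming the file path from the sample, is something like:

sed -i 's/wls-subdomain\.default\.svc\.cluster\.local/wls-subdomain.weblogic.svc.cluster.local/g' \
    container-scripts/provision-domain.py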

b. I also had an issue building the docker image with the chown failing; see the issue I opened here for a workaround.

c. When building the docker image, the sample implies you can select an admin user but you can’t; just use the default weblogic admin user.

d. The Oracle docker images must be run as the Oracle user. OpenShift by default disallows images running as specific users, therefore a service account must be created and granted the anyuid SCC:

oc create serviceaccount weblogic
oc adm policy add-scc-to-user anyuid -z weblogic

e. The wls-admin-webhook.yml and wls-stateful.yml files need to be updated to use the weblogic service account:

spec:
  containers:
  ...
  serviceAccount: weblogic
  serviceAccountName: weblogic

f. The sample exposes the services using NodePort, which should work in OpenShift; however, I ended up creating routes instead:

oc expose svc wls-admin-server
oc expose svc wls-service

g. If you are not familiar with WebLogic, to access the web console use the wls-admin-server route and add /console to the end of it. The username will be weblogic and the password whatever you selected.

Obviously the Oracle work is just a sample and a number of things would need to happen to operationalize it. The biggest, IMHO, is not having to hardcode the OpenShift namespace the image will be running in. I’m hoping to resolve this as I play around with it further.