Wednesday, 15 March 2023

Sustainable Computing in OpenShift

As this blog explains, sustainable computing means consuming computing resources in a way that has a net-zero impact on the environment. It is a broad concept that covers energy, ecosystems, pollution and natural resources.

Can we do sustainable computing in OpenShift?

Yes, we can! 

Meet the Kepler project. Kepler exposes a variety of metrics about the energy consumption of Kubernetes components such as Pods and Nodes. 

In this blog I'll describe my initial experience with deploying and using Kepler on top of OpenShift clusters.

I installed Kepler using its Helm chart. The project is also actively working on the Kepler Operator, which will most likely become the preferred installation method on OpenShift sooner or later.

$ git clone https://github.com/sustainable-computing-io/kepler-helm-chart
$ cd kepler-helm-chart/

At this point it makes sense to review values.yaml and adjust the configuration to your needs. I made some minor changes, which you can review here.

$ helm install kepler . --values values.yaml --create-namespace --namespace kepler

Next, grant the kepler service account the necessary SCC (security context constraints) permissions and assign that service account to the kepler-exporter daemon set. Both commands must be run with a cluster-admin account.

$ oc adm policy add-scc-to-user privileged -z kepler -n kepler

$ oc patch ds/kepler-exporter -n kepler --patch '{"spec":{"template":{"spec":{"serviceAccountName": "kepler"}}}}'

Optionally, you can create a dedicated SCC using this example and add it to the kepler service account in the same way as above.

Now wait until the kepler-exporter pods are running on each node, as defined by the daemon set.

$ oc get pods -n kepler
NAME                    READY   STATUS    RESTARTS   AGE
kepler-exporter-2k5cx   1/1     Running   0          14h
kepler-exporter-8ctd5   1/1     Running   0          17h
kepler-exporter-cqq9d   1/1     Running   0          17h

By default, the kepler-exporter pods expose Prometheus metrics at the /metrics URI. You can learn more about Kepler metrics here. On OpenShift, before Prometheus can scrape these metrics you must first enable user workload monitoring as per the documentation. Next, configure a ServiceMonitor in the kepler project. Just remember to put the kepler project name at the bottom of the ServiceMonitor definition. Once that is done, you can query the metrics using PromQL.
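As a rough sketch of the two objects involved, the manifests could look like the ones below. The ConfigMap follows the documented OpenShift way of enabling user workload monitoring. The ServiceMonitor is only an illustration: its name, the port name `http`, the label selector and the 30s interval are assumptions, so check them against the Service that your version of the chart actually creates before applying anything. If a cluster-monitoring-config ConfigMap already exists in your cluster, merge the setting into it rather than replacing it.

```yaml
# Enables monitoring for user-defined projects (requires cluster-admin).
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
---
# Illustrative ServiceMonitor; names, labels and port are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kepler-exporter          # hypothetical name
  namespace: kepler
spec:
  endpoints:
  - port: http                   # assumed port name on the Kepler Service
    path: /metrics
    interval: 30s                # assumed scrape interval
  selector:
    matchLabels:
      app.kubernetes.io/name: kepler   # assumed Service label
  namespaceSelector:
    matchNames:
    - kepler                     # the kepler project name goes here
```

Both objects can be applied with `oc apply -f <file>`.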

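For example, this query shows the top 10 energy consumers in your cluster. The metric is a cumulative counter, so the values are running totals rather than the current power draw.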

topk(10, kepler_container_joules_total)

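The returned values are in joules, a unit of energy, which can be converted to power in watts. Since 1 watt = 1 joule per second, you need the per-second rate of the counter. PromQL's rate() function returns the average per-second increase over the selected time window, while irate(), used in the query below, calculates the per-second increase from the last two samples in the window and so reacts faster to changes. Therefore, to get the container power consumption in watts you can use the following query: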

sum by (pod_name, container_name, container_namespace, node) (irate(kepler_container_joules_total{}[1m]))
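To make the joules-to-watts conversion concrete, here is a minimal shell sketch of the same arithmetic done by hand on two counter readings. The `watts` helper and the sample readings are hypothetical and are not part of Kepler or Prometheus. Unlike rate() and irate(), this sketch does not handle counter resets.

```shell
# watts FIRST_JOULES LAST_JOULES SECONDS
# Prints the average power in watts between two readings of a joule counter.
watts() {
  awk -v first="$1" -v last="$2" -v secs="$3" \
    'BEGIN { printf "%.2f\n", (last - first) / secs }'
}

# Hypothetical readings: 1200 J, then 1500 J sixty seconds later.
# (1500 - 1200) J / 60 s = 5 W
watts 1200 1500 60   # prints 5.00
```

This is the calculation irate() performs on the last two samples of each series before the query sums the results per pod, container, namespace and node.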
