Friday, 17 December 2021

3 ways to collect metrics from OpenShift's built-in Prometheus

In this blog post I'll describe 3 methods of collecting metrics from the Prometheus built into OpenShift 4. This is especially useful if you run a central monitoring solution outside OpenShift and want to integrate it with the metrics collected by OpenShift's built-in Prometheus.

1. Thanos Querier

The Thanos Querier aggregates and optionally deduplicates core OpenShift Container Platform metrics and metrics for user-defined projects under a single, multi-tenant interface. Thanos Querier exposes a route that authorized clients can query using PromQL. To authorize a request, you must provide a bearer token belonging to a user or service account that has at least the cluster-monitoring-view cluster role.

Here is an example of how to create a service account with the cluster-monitoring-view role and query the Thanos Querier endpoint:

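# Create a service account and grant it read access to cluster metrics
oc project openshift-monitoring
SA=querier
oc create sa $SA
oc adm policy add-cluster-role-to-user cluster-monitoring-view -z $SA

# Fetch the token and the route host
# (newer oc clients deprecate get-token; there, use: oc create token $SA -n openshift-monitoring)
TOKEN=$(oc sa get-token $SA -n openshift-monitoring)
URL=$(oc get route thanos-querier --template='{{.spec.host}}' -n openshift-monitoring)
QUERY=node_cpu_seconds_total

# Quote the URL so the shell does not treat "?" as a glob character
curl -k -H "Authorization: Bearer $TOKEN" "https://$URL/api/v1/query?query=$QUERY"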
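
Passing the query as a raw string works for a bare metric name, but PromQL expressions containing braces, quotes, or spaces need to be URL-encoded. Below is a minimal sketch of one way to do that, assuming TOKEN and URL are set as in the example above. The function name query_thanos and the CURL override variable are hypothetical helpers introduced here for illustration; they are not part of OpenShift or Thanos.

```shell
# Hypothetical helper: run an instant PromQL query against Thanos Querier.
# Assumes TOKEN and URL are already set as in the example above.
# CURL may be overridden to inspect the arguments without sending a request.
query_thanos() {
  # -G turns the --data-urlencode payload into a URL query string,
  # so curl takes care of encoding the PromQL expression.
  ${CURL:-curl} -s -k -G \
    -H "Authorization: Bearer $TOKEN" \
    --data-urlencode "query=$1" \
    "https://$URL/api/v1/query"
}
```

It could then be called with a more complex expression, for example: query_thanos 'sum by (instance) (rate(node_cpu_seconds_total{mode!="idle"}[5m]))'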

2. Prometheus Federation

Federation allows a Prometheus server to scrape selected time series from another Prometheus server. Each Prometheus instance exposes a /federate endpoint, which can be queried through the exposed OpenShift route. The same authorization rules apply as for Thanos Querier:

URL=$(oc get route prometheus-k8s --template='{{.spec.host}}' -n openshift-monitoring)
QUERY='match[]={__name__=~"node_cpu_seconds_total|node_memory_MemAvailable_bytes"}'

curl -G --data-urlencode "$QUERY" -k -H "Authorization: Bearer $TOKEN" "https://$URL/federate"
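
In practice the /federate endpoint is usually scraped by another Prometheus server rather than by curl. The sketch below prints a scrape job that an external Prometheus could use for this. The function name render_federate_job, the job name openshift-federate, and the token file path are assumptions made for illustration, and the output is a fragment to merge into the external server's own configuration, not a complete file.

```shell
# Hypothetical helper: print a federation scrape job for an external Prometheus.
#   $1  route host (the URL variable from the example above)
#   $2  path to a file holding the bearer token (an assumed location)
render_federate_job() {
  cat <<EOF
scrape_configs:
  - job_name: openshift-federate
    scheme: https
    metrics_path: /federate
    honor_labels: true
    params:
      'match[]':
        - '{__name__=~"node_cpu_seconds_total|node_memory_MemAvailable_bytes"}'
    authorization:
      type: Bearer
      credentials_file: $2
    tls_config:
      insecure_skip_verify: true   # mirrors curl -k; a ca_file is safer in production
    static_configs:
      - targets: ['$1']
EOF
}
```

For example: render_federate_job "$URL" /etc/prometheus/openshift-token > federate-job.yml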

3. Remote write

Both methods described above require external systems to periodically pull metrics from Prometheus. In contrast, remote write lets Prometheus push metrics to remote systems.

Remote write is configured in the cluster-monitoring-config config map in the openshift-monitoring namespace:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      remoteWrite:
      - url: "https://remote-write.endpoint"
        writeRelabelConfigs:
        - sourceLabels: [__name__]
          regex: 'node_cpu_seconds_total|node_memory_MemAvailable_bytes'
          action: keep

The configuration above is fairly simple: it pushes only the two filtered metrics to the remote endpoint at the default interval of 1 minute. Many more options are available, as described in the Prometheus documentation. All of them can be set in the cluster-monitoring-config config map, but note that field names must follow the Prometheus Operator specification, which differs slightly from the Prometheus documentation.
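
To illustrate the naming difference: the Prometheus documentation writes keys in snake_case (write_relabel_configs, source_labels), while the config map above uses the camelCase field names of the Prometheus Operator (writeRelabelConfigs, sourceLabels). Here is a small sketch of that mapping. snake_to_camel is a hypothetical helper, and the mechanical conversion is only a starting point, because some Operator fields may differ in structure as well as in name, so each one is worth checking against the Operator's remote write reference.

```shell
# Hypothetical helper: convert a snake_case Prometheus key into the
# camelCase form typically used by Prometheus Operator fields.
snake_to_camel() {
  printf '%s\n' "$1" | awk -F_ '{
    out = $1
    # Capitalize the first letter of every part after the first
    for (i = 2; i <= NF; i++)
      out = out toupper(substr($i, 1, 1)) substr($i, 2)
    print out
  }'
}
```

For example, snake_to_camel write_relabel_configs prints writeRelabelConfigs, the field name used in the config map above.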