Simplifying OpenShift MachineSet Management Using Kyverno

Using Kyverno to mutate OpenShift MachineSet resources for easier automation.

(Guest post from Red Hat Distinguished Architect, Andrew Block)

Managing infrastructure in a declarative fashion is one of the core principles that should be adopted when operating in any environment. In OpenShift, this paradigm for managing the underlying Node infrastructure is accomplished using the Machine API. This extension of the upstream Cluster API project enables the provisioning and management of instances once the OpenShift cluster finishes deploying.

While Machines are individual hosts provisioned as Nodes, cluster administrators typically interact with them through an abstraction: a MachineSet. A MachineSet represents a group of compute instances that share similar traits, such as the definition of the desired cloud provider, and that can be scaled to a desired number of instances.

While MachineSets provide a mechanism for managing Machines at scale and represent a set of purpose-built instances within an OpenShift environment, there are limitations that inhibit one from being able to fully manage them declaratively.

Challenges Surrounding Declarative MachineSets

MachineSets are used in OpenShift environments that are integrated with an infrastructure provider, such as Amazon Web Services, Microsoft Azure, or VMware vSphere. Although the provider-specific settings differ, the remainder of the MachineSet definition is the same across providers.

The following is an example of a MachineSet that could be used to represent OpenShift Infrastructure nodes:

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: <infrastructure_id>
    machine.openshift.io/cluster-api-machine-role: infra
    machine.openshift.io/cluster-api-machine-type: infra
  name: <infrastructure_id>-infra
  namespace: openshift-machine-api
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: <infrastructure_id>
      machine.openshift.io/cluster-api-machineset: <infrastructure_id>-infra
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: <infrastructure_id>
        machine.openshift.io/cluster-api-machine-role: infra
        machine.openshift.io/cluster-api-machine-type: infra
        machine.openshift.io/cluster-api-machineset: <infrastructure_id>-infra
    spec:
      metadata:
        labels:
          node-role.kubernetes.io/infra: ""
      providerSpec:
        # Provider specific implementation
        ...

While the majority of the MachineSet definition is straightforward, notice the placeholder value denoted by <infrastructure_id>. The OpenShift Infrastructure ID, also referred to as the cluster ID, is generated at cluster installation and consists of the name of the cluster followed by a randomly generated value. Since this value is distinct per cluster, it is a challenge to write a MachineSet definition that can be managed via GitOps prior to cluster creation, because the generated ID cannot be known in advance.

Several workarounds have been implemented within the community to address this challenge and range from populating GitOps repositories dynamically as the cluster is being provisioned, to a dedicated operator which dynamically updates the MachineSet after it is created. However, instead of leveraging a workaround or needing to deploy an operator just to inject a single property, other options should be considered.

Overcoming MachineSet Limitations Using Kyverno

Even though the approaches described in the prior section do work, they either require additional processes to be performed or limit the types of GitOps management styles that can be implemented. Wouldn’t it be great if the contents of a MachineSet were updated as it is created in the cluster, without having to change how MachineSet definitions are managed via GitOps or retroactively update the MachineSet after it is created? Fortunately, this challenge related to Infrastructure IDs can be overcome thanks to Kyverno!

Kyverno is a policy engine for Kubernetes that can be used to validate, mutate, generate, and clean up Kubernetes resources as well as verify container images. Similar to most policy engines for Kubernetes, Kyverno makes use of validating and mutating admission webhooks, which enable the inspection of resources before they are persisted within etcd.

So, given that Kyverno could be used to solve this challenge, what would this look like? Many of the other workarounds, including those mentioned previously, look up the cluster ID from a value stored within the cluster itself. The Infrastructure ID is stored within a Custom Resource called Infrastructure, which includes infrastructure-related details specific to the cluster. The Infrastructure ID can be retrieved by those with elevated cluster-level access by executing the following command:

$ oc get -o jsonpath='{.status.infrastructureName}{"\n"}' infrastructure cluster

With an understanding of how to obtain the required information from the cluster, let’s see how Kyverno can be used to retrieve the Infrastructure ID and inject it into the MachineSet as it is created.

Kyverno policies can make use of variables populated from the results of Kubernetes API calls. Since both the resource and the property containing the Infrastructure ID are known, the following context queries the Infrastructure resource from within the cluster and sets a variable named infraid using a JMESPath expression that selects the Infrastructure ID from the returned resource:

context:
- name: cluster
  apiCall:
    urlPath: /apis/config.openshift.io/v1/infrastructures/cluster
- name: infraid
  variable:
    jmesPath: cluster.status.infrastructureName

Notice how the jmesPath field references the cluster variable which represents the result from the API service call. Traversing through the data structure enables accessing the Infrastructure ID found within the infrastructureName property.
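To make the traversal concrete, here is an abbreviated sketch of what the Infrastructure resource returned by the API call might look like. The infrastructureName value is a made-up example, and all other status fields are omitted:

```yaml
# Abbreviated sketch of the object returned for
# /apis/config.openshift.io/v1/infrastructures/cluster
apiVersion: config.openshift.io/v1
kind: Infrastructure
metadata:
  name: cluster
status:
  # Hypothetical value for illustration; every cluster generates its own.
  infrastructureName: mycluster-x7k2p
  # remaining status fields omitted
```

With a response shaped like this, the expression cluster.status.infrastructureName would resolve to mycluster-x7k2p.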

Now that the Infrastructure ID has been assigned to a variable, the next step is to inject it into the incoming MachineSet resource. Modifications to resources in Kyverno are achieved using one or more mutation rules. Mutations can leverage either an RFC 6902 JSON Patch or a strategic merge patch. A JSON patch allows fine-grained modification of a resource, but each operation targets a fixed path, and as shown in the MachineSet example in a prior section, the Infrastructure ID can appear at many different locations within the definition. To replace every occurrence regardless of its location within a MachineSet, the Kyverno-specific replace_all JMESPath function can be applied to the serialized metadata and spec sections.

mutate:
  patchesJson6902: |-
    - op: replace
      path: /metadata
      value: {{ replace_all(to_string(request.object.metadata),'TEMPLATE', infraid) }}
    - op: replace
      path: /spec
      value: {{ replace_all(to_string(request.object.spec),'TEMPLATE', infraid) }}

The mutate rules above specify that all instances of the word “TEMPLATE” that are found within the spec and metadata properties of a MachineSet will be replaced by the infraid variable that was specified earlier.
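As an illustration of the effect, the fragment below sketches the selector portion of a MachineSet spec as submitted and as it would be expected to look after mutation. It assumes a hypothetical infraid value of mycluster-x7k2p; the real value depends on the cluster:

```yaml
# Fragment of spec as submitted to the API server
selector:
  matchLabels:
    machine.openshift.io/cluster-api-cluster: TEMPLATE
    machine.openshift.io/cluster-api-machineset: TEMPLATE-infra
---
# Same fragment after mutation, assuming infraid is "mycluster-x7k2p" (hypothetical)
selector:
  matchLabels:
    machine.openshift.io/cluster-api-cluster: mycluster-x7k2p
    machine.openshift.io/cluster-api-machineset: mycluster-x7k2p-infra
```

Because the replacement operates on the serialized text, the placeholder is substituted wherever it appears, including inside longer values such as TEMPLATE-infra.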

Putting all of the pieces together, the final ClusterPolicy resource is shown below:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: inject-infrastructurename
  annotations:
    policies.kyverno.io/title: Inject Infrastructure Name
    policies.kyverno.io/category: OpenShift
    policies.kyverno.io/severity: medium
    kyverno.io/kyverno-version: 1.10.0
    policies.kyverno.io/minversion: 1.10.0
    kyverno.io/kubernetes-version: "1.26"
    policies.kyverno.io/subject: MachineSet
    policies.kyverno.io/description: >-
      A required component of a MachineSet is the infrastructure name which is a random string
      created in a separate resource. It can be tedious or impossible to know this for each
      MachineSet created. This policy fetches the value of the infrastructure name from the
      Cluster resource and replaces all instances of TEMPLATE in a MachineSet with that name.
spec:
  schemaValidation: false
  rules:
  - name: replace-template
    match:
      any:
      - resources:
          kinds:
          - machine.openshift.io/v1beta1/MachineSet
          operations:
          - CREATE
    context:
    - name: cluster
      apiCall:
        urlPath: /apis/config.openshift.io/v1/infrastructures/cluster
    - name: infraid
      variable:
        jmesPath: cluster.status.infrastructureName
    mutate:
      patchesJson6902: |-
        - op: replace
          path: /metadata
          value: {{ replace_all(to_string(request.object.metadata),'TEMPLATE', infraid) }}
        - op: replace
          path: /spec
          value: {{ replace_all(to_string(request.object.spec),'TEMPLATE', infraid) }}

Assuming Kyverno is deployed to your OpenShift cluster, the ClusterPolicy can be added to enable the desired MachineSet functionality. All that needs to be done now is to update your existing MachineSet manifests that you have specified declaratively, such as in a GitOps repository, and replace the hard-coded Infrastructure ID with the word TEMPLATE. You are free to choose a word other than TEMPLATE to represent the value that should be replaced by the Infrastructure ID. When doing so, be sure to update the value in the ClusterPolicy and in the MachineSet definition.
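For reference, the Infrastructure node MachineSet from earlier in this post might be stored in a GitOps repository as sketched below, with TEMPLATE standing in for every occurrence of the Infrastructure ID. The providerSpec remains elided since it depends on the provider in use:

```yaml
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: TEMPLATE
    machine.openshift.io/cluster-api-machine-role: infra
    machine.openshift.io/cluster-api-machine-type: infra
  name: TEMPLATE-infra
  namespace: openshift-machine-api
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: TEMPLATE
      machine.openshift.io/cluster-api-machineset: TEMPLATE-infra
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: TEMPLATE
        machine.openshift.io/cluster-api-machine-role: infra
        machine.openshift.io/cluster-api-machine-type: infra
        machine.openshift.io/cluster-api-machineset: TEMPLATE-infra
    spec:
      metadata:
        labels:
          node-role.kubernetes.io/infra: ""
      providerSpec:
        # Provider specific implementation
        ...
```

Since the policy matches only CREATE operations, the substitution is applied when the MachineSet is first created; the manifest in Git keeps the placeholder and stays identical across clusters.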

MachineSets offer the advantage of defining and managing a set of OpenShift Machine profiles, but require that the Cluster ID represented as the Infrastructure ID be present within the definition. Thanks to the dynamic set of capabilities provided by Kyverno, managing MachineSets within OpenShift just got a whole lot easier. The ClusterPolicy shown previously is also available on Artifact Hub and the policy library for easy reference and consumption.