nabokihms July 18, 2019 at 09:27

Kubernetes Operator in Python without frameworks and SDKs

Tutorial

Go currently holds a near monopoly among the programming languages people choose for writing Kubernetes operators. There are objective reasons for this:

There is a powerful framework for developing operators in Go - Operator SDK.
Game-changing applications such as Docker and Kubernetes are written in Go. Writing your operator in Go means speaking the same language as the ecosystem.
Go applications perform well, and the language offers simple tools for working with concurrency out of the box.

NB: By the way, we have already described how to write your own operator in Go in one of our translations of foreign authors.

But what if a lack of time or, trivially, motivation keeps you from learning Go? This article shows how you can write a solid operator in one of the most popular languages, one that almost every DevOps engineer knows - Python.

Meet Copyrator - the copy operator!

As an example, consider developing a simple operator that copies a ConfigMap either when a new namespace appears or when one of two entities changes: a ConfigMap or a Secret. In practice, such an operator can be useful for mass-updating application configuration (by updating a ConfigMap) or for updating secret data, for example keys for working with a Docker Registry (when adding a Secret to a namespace).

So, what should a good operator have:

Interaction with the operator is carried out through Custom Resource Definitions (hereinafter - CRD).
The operator can be configured. For this we will use command line flags and environment variables.
The Docker image build and the Helm chart are worked out so that users can easily (literally with one command) install the operator in their Kubernetes cluster.

CRD

For the operator to know which resources to look for and where, we need to define a rule for it. Each rule will be represented as a single CRD object. What fields should this CRD have?

The type of resource we are looking for (ConfigMap or Secret).
The namespace in which the resources should be located.
A selector by which we will look for resources in the namespace.

We describe the CRD:

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: copyrators.flant.com
spec:
  group: flant.com
  versions:
  - name: v1
    served: true
    storage: true
  scope: Namespaced
  names:
    plural: copyrators
    singular: copyrator
    kind: CopyratorRule
    shortNames:
    - copyr
  validation:
    openAPIV3Schema:
      type: object
      properties:
        ruleType:
          type: string
        namespace:
          type: string
        selector:
          type: object

And immediately create a simple rule: search the namespace named default for all ConfigMaps with labels like copyrator: "true":

apiVersion: flant.com/v1
kind: CopyratorRule
metadata:
  name: main-rule
  labels:
    module: copyrator
ruleType: configmap
selector:
  copyrator: "true"
namespace: default

Done! Now we need to somehow get information about our rule. Let me say right away that we will not write requests to the cluster API server by hand. Instead, we will use the ready-made Python library kubernetes-client:

import kubernetes
from contextlib import suppress

CRD_GROUP = 'flant.com'
CRD_VERSION = 'v1'
CRD_PLURAL = 'copyrators'


def load_crd(namespace, name):
    """Fetch the rule object and return only the fields the operator needs."""
    client = kubernetes.client.ApiClient()
    custom_api = kubernetes.client.CustomObjectsApi(client)
    # A failed API call (for example, the rule does not exist) is suppressed;
    # in that case the function returns None.
    with suppress(kubernetes.client.rest.ApiException):
        crd = custom_api.get_namespaced_custom_object(
            CRD_GROUP,
            CRD_VERSION,
            namespace,
            CRD_PLURAL,
            name,
        )
        return {x: crd[x] for x in ('ruleType', 'selector', 'namespace')}

Running this code gives us the following:

{'ruleType': 'configmap', 'selector': {'copyrator': 'true'}, 'namespace': 'default'}

Excellent: we managed to get a rule for the operator. And most importantly, we did it in what is called the Kubernetes way.
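The dictionary comprehension used to pick the rule fields raises a bare KeyError if a rule omits one of them. Below is a minimal sketch of a more defensive extraction. The helper name extract_rule is hypothetical (it is not part of the operator's code), and the sketch assumes the rule fields sit at the top level of the object, as in the example rule above:

```python
REQUIRED_FIELDS = ('ruleType', 'selector', 'namespace')


def extract_rule(crd):
    """Return the rule fields from a custom object dict, or raise ValueError."""
    missing = [field for field in REQUIRED_FIELDS if field not in crd]
    if missing:
        raise ValueError('rule is missing fields: ' + ', '.join(missing))
    return {field: crd[field] for field in REQUIRED_FIELDS}


# A custom object shaped like the example rule, as a plain dict.
example = {
    'apiVersion': 'flant.com/v1',
    'kind': 'CopyratorRule',
    'metadata': {'name': 'main-rule'},
    'ruleType': 'configmap',
    'selector': {'copyrator': 'true'},
    'namespace': 'default',
}
rule = extract_rule(example)
```

With a check like this, a malformed rule produces a readable error message that names the missing fields.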

Environment variables or flags? We take everything!

We now move on to the main configuration of the operator. There are two basic approaches to configuring applications:

use command line options;
use environment variables.

Command line options allow you to read settings more flexibly, with support for data types and their validation. The Python standard library has the argparse module, which we will use. Details and examples of its capabilities are available in the official documentation.

Here is what reading command line flags looks like in our case:

from argparse import ArgumentParser
from os import getenv

parser = ArgumentParser(
    description='Copyrator - copy operator.',
    prog='copyrator'
)
parser.add_argument(
    '--namespace',
    type=str,
    default=getenv('NAMESPACE', 'default'),
    help='Operator Namespace'
)
parser.add_argument(
    '--rule-name',
    type=str,
    default=getenv('RULE_NAME', 'main-rule'),
    help='CRD Name'
)
args = parser.parse_args()
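Because the environment variables are used as the defaults of the flags, an explicit flag always wins over the environment, and the environment wins over the hard-coded value. A small sketch of that precedence follows. The build_parser helper and the plain dict passed in place of the real environment are assumptions made to keep the example testable:

```python
from argparse import ArgumentParser


def build_parser(environ):
    """Build the parser; environment values become the flag defaults."""
    parser = ArgumentParser(prog='copyrator')
    parser.add_argument(
        '--namespace',
        default=environ.get('NAMESPACE', 'default'),
    )
    parser.add_argument(
        '--rule-name',
        default=environ.get('RULE_NAME', 'main-rule'),
    )
    return parser


env = {'NAMESPACE': 'kube-system'}
# No flags: the environment variable wins over the hard-coded default.
from_env = build_parser(env).parse_args([])
# An explicit flag overrides the environment variable.
from_flag = build_parser(env).parse_args(['--namespace', 'staging'])
```

Note that argparse exposes --rule-name as the attribute rule_name, with the dash replaced by an underscore.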

On the other hand, environment variables in Kubernetes make it easy to pass service information about the pod into the container. For example, we can get the namespace in which the pod is running with the following construction:

env:
- name: NAMESPACE
  valueFrom:
    fieldRef:
      fieldPath: metadata.namespace
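If the NAMESPACE variable is not set, a pod can usually still find its namespace in the file that Kubernetes mounts together with the service account token. The sketch below shows one possible fallback chain. The detect_namespace helper is hypothetical, and the sketch assumes the service account volume is mounted at its default path:

```python
import os
import tempfile

# Default location of the namespace file mounted with the service account.
SA_NAMESPACE_FILE = '/var/run/secrets/kubernetes.io/serviceaccount/namespace'


def detect_namespace(environ, path=SA_NAMESPACE_FILE, fallback='default'):
    """Resolve the namespace: env var first, then the mounted file, then fallback."""
    value = environ.get('NAMESPACE')
    if value:
        return value
    try:
        with open(path) as handle:
            return handle.read().strip() or fallback
    except OSError:
        # The file is absent, e.g. when running outside of a cluster.
        return fallback


# Demo with a temporary file standing in for the mounted namespace file.
with tempfile.TemporaryDirectory() as tmp:
    fake_file = os.path.join(tmp, 'namespace')
    with open(fake_file, 'w') as handle:
        handle.write('kube-system\n')
    from_env = detect_namespace({'NAMESPACE': 'staging'}, path=fake_file)
    from_file = detect_namespace({}, path=fake_file)
    from_fallback = detect_namespace({}, path=os.path.join(tmp, 'missing'))
```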

Operator Logic

To separate the methods for working with ConfigMap and Secret, we will use special mappings. They tell us which methods we need in order to watch and to create an object:

LIST_TYPES_MAP = {
    'configmap': 'list_namespaced_config_map',
    'secret': 'list_namespaced_secret',
}
CREATE_TYPES_MAP = {
    'configmap': 'create_namespaced_config_map',
    'secret': 'create_namespaced_secret',
}
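These mappings are meant to be combined with getattr to pick a client method by name. The following sketch illustrates the dispatch on a stand-in class. FakeCoreV1Api and resolve_list_method are hypothetical names used only for this example, the real client class being kubernetes.client.CoreV1Api:

```python
LIST_TYPES_MAP = {
    'configmap': 'list_namespaced_config_map',
    'secret': 'list_namespaced_secret',
}


class FakeCoreV1Api:
    """Stand-in for the real API client, for illustration only."""

    def list_namespaced_config_map(self, namespace):
        return 'configmaps in ' + namespace

    def list_namespaced_secret(self, namespace):
        return 'secrets in ' + namespace


def resolve_list_method(api, rule_type):
    """Look up the bound list method that corresponds to a rule type."""
    try:
        method_name = LIST_TYPES_MAP[rule_type]
    except KeyError:
        raise ValueError('unsupported ruleType: %r' % rule_type) from None
    return getattr(api, method_name)


method = resolve_list_method(FakeCoreV1Api(), 'secret')
result = method('default')
```

Supporting another resource type then comes down to adding one entry to each mapping, as long as the client offers methods with the same calling convention.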

Next, we need to receive events from the API server. We implement it as follows:

from functools import partial


def handle(specs):
    kubernetes.config.load_incluster_config()
    v1 = kubernetes.client.CoreV1Api()
    # Get the method for watching objects of the required type
    method = getattr(v1, LIST_TYPES_MAP[specs['ruleType']])
    func = partial(method, specs['namespace'])
    w = kubernetes.watch.Watch()
    for event in w.stream(func, _request_timeout=60):
        handle_event(v1, specs, event)
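The for loop over the stream ends as soon as the stream itself ends, for example when the connection is closed. One way to keep the operator running is to open a new stream each time, which can be sketched as follows. Here watch_forever and make_stream are hypothetical names; make_stream stands for a callable that opens a fresh stream, and the demo feeds it plain lists instead of a real watch:

```python
def watch_forever(make_stream, handler, max_streams=None):
    """Consume streams from make_stream() one after another.

    max_streams limits how many streams are opened (None means no limit).
    Returns the number of streams that were opened.
    """
    opened = 0
    while max_streams is None or opened < max_streams:
        opened += 1
        for event in make_stream():
            handler(event)
    return opened


# Two fake streams with events shaped like the ones the handler receives.
batches = [
    [{'type': 'ADDED', 'object': {'metadata': {'name': 'a'}}}],
    [{'type': 'MODIFIED', 'object': {'metadata': {'name': 'a'}}}],
]
streams = iter(batches)
seen = []
opened = watch_forever(lambda: next(streams), seen.append, max_streams=2)
```

Keep in mind that a watch opened without a resource version starts by reporting the existing objects as ADDED again, so the handler should tolerate repeated events.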

After receiving the event, we proceed to the main logic of its processing:

from operator import methodcaller

# Event types we react to
ALLOWED_EVENT_TYPES = {'ADDED', 'MODIFIED'}


def handle_event(v1, specs, event):
    if event['type'] not in ALLOWED_EVENT_TYPES:
        return
    object_ = event['object']
    labels = object_['metadata'].get('labels', {})
    # Check the labels against the selector
    for key, value in specs['selector'].items():
        if labels.get(key) != value:
            return
    # Get the active namespaces
    namespaces = [
        ns.metadata.name
        for ns in v1.list_namespace().items
        if ns.status.phase == 'Active'
    ]
    for namespace in namespaces:
        # The source namespace already holds the object
        if namespace == specs['namespace']:
            continue
        # Strip the metadata and set the target namespace
        object_['metadata'] = {
            'labels': labels,
            'namespace': namespace,
            'name': object_['metadata']['name'],
        }
        # Call the method that creates the object
        try:
            methodcaller(
                CREATE_TYPES_MAP[specs['ruleType']],
                namespace,
                object_
            )(v1)
        except kubernetes.client.rest.ApiException as exc:
            # 409 Conflict: a copy already exists in this namespace
            if exc.status != 409:
                raise
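The two pure steps of the handler, matching labels against the selector and stripping the metadata, are easy to check in isolation. A sketch with hypothetical helper names (matches_selector and clean_metadata are not part of the operator's code):

```python
def matches_selector(labels, selector):
    """True if every key/value pair of selector is present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def clean_metadata(metadata, namespace):
    """Keep only name and labels, and point the object at a new namespace."""
    return {
        'name': metadata['name'],
        'labels': dict(metadata.get('labels') or {}),
        'namespace': namespace,
    }


source = {
    'name': 'app-config',
    'namespace': 'default',
    'labels': {'copyrator': 'true', 'team': 'web'},
    'resourceVersion': '12345',
    'uid': 'abc-123',
}
selected = matches_selector(source['labels'], {'copyrator': 'true'})
rejected = matches_selector(source['labels'], {'copyrator': 'false'})
cleaned = clean_metadata(source, 'staging')
```

Stripping the metadata matters because server-generated fields such as resourceVersion must not be set on an object that is being created.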

The basic logic is ready! Now all of this needs to be packed into a single Python package. We create the file setup.py and write the project's meta-information there:

from sys import version_info
from setuptools import find_packages, setup
if version_info[:2] < (3, 5):
    # version_info holds integers, so convert them before joining
    raise RuntimeError(
        'Unsupported python version %s.' % '.'.join(map(str, version_info[:3]))
    )
_NAME = 'copyrator'
setup(
    name=_NAME,
    version='0.0.1',
    packages=find_packages(),
    classifiers=[
        'Development Status :: 3 - Alpha',
        'Programming Language :: Python',
        'Programming Language :: Python :: 3',
        'Programming Language :: Python :: 3.5',
        'Programming Language :: Python :: 3.6',
        'Programming Language :: Python :: 3.7',
    ],
    author='Flant',
    author_email='maksim.nabokikh@flant.com',
    include_package_data=True,
    install_requires=[
        'kubernetes==9.0.0',
    ],
    entry_points={
        'console_scripts': [
            '{0} = {0}.cli:main'.format(_NAME),
        ]
    }
)

NB: The kubernetes client for Python has its own versioning. You can learn more about compatibility between client versions and Kubernetes versions from the compatibility matrix.

Now our project looks like this:

copyrator
├── copyrator
│   ├── cli.py # Command line handling logic
│   ├── constant.py # The constants shown above
│   ├── load_crd.py # CRD loading logic
│   └── operator.py # Main operator logic
└── setup.py # Package definition

Docker and Helm

The Dockerfile will be outrageously simple: take the base python-alpine image and install our package. We will postpone its optimization until better times:

FROM python:3.7.3-alpine3.9
ADD . /app
RUN pip3 install /app
ENTRYPOINT ["copyrator"]

The Deployment for the operator is also very simple:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Chart.Name }}
spec:
  selector:
    matchLabels:
      name: {{ .Chart.Name }}
  template:
    metadata:
      labels:
        name: {{ .Chart.Name }}
    spec:
      containers:
      - name: {{ .Chart.Name }}
        image: privaterepo.yourcompany.com/copyrator:latest
        imagePullPolicy: Always
        args: ["--rule-name", "main-rule"]
        env:
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      serviceAccountName: {{ .Chart.Name }}-acc

Finally, you need to create an appropriate role for the operator with the necessary permissions:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ .Chart.Name }}-acc
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: {{ .Chart.Name }}
rules:
  - apiGroups: [""]
    resources: ["namespaces"]
    verbs: ["get", "watch", "list"]
  - apiGroups: [""]
    resources: ["secrets", "configmaps"]
    verbs: ["*"]
  - apiGroups: ["flant.com"]
    resources: ["copyrators"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: {{ .Chart.Name }}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: {{ .Chart.Name }}
subjects:
- kind: ServiceAccount
  name: {{ .Chart.Name }}-acc
  namespace: {{ .Release.Namespace }}

Summary

So, without fear, reproach, or learning Go, we were able to build our own Kubernetes operator in Python. Of course, it still has room to grow: in the future it could process several rules, work in several threads, watch changes to its own CRD by itself...

So that you can get to know the code better, we have put it in a public repository. If you want examples of more serious operators implemented in Python, take a look at two operators for deploying mongodb (the first and the second).

PS: And if you are too lazy to deal with Kubernetes events, or you are simply more used to Bash, our colleagues have prepared a ready-made solution in the form of shell-operator (we announced it in April).