Kubernetes 1.15: Highlights Overview
The official release of Kubernetes 1.15 was scheduled for Monday. As usual, this overview walks through the release's most significant changes.
The information used to prepare this material comes from the Kubernetes enhancements tracking table, CHANGELOG-1.15, related issues and pull requests, and Kubernetes Enhancement Proposals (KEPs). Because the KubeCon conference in Shanghai takes place next week, this release had a shortened 11-week cycle (instead of 12 weeks), which nevertheless did not significantly affect the number of notable changes. So let's go!
A new API, ExecutionHook, has been introduced that allows dynamically executing user commands in a pod/container or a group of pods/containers, together with a corresponding controller (ExecutionHookController) that manages the hook lifecycle. The motivation for this feature was the desire to let users create/delete snapshots in accordance with application logic, i.e. to execute application-specific commands before and after taking a snapshot. Such hooks are expected to be useful in other situations as well: performing upgrades, debugging, updating configuration files, restarting a container, preparing for events like a database migration. Current status: alpha (promotion to beta is expected in the next release); details are in the KEP.
ephemeral-storage, which makes it possible to limit the amount of shared storage consumed by specific pods/containers, gained support for filesystem quotas. The new mechanism uses project quotas, available in XFS and ext4, to monitor resource consumption and optionally enforce limits on it. Current status: alpha; plans for future releases have not yet been specified.
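The quotas back the ephemeral-storage requests and limits that pods already declare; a minimal sketch, with hypothetical pod and container names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: quota-demo                  # hypothetical name
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    resources:
      requests:
        ephemeral-storage: "1Gi"   # usage monitored (via project quotas when the feature is enabled)
      limits:
        ephemeral-storage: "2Gi"   # exceeding the limit can lead to pod eviction
```

Previously these values were enforced by periodically scanning the filesystem; project quotas make the accounting cheaper and more accurate.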
Another new feature from sig-storage is the use of an existing PVC as a DataSource for creating a new PVC; in other words, an implementation of volume cloning. Clones should be distinguished from snapshots: each clone is a new, full-fledged volume. It is created as a copy of an existing one but then follows the lifecycle of an ordinary volume (whereas snapshots, although copies of a volume at a point in time, are not independent volumes). An illustration of the capability:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-2
  namespace: myns
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  dataSource:
    kind: PersistentVolumeClaim
    name: pvc-1
```
This creates a new standalone PV/PVC (pvc-2) containing the same data as pvc-1. Note that the new PVC must be in the same namespace as the original.
Current limitations: only dynamic provisioning is supported, and only CSI plugins (which must have the CLONE_VOLUME capability). Read more in the KEP.
The following features have been promoted to beta status (and are therefore enabled by default in Kubernetes installations):
- "Online" expansion of Persistent Volume size, i.e. without restarting the pod that uses the corresponding PVC. It first appeared (in alpha status) in K8s 1.11.
- Support for environment variables in the names of directories mounted as subPath, first introduced in K8s 1.11 and further developed in 1.14.
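The environment-variable expansion appears as the subPathExpr field in a volume mount; a minimal sketch (pod, volume, and mount names are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: subpath-demo              # hypothetical name
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "echo hello > /logs/out.txt && sleep 3600"]
    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    volumeMounts:
    - name: workdir
      mountPath: /logs
      subPathExpr: $(POD_NAME)    # each pod writes into its own subdirectory
  volumes:
  - name: workdir
    emptyDir: {}
```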
Meanwhile, the migration of the old storage plugins implemented inside the Kubernetes code base (in-tree) to the new CSI plugins has dragged on. The 1.15 release was expected to complete the migration of all cloud-provider plugins, but it was decided to keep the feature in alpha status, since it depends on APIs introduced in K8s 1.15 that are themselves only implemented in alpha (in particular, improvements in Azure support: the Azure File and Azure Disk plugins in csi-translation-lib).
Two notable innovations, both still in alpha, are available in the Kubernetes scheduler.
The first is the scheduling framework (Kubernetes Scheduling Framework), a new set of plugin APIs that extend the capabilities of the existing scheduler. Plugins are created outside the main repository (out-of-tree) but are compiled into the scheduler. The scheduler's functional core thus stays as simple and maintainable as possible, while additional features are implemented separately, without the many restrictions that the previous way of extending the scheduler (via webhooks) suffered from.
In the new framework, each pod scheduling attempt is divided into two stages:
- the scheduling cycle, where a node is selected for the pod;
- the binding cycle, where the chosen decision is applied in the cluster.
At each of these stages (together they are also called the scheduling context ), there are many extension points , at each of which framework plugins can be called.
(Life cycle for calling plugins in the Scheduling Framework.)
Only the Reserve, Unreserve, and Prebind extension points are implemented in the alpha version of the framework. Read more about this major innovation in the KEP.
The second is the Non-Preempting option for priority classes. PriorityClasses, which reached stable (GA) status in the previous Kubernetes release, affect scheduling and eviction: pods are scheduled in order of priority, and if a pod cannot be scheduled due to a lack of resources, lower-priority pods can be preempted to free up the necessary space.
A new option, Preempting, defined as a Boolean in the PriorityClass structure, means that if a pod is waiting to be scheduled and has Preempting=false, its creation will not lead to the preemption of other pods. The field is set in PodSpec during the pod admission process (similar to the PriorityClass value). Implementation details are in the KEP.
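In the API as it eventually shipped, this behavior is expressed through the preemptionPolicy field of PriorityClass (value Never) rather than a literal Preempting Boolean; a sketch, with a hypothetical class name:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-nonpreempting   # hypothetical name
value: 1000000
preemptionPolicy: Never               # pods with this class wait in the queue instead of evicting others
globalDefault: false
description: "High priority without preemption of lower-priority pods."
```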
For CustomResources, improvements have been introduced that give data stored this way (as JSON within CRDs) behavior that better matches the conventions of the Kubernetes API for "native" K8s objects:
- automatic removal of fields not specified in the OpenAPI validation schema (for details, see the KEP "Pruning for Custom Resources");
- the ability to set default values for fields in OpenAPI v3 validation schemas, which is especially important for maintaining API compatibility when adding new fields to objects (for details, see the KEP "Defaulting for Custom Resources").
Both features were originally planned for the K8s 1.12 release, but only now appear, in alpha versions.
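Both behaviors are declared in the CRD itself: pruning via preserveUnknownFields: false, defaulting via default in the OpenAPI v3 schema. A minimal sketch, with a hypothetical group and resource:

```yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: crontabs.stable.example.com   # hypothetical resource
spec:
  group: stable.example.com
  names:
    kind: CronTab
    plural: crontabs
  scope: Namespaced
  preserveUnknownFields: false        # unknown fields are pruned on write
  versions:
  - name: v1
    served: true
    storage: true
  validation:
    openAPIV3Schema:
      type: object
      properties:
        spec:
          type: object
          properties:
            replicas:
              type: integer
              default: 1              # applied when the field is omitted
```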
The changes in CRD were not limited to this:
- The Publish CRD OpenAPI feature, i.e. server-side validation of CustomResources (against an OpenAPI v3 schema) introduced in the previous Kubernetes release, reached beta and is now enabled by default;
- The webhook-based version conversion mechanism for CRD resources has also been promoted to beta.
Another interesting innovation is called Watch bookmark. Its essence boils down to a new event type in the Watch API, Bookmark: a marker indicating that all objects up to a given resourceVersion have already been processed by the watch. This mechanism reduces the load on kube-apiserver by cutting the number of events that must be reprocessed whenever a watch is restarted, as well as the number of unwanted "resource version too old" errors. In Kubernetes 1.15 the feature has alpha status; promotion to beta is expected in the next release.
```go
Added    EventType = "ADDED"
Modified EventType = "MODIFIED"
Deleted  EventType = "DELETED"
Error    EventType = "ERROR"
Bookmark EventType = "BOOKMARK"
```
(Possible event types in the Watch API.)
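Clients opt in per request with the allowWatchBookmarks=true parameter; a bookmark event then carries little more than a resourceVersion. An illustrative event (values are made up):

```yaml
type: BOOKMARK
object:
  apiVersion: v1
  kind: Pod
  metadata:
    resourceVersion: "10245"   # everything up to this version has been delivered
```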
In Admission Webhooks:
- added support for object selectors in addition to the existing namespace selectors;
- implemented the ability to register a specific version of a resource and be called when any other version of that resource is modified;
- an Options field has been added to the AdmissionReview API, reporting the options of the operation being performed.
A significant innovation in the networking part of Kubernetes is so-called finalizer protection for load balancers. Now, before a LoadBalancer's resources are deleted, a check verifies that the corresponding Service resource has been fully cleaned up. To this end, a finalizer (service.kubernetes.io/load-balancer-cleanup) is attached to every Service with type=LoadBalancer: when such a Service is deleted, the actual removal of the resource is blocked until the finalizer is removed, and the finalizer itself is not removed until cleanup of the corresponding load balancer's resources has completed. The current implementation is in alpha; details can be found in the KEP.
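With the feature enabled, the finalizer shows up in the Service metadata; an illustrative Service (name and selector are hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-lb                  # hypothetical name
  finalizers:
  - service.kubernetes.io/load-balancer-cleanup
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
```

Deleting such a Service leaves it in a Terminating state until the cloud controller finishes cleanup and removes the finalizer.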
- The NodeLocal DNS Cache plugin, introduced in Kubernetes 1.13 to improve DNS performance, has reached beta.
- Kube-proxy no longer automatically removes network rules created while it was running in other modes (this now requires explicitly launching it with the --cleanup flag).
As always, there were some nice little things in console commands for working with Kubernetes clusters:
- The switch of kubectl get to retrieving data from the server (rather than the client) for full support of extensions is declared complete (stable).
- kubectl top gained a sort option:

```shell
$ kubectl --kubeconfig=kubectl.kubeconfig top pod --sort=memory
NAME                             CPU(cores)   MEMORY(bytes)
elasticsearch-logging-v1-psc43   2m           2406Mi
hadoop-journalnode-2             13m          362Mi
hodor-v0.0.5-3204531036-fqb0q    23m          64Mi
kubernetes-admin-mongo-...       5m           44Mi
cauth-v0.0.5-2463911897-165m8    34m          10Mi
test-1440672787-kvx8h            0m           1Mi
```
- kubectl rollout restart gained support for DaemonSets and StatefulSets.
- A new command, kubeadm upgrade node, has been added for updating cluster nodes; it replaces the now-deprecated kubeadm upgrade node config and kubeadm upgrade node experimental-control-plane.
- New commands have been added: kubeadm alpha certs certificate-key (generates a random key that can then be passed to kubeadm init --experimental-upload-certs) and kubeadm alpha certs check-expiration (checks the expiration of local PKI certificates).
- The kubeadm config upload command has been deprecated, since its replacement (kubeadm init phase upload-config) has matured.
Among other notable changes in Kubernetes 1.15:
- Support for Pod Disruption Budgets (PDB) has been added for third-party CRD-based resources/controllers (for example, EtcdCluster, MySQLReplicaSet) via the Scale subresource. For now this is a beta version, which will become stable in the next release. Details are in the KEP.
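A PDB targets pods through a label selector, so covering the pods of a custom controller looks the same as for built-in workloads; a sketch with hypothetical names and labels:

```yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: etcd-cluster-pdb        # hypothetical name
spec:
  minAvailable: 2               # evictions are refused if they would drop below this
  selector:
    matchLabels:
      app: etcd                 # pods managed by a custom EtcdCluster controller
```

The Scale subresource is what lets the PDB controller learn the intended replica count from the custom resource when percentage-based budgets are used.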
- Two features for nodes/Kubelet reached beta: support for third-party device-monitoring plugins (so that all device-specific knowledge can be moved out of Kubelet, i.e. out-of-tree) and SupportNodePidsLimit (node-to-pod PID isolation).
- Support for Go Modules has been added and enabled by default for the Kubernetes code base (instead of Godep and the GOPATH mode, which is deprecated).
- Support for AWS NLB (Network Load Balancer), first introduced back in K8s 1.9, has reached beta. In particular, it gained the ability to configure access logs and TLS termination.
- Implemented the ability to configure the Azure cloud provider from Kubernetes secrets (a new option, cloudConfigType, has been added, one of whose possible values is secret). In addition, Kubelet on Azure can now run without an Azure identity (this must be explicitly enabled).
- In cluster-lifecycle, the ability to create HA clusters with kubeadm was brought to beta, and the next step (v1beta2) in reorganizing the kubeadm configuration file format was completed.
- The scheduler's metrics now include the number of pods in the pending state across its various queues, and kubelet volume metrics now include volume statistics from CSI.
- Updates in dependent software: Go 1.12.5, cri-tools 1.14.0, etcd 3.3.10 (the server version is unchanged; the client was updated). The versions of CNI, CSI, and CoreDNS are unchanged (CoreDNS was updated to 1.5.0 in one of the Kubernetes 1.15 alphas but then rolled back to 1.3.1), as are the supported Docker versions.