Kubernetes 1.10: Overview of Key Innovations

    At the end of March, Kubernetes 1.10 was released. Keeping up our tradition of covering the most significant changes in each new Kubernetes release, we are publishing this review based on CHANGELOG-1.10, as well as numerous issues, pull requests and design proposals. So, what's new in K8s 1.10?



    Storage


    Mount propagation - the ability of containers to mount volumes as rslave, so that directories mounted on the host are visible inside the container (value HostToContainer), or as rshared, so that directories mounted by a container are visible on the host (value Bidirectional). Status - beta (see the documentation on the site). Not supported on Windows.
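
    A minimal sketch of what this looks like in a pod spec, assuming the beta MountPropagation feature gate is enabled (the hostPath path, image and names here are illustrative):

    apiVersion: v1
    kind: Pod
    metadata:
      name: mount-propagation-example
    spec:
      containers:
      - name: main
        image: busybox
        command: ["sleep", "3600"]
        securityContext:
          privileged: true              # Bidirectional propagation requires a privileged container
        volumeMounts:
        - name: host-dir
          mountPath: /mnt/host
          mountPropagation: Bidirectional   # or HostToContainer
      volumes:
      - name: host-dir
        hostPath:
          path: /mnt/shared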

    Added the ability to create local persistent storage (Local Persistent Storage), i.e. PersistentVolumes (PVs) can now be based not only on network volumes but also on locally attached drives. The innovation has two goals: a) improving performance (local SSDs are faster than network drives); b) enabling cheaper storage on bare-metal Kubernetes installations. This work goes hand in hand with the creation of ephemeral local storage, whose restrictions/limits (first introduced in K8s 1.8) also received improvements in this release - announced as beta and now enabled by default.
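
    As a rough illustration, a local PV might look something like this (the node name, path and size are assumptions; note the nodeAffinity field, which tells the scheduler where the disk physically lives):

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: example-local-pv
    spec:
      capacity:
        storage: 100Gi
      accessModes:
      - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      storageClassName: local-storage
      local:
        path: /mnt/disks/ssd1       # locally attached drive on the node
      nodeAffinity:                 # pins the PV to the node that owns the disk
        required:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
              - node-1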

    Also available (in beta) is topology-aware volume scheduling (Topology Aware Volume Scheduling). The idea is that the standard Kubernetes scheduler knows about (and takes into account) volume topology constraints, and its decisions are factored in when binding PersistentVolumeClaims (PVCs) to PVs. It is implemented in such a way that a pod can now request PVs that must be compatible with its other constraints: resource requirements and affinity/anti-affinity policies. At the same time, scheduling of pods that do not use constrained PVs should retain the same performance. Details are in the design proposal.
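
    In practice this is driven by the volumeBindingMode field of a StorageClass: with WaitForFirstConsumer, PVC binding is delayed until a pod using the PVC is actually scheduled. A minimal sketch (the class name is illustrative):

    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: local-storage
    provisioner: kubernetes.io/no-provisioner   # static provisioning, typical for local volumes
    volumeBindingMode: WaitForFirstConsumer     # delay binding until pod scheduling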

    Among other improvements in support for volumes / file systems:

    • improvements in Ceph RBD, with the ability to use the rbd-nbd client (based on the popular librbd library) in pkg/volume/rbd;
    • support for mounting CephFS via FUSE;
    • the AWS EBS plugin added support for block volumes (volumeMode), and block volume support also appeared in the GCE PD plugin (see the sketch after this list);
    • the ability to resize a volume even while it is mounted.
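
    A hedged sketch of what requesting a raw block volume looks like, assuming the alpha BlockVolume feature gate is enabled (names and size are illustrative):

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: block-pvc
    spec:
      accessModes:
      - ReadWriteOnce
      volumeMode: Block          # request a raw block device instead of the default Filesystem
      resources:
        requests:
          storage: 10Gi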

    Finally, additional metrics were added (and declared stable) that describe the internal state of the storage subsystem in Kubernetes; they are intended for debugging and for a deeper understanding of the cluster's state. For example, you can now find out (per volume_plugin) the total time of mount/umount and attach/detach operations, the total time of provisioning and deletion, the number of volumes in ActualStateOfWorld and DesiredStateOfWorld, the number of bound/unbound PVCs and PVs, etc. For more details, see the documentation.

    Kubelet, nodes and their management


    Kubelet got the ability to be configured through a versioned configuration file (instead of the traditional method of command-line flags), with the structure KubeletConfiguration. For Kubelet to pick up the config, it must be run with the --config flag (see the documentation for details). This approach is now the recommended one because it simplifies node deployment and configuration management. It was made possible thanks to a new API group called kubelet.config.k8s.io, which has beta status as of Kubernetes 1.10. An example configuration file for Kubelet:

    kind: KubeletConfiguration
    apiVersion: kubelet.config.k8s.io/v1beta1
    evictionHard:
        memory.available:  "200Mi"
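
    Launching Kubelet with such a file might then look like this (the path is an assumption):

    kubelet --config=/var/lib/kubelet/config.yaml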

    With the new shareProcessNamespace option in PodSpec, containers in a pod can now share a common process (PID) namespace. Previously this was not possible due to missing support in Docker, which led to an additional API that some container images have come to rely on. Now everything has been unified while preserving backward compatibility. The result of the implementation is support for three modes of PID namespace sharing in the Container Runtime Interface (CRI): per container (each container gets its own namespace), per pod (a namespace shared by the pod's containers), and per node. Readiness status is alpha.
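
    A minimal sketch of a pod with a shared PID namespace, assuming the alpha PodShareProcessNamespace feature gate is enabled (images and names are illustrative):

    apiVersion: v1
    kind: Pod
    metadata:
      name: shared-pid-example
    spec:
      shareProcessNamespace: true   # all containers in the pod see each other's processes
      containers:
      - name: app
        image: nginx
      - name: sidecar
        image: busybox
        command: ["sh", "-c", "sleep 3600"]   # can now signal the app container's processes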

    Another significant change in CRI is the introduction of support for Windows Container Configuration. Until now, only Linux containers could be configured in CRI, but the OCI (Open Container Initiative) Runtime Specification also describes the specifics of other platforms, in particular Windows. CRI now supports memory and processor limits for Windows containers (alpha).

    In addition, three Resource Management Working Group developments have reached beta status:

    1. CPU Manager (pinning of specific processor cores - more about it was written in our article about K8s 1.8);
    2. Huge Pages (the ability for pods to use 2Mi and 1Gi huge pages, which is important for applications consuming large amounts of memory - see the sketch after this list);
    3. Device Plugin (a framework for vendors that allows declaring resources in kubelet - for example, GPUs, NICs, FPGAs, InfiniBand, etc. - without having to modify the core Kubernetes code).
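
    A hedged sketch of a pod requesting 2Mi huge pages (the amounts and image are assumptions; for huge pages, requests must equal limits):

    apiVersion: v1
    kind: Pod
    metadata:
      name: hugepages-example
    spec:
      containers:
      - name: app
        image: busybox
        command: ["sleep", "3600"]
        resources:
          limits:
            hugepages-2Mi: 100Mi   # consumed from the node's pre-allocated 2Mi huge pages
            memory: 100Mi
        volumeMounts:
        - mountPath: /hugepages
          name: hugepage
      volumes:
      - name: hugepage
        emptyDir:
          medium: HugePages        # backed by hugetlbfs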

    The number of processes running in a pod can now be limited using the --pod-max-pids flag of the kubelet command. The implementation has alpha status and requires enabling the SupportPodPidsLimit feature gate.
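
    For example (the limit value here is arbitrary):

    kubelet --feature-gates=SupportPodPidsLimit=true --pod-max-pids=100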

    Since containerd 1.1 introduced native support for CRI v1alpha2, in Kubernetes 1.10 containerd 1.1 can be used directly, without the cri-containerd "intermediary" (we wrote more about it at the end of this article). CRI-O also updated its CRI version to v1alpha2. In addition, support for specifying the container's GID (in addition to UID) was added to LinuxSandboxSecurityContext and LinuxContainerSecurityContext in CRI (Container Runtime Interface) itself; the support is implemented for dockershim and has alpha status.

    Network


    The option of using CoreDNS instead of kube-dns has reached beta status. In particular, this made it possible to migrate to CoreDNS when upgrading a kubeadm-based cluster that uses kube-dns: in this case, kubeadm will generate the CoreDNS configuration (i.e. the Corefile) based on the kube-dns ConfigMap.
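
    Presumably the migration boils down to a kubeadm upgrade with the corresponding feature gate enabled, along the lines of (the exact target version and flag usage here are assumptions):

    kubeadm upgrade apply v1.10.0 --feature-gates=CoreDNS=true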

    Traditionally, the pod's /etc/resolv.conf is managed by the kubelet, and the contents of this config are generated based on pod.dnsPolicy. Kubernetes 1.10 (in beta) brings support for configuring the pod's resolv.conf. For this, a dnsParams field was added to PodSpec, which allows rewriting the existing DNS settings. Read more in the design proposal. An illustration of using dnsPolicy: Custom with dnsParams:

    # Pod spec
    apiVersion: v1
    kind: Pod
    metadata: {"namespace": "ns1", "name": "example"}
    spec:
      ...
      dnsPolicy: Custom
      dnsParams:
        nameservers: ["1.2.3.4"]
        search:
        - ns1.svc.cluster.local
        - my.dns.search.suffix
        options:
        - name: ndots
          value: 2
        - name: edns0

    An option has been added to kube-proxy that allows you to define the range of IP addresses for NodePort, i.e. to filter acceptable NodePort values with --nodeport-addresses (the default value is 0.0.0.0/0, i.e. allow everything, which corresponds to the current behavior). Implementations are provided in kube-proxy for iptables, Linux userspace, IPVS, Windows userspace and winkernel. Status - alpha.
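
    For example, to expose NodePort services only on addresses from a private range (the CIDR is an assumption):

    kube-proxy --nodeport-addresses=10.0.0.0/8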

    Authentication


    Added new authentication methods (alpha version):

    1. external client providers: in response to long-standing requests from K8s users for exec-based plugins, kubectl (client-go) implemented support for executable plugins that can obtain authentication data by executing an arbitrary command and reading its output (the GCP plugin can also be configured to call commands other than gcloud). One application is that cloud providers will be able to create their own authentication systems (instead of using standard Kubernetes mechanisms);
    2. the TokenRequest API for obtaining JWT tokens (JSON Web Tokens) bound to an audience and limited in time.

    In addition, the ability to restrict nodes' access to certain APIs (using the Node authorization mode and the NodeRestriction admission plugin), so that they are only granted permissions for a limited set of objects and related secrets, has reached stable status.
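
    A hedged sketch of enabling this on the API server (other admission plugins, which a real cluster would also need, are omitted for brevity):

    kube-apiserver --authorization-mode=Node,RBAC --enable-admission-plugins=NodeRestriction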

    CLI


    Progress has been made in refining the output shown by kubectl get and kubectl describe. The overall goal of the initiative, which received beta status in Kubernetes 1.10, is for the columns of tabular output to be produced on the server side (rather than the client), in order to improve the user experience when working with extensions. The server-side work started earlier (in K8s 1.8) was brought to beta level, and major changes were made on the client side.

    kubectl port-forward gained the ability to use a resource name to select the matching pod (and the --pod-running-timeout flag to wait until at least one pod is running), as well as support for specifying a service for port forwarding (for example: kubectl port-forward svc/myservice 8443:443).

    New abbreviations for kubectl resources: cj instead of CronJobs and crds instead of CustomResourceDefinitions. For example, the command kubectl get crds is now available.

    Other changes


    • The aggregation API, i.e. aggregation of user API servers with the main Kubernetes API, has reached stable status and is officially ready for production use.
    • Kubelet and kube-proxy can now run as native Windows services. Support for the Windows Service Control Manager (SCM) has been added, along with experimental Hyper-V isolation support for single-container pods.
    • The Persistent Volume Claim Protection (PVCProtection) feature, which "protects" against the deletion of PVCs that are actively used by pods, has been renamed Storage Protection and promoted to beta.
    • Alpha version of Azure support in cluster-autoscaler.

    Compatibility


    • The supported etcd version is 3.1.12. At the same time, etcd2 is declared deprecated as a backend; its support will be removed in the Kubernetes 1.13 release.
    • Validated Docker versions are 1.11.2 to 1.13.1 and 17.03.x (unchanged since the K8s 1.9 release).
    • The Go version is 1.9.3 (instead of 1.9.2), the minimum supported is 1.9.1.
    • CNI version is 0.6.0.
