Kubeflow: New Kubernetes Machine Learning Project

    Google developers have announced a new project called Kubeflow. It simplifies machine learning on Kubernetes by providing the tools needed to scale and tune such systems. In this article we cover:

    • the Kubeflow components;
    • how to get started with the solution;
    • the prospects of the project.

    / photo Michael Hicks CC

    Two things happened in 2017. First, Kubernetes established itself as the standard for working with container clusters. This is confirmed by a 2017 Portworx survey of 490 IT professionals from various industries: Kubernetes is used as a container orchestration tool more often than Docker Swarm, Amazon ECS, or Azure Container Service. Second, machine learning, according to Gartner, reached the peak of its popularity.

    These two factors prompted Google to create Kubeflow, an open source project that simplifies working with machine learning in Kubernetes and inherits all the advantages of this orchestration tool: the ability to deploy on a variety of infrastructures (from laptops to production clusters), manage loosely coupled microservices, and scale on demand.

    Kubeflow Components

    The project code is stored in the GitHub repository. There you will find the following components:

    • JupyterHub, a server for creating and managing interactive Jupyter Notebook environments. Using JupyterHub, you can share notebook files that store code, images, comments, formulas, and diagrams together.
    • A TensorFlow Custom Resource Definition (CRD), which can be configured to use CPUs or GPUs and adjusted to the cluster size.
    • A container for TensorFlow Serving, a flexible system for deploying machine learning models in a production environment. The component integrates with TensorFlow models out of the box, but is also suitable for other models and data.

    Container Solutions software developer Philip Winder notes that Kubeflow is a hybrid of JupyterHub and TensorFlow. In it, TensorFlow serves as a general-purpose graph computation engine that lets programmers abstract away from the hardware and use the same code on both CPUs and GPUs. That is why the same model can be deployed both on a laptop and in a cloud cluster.

    Getting started with Kubeflow

    For a quick start you will need:

    • ksonnet version 0.8.0 or later;
    • Kubernetes version 1.8 (you can find a guide on how to configure it on our corporate blog).
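    The version requirements above can be checked with a small shell helper before installing anything. This is a minimal sketch and not part of Kubeflow: the function names version_ge and check_prereqs are hypothetical, the versions are passed in by hand rather than parsed from the tools' output, and treating Kubernetes 1.8 as a minimum (rather than an exact version) is an assumption.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Relies on `sort -V` (version sort), as provided by GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Versions named in the article; the Kubernetes value is treated as a minimum (assumption).
MIN_KSONNET="0.8.0"
MIN_KUBERNETES="1.8"

# check_prereqs KSONNET_VERSION KUBERNETES_VERSION
# Prints a message for each tool that is too old and returns non-zero if any is.
check_prereqs() {
  ks_version="$1"
  k8s_version="$2"
  status=0
  if ! version_ge "$ks_version" "$MIN_KSONNET"; then
    echo "ksonnet $ks_version is older than $MIN_KSONNET"
    status=1
  fi
  if ! version_ge "$k8s_version" "$MIN_KUBERNETES"; then
    echo "Kubernetes $k8s_version is older than $MIN_KUBERNETES"
    status=1
  fi
  return "$status"
}
```

    For example, `check_prereqs 0.8.0 1.9.2 && echo "prerequisites look fine"` would print the confirmation, while an older ksonnet version would produce a warning instead.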

    To start working with Kubeflow, you need to run the following commands:

    # Initialize a ksonnet app (the app name is an example value)
    APP_NAME=my-kubeflow
    ks init ${APP_NAME}
    cd ${APP_NAME}
    # Install the Kubeflow components
    ks registry add kubeflow github.com/google/kubeflow/tree/master/kubeflow
    ks pkg install kubeflow/core
    ks pkg install kubeflow/tf-serving
    ks pkg install kubeflow/tf-job
    # Deploy Kubeflow (the namespace is an example value)
    NAMESPACE=default
    ks generate core kubeflow-core --name=kubeflow-core --namespace=${NAMESPACE}
    ks apply default -c kubeflow-core

    These commands set up JupyterHub and a custom resource for running TensorFlow training jobs. In addition, the ksonnet packages provide prototypes for configuring TensorFlow jobs and deploying TensorFlow models.
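    One way to confirm that the deployment came up is to look at pod status in the target namespace. The sketch below is not taken from the Kubeflow documentation: the function name kubeflow_pods_ready is hypothetical, and it only assumes that `kubectl get pods --no-headers` prints the pod status in the third column.

```shell
#!/bin/sh
# kubeflow_pods_ready NAMESPACE
# Succeeds only if the namespace has at least one pod and every pod
# reports a status of Running or Completed.
kubeflow_pods_ready() {
  ns="${1:-default}"
  kubectl get pods --namespace "$ns" --no-headers 2>/dev/null |
    awk '$3 != "Running" && $3 != "Completed" { bad++ }
         END { exit (NR == 0 || bad > 0) }'
}
```

    A typical use would be `kubeflow_pods_ready "${NAMESPACE}" && echo "Kubeflow pods are running"`, repeated until the pods have finished starting.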

    Detailed instructions for using Kubeflow can be found in the official user guide. The developers have also published their own instructions, and Kubeflow can be tried out directly in the browser.

    By the way, Michael Hausenblas, a developer at Red Hat and co-author of the book Kubernetes Cookbook, has created a site to help those who work with machine learning in Kubernetes. There you can find an overview of basic tools and tutorials, including ones for Kubeflow.

    What's next

    The Kubeflow project is already supported by many industry leaders: CaiCloud, Red Hat, Canonical, Weaveworks, Container Solutions, and others.

    Developers David Aronchick and Jeremy Lewi, who work on Kubeflow at Google, say this is just the beginning. In the future, the team plans to attract more partners, popularize the idea, and improve the project. You can follow the development of Kubeflow in the Slack channel, through the email newsletter, and on Twitter.
