How to implement machine learning technology in your business

    According to Gartner, machine learning is at the peak of its hype cycle. Our team at DATA4 develops and deploys data analysis and machine learning solutions, and along the way we have learned the key stages and pitfalls, which I will share in this article.

    Let us walk through the stages of implementation:

    1. Problem statement

    Any technology must solve specific business problems. Describing all the applications of machine learning would take a separate article, but there are several main areas: predictive analytics (scoring, churn, choosing the best offer, related products, etc.), text analysis (online reviews, content moderation, classifying the topic of customer inquiries, etc.), speech analytics and video analytics.

    For a successful implementation, you need to determine which business KPI you are improving, how, and by what metric the result will be measured.

    2. Collection, storage and preprocessing of data

    Once the task is set, you need to build a training sample (unfortunately, most business problems are solved with supervised learning, which requires labeled examples). In our experience, building the sample is the longest stage. To shorten it, a company must have a culture of working with data.

    Besides collecting the data, you need to clean it and identify the features that affect the final result.
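
    As a rough illustration of cleaning and feature screening, here is a minimal sketch in plain Python. The churn records, the field names (`calls`, `tenure`, `churned`) and the helper functions are all hypothetical, and Pearson correlation is only one crude way to screen features: it catches linear relationships and nothing else.

```python
from math import sqrt

def clean_rows(rows, required):
    """Drop rows with missing required fields and exact duplicates, keeping order."""
    seen = set()
    cleaned = []
    for row in rows:
        if any(row.get(key) is None for key in required):
            continue  # incomplete record
        fingerprint = tuple(row[key] for key in required)
        if fingerprint in seen:
            continue  # exact duplicate of an earlier record
        seen.add(fingerprint)
        cleaned.append(row)
    return cleaned

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def rank_features(rows, features, target):
    """Sort features by absolute correlation with the target, strongest first."""
    ys = [row[target] for row in rows]
    scores = {f: abs(pearson([row[f] for row in rows], ys)) for f in features}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical churn records: support calls per month, tenure in months.
raw = [
    {"calls": 1, "tenure": 24, "churned": 0},
    {"calls": 5, "tenure": 3, "churned": 1},
    {"calls": 5, "tenure": 3, "churned": 1},      # exact duplicate
    {"calls": None, "tenure": 12, "churned": 0},  # missing value
    {"calls": 4, "tenure": 20, "churned": 1},
    {"calls": 0, "tenure": 6, "churned": 0},
]

data = clean_rows(raw, required=("calls", "tenure", "churned"))
ranking = rank_features(data, ("calls", "tenure"), "churned")
print(len(data), ranking)
```

    On this toy data the number of support calls ranks well above tenure. On real data a screening like this only suggests where to look, it does not replace domain knowledge.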

    3. Training the algorithm

    Developing the algorithmic part is the most interesting stage, but also the fastest. It usually takes from several hours to several weeks of work.
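
    One way to start this stage, assuming a binary target and a single numeric feature, is to compare a trivial baseline with a very simple model before investing in anything heavier. The sketch below does that in plain Python. The data and the function names are hypothetical, and the one-feature threshold rule stands in for whatever simple model fits the task.

```python
def majority_baseline(labels):
    """Accuracy of always predicting the most common label."""
    top = max(set(labels), key=labels.count)
    return labels.count(top) / len(labels)

def rule_accuracy(values, labels, threshold):
    """Accuracy of the rule 'predict 1 when value >= threshold'."""
    hits = sum((v >= threshold) == bool(y) for v, y in zip(values, labels))
    return hits / len(labels)

def fit_threshold(values, labels):
    """Pick the threshold with the best training accuracy (first one wins ties)."""
    best_threshold, best_accuracy = None, -1.0
    for candidate in sorted(set(values)):
        accuracy = rule_accuracy(values, labels, candidate)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = candidate, accuracy
    return best_threshold

# Hypothetical data: support calls per month (feature) and churn (label).
train_x = [0, 1, 1, 2, 3, 4, 5, 6]
train_y = [0, 0, 0, 0, 1, 0, 1, 1]
test_x = [0, 2, 3, 5, 1, 4, 2, 6]
test_y = [0, 0, 1, 1, 0, 0, 1, 1]

threshold = fit_threshold(train_x, train_y)
baseline_acc = majority_baseline(test_y)
rule_acc = rule_accuracy(test_x, test_y, threshold)
print(threshold, baseline_acc, rule_acc)  # 3 0.5 0.75
```

    The gap between the baseline and the simple rule on held-out data gives an early, rough hint of how much signal the data carries.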

    4. Building the high-level wrapper

    The solution must be understandable not only to the data analysis expert, but also to the programmer or administrator who will deploy it. If it is a high-load solution, or one with heightened security requirements, you may have to rewrite it from Python in another language.

    5. Integration

    Integration usually takes a long time because of the additional communication and approvals it requires. This stage is best carried out by the customer's in-house team.

    6. Collecting feedback and adjusting the model

    The world is constantly changing, and not every feature can be accounted for at the start of development. Collecting feedback helps retrain models in a timely manner. Ideally, the cycle restarts at this stage, but takes less time.
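
    A feedback loop can be as simple as tracking accuracy over the most recent labeled predictions and flagging the model when it drops. The sketch below shows one possible shape. The `FeedbackMonitor` class, the window size and the 0.8 threshold are illustrative assumptions, not recommended values.

```python
from collections import deque

class FeedbackMonitor:
    """Tracks accuracy over the most recent labeled predictions."""

    def __init__(self, window=100, min_accuracy=0.8):
        # deque with maxlen silently discards the oldest outcome when full
        self.outcomes = deque(maxlen=window)
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        """Store whether a prediction matched the label that arrived later."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        # Judge only once the window is full, to avoid reacting to a few samples
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.min_accuracy

monitor = FeedbackMonitor(window=5, min_accuracy=0.8)

# First batch of feedback: every prediction was right.
for predicted, actual in [(1, 1), (0, 0), (1, 1), (1, 1), (0, 0)]:
    monitor.record(predicted, actual)
healthy = monitor.needs_retraining()

# Later feedback: two misses push the window down to 3 correct out of 5.
for predicted, actual in [(1, 0), (0, 1)]:
    monitor.record(predicted, actual)
degraded = monitor.needs_retraining()

print(healthy, degraded)  # False True
```

    In practice the metric would be the one agreed with the business at the problem statement stage, and the flag would start a new pass through data collection and training.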

    Characteristics of solutions based on machine learning:

    1. Machine learning is based on statistics, so an occasional wrong prediction is normal. It is better to explain to the business customer right away which metrics are used to assess quality and what they mean (not everyone knows what the F-measure and ROC AUC are), and that feeding the model three examples and looking at the result is interesting but not statistically significant.
    2. The result is hard to predict. The data does not always contain a useful signal, and the achievable quality cannot be known accurately in advance. We usually take the data, build simple models, and use them to estimate what result is achievable. This problem does not apply to some classic tasks (face recognition, speech recognition, etc.).
    3. Machine learning is a “last mile” technology, not a silver bullet for every problem. If salespeople do not pick up the phone when customers call and do not call them back, introducing speech analytics will make very little sense.
    4. Most of the time goes into integration and into data collection and processing, not into training the algorithm (with rare exceptions).
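
    To make the point about metrics and small samples concrete, here is a minimal plain-Python sketch of the F-measure (F1), ROC AUC, and a confidence interval for accuracy. The labels and scores are made up for illustration. The Wilson score interval is one standard choice for a proportion, used here as an assumption rather than as the method the team applies.

```python
from math import sqrt

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def roc_auc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs both classes")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Made-up labels and model scores for eight cases.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.7, 0.3, 0.2, 0.6, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

f1 = f1_score(y_true, y_pred)   # 0.75
auc = roc_auc(y_true, scores)   # 0.875

# Three correct answers out of three versus 900 out of 1000.
low_small, high_small = wilson_interval(3, 3)
low_large, high_large = wilson_interval(900, 1000)
print(f1, auc, (low_small, high_small), (low_large, high_large))
```

    Three correct answers out of three leave a lower bound of roughly 0.44 on the true accuracy, while 900 out of 1000 pin it down to within a few percentage points. That is why a three-example demo says little about quality.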

    Options for working with third-party developers:

    1. Hourly payment. Suitable only for rapid prototyping and MVPs, not for solutions that require further support.
    2. Contract development. Intellectual property is transferred to the customer and support is possible, but the technical specification must be written carefully.
    3. Payment for proven results. In our experience at DATA4, this arrangement is too complicated to agree on and is rarely used in practice.

    Alternatively, you can use ready-made platforms from IBM, Microsoft, and others. In practice, though, they are expensive under constant use, ready-made tools cannot always handle a specific case, and there are restrictions on what data can be sent to them.


    Machine learning technologies increase business efficiency, but a complete solution takes more than training an algorithm: you also need to prepare the data and integrate the solution with internal systems. And be prepared for the result to depend on the quality of the training sample.
