How we were asked to compare a hedgehog with a snake

Hi, Habr! In this article, we, consultants from the analytics practice in the sales support department, look at why it matters to assess model quality correctly when solving analytical problems. In our work we often build predictive models on customer data. Customers may supply not only the description of the analytical task but also the procedure for assessing the quality of the resulting models. And sometimes the customer ends up asking us to compare a hedgehog with a snake, that is, two things that cannot be compared. This happens most often when the data arrive already split into training and test samples, because the two samples may have been collected in slightly different ways.

We ran into exactly this situation in a project where the customer wanted to check the “strength” of targeted communications.

Problem statement

The bank ran a one-time campaign in which it called some of its customers (~10 thousand) and offered them a loan product. At the end of the campaign, it collected data on the response to these calls. The bank described not only the task to be solved, but also how and on which data the model should be built, and how its quality should be checked.

What was required of us:

Build a model to predict the response to communication.
Use data on customers who did not participate in the campaign to build the model. For this purpose, the bank gave us anonymized data on all its customers except those who took part in the one-time campaign.
Use the fact of applying for the credit product offered in the campaign as the target event when building the model.

The quality of the resulting model was to be checked on the customers who participated in the campaign. That is, if the model predicts that a customer is inclined to buy the credit product and that customer responded positively to the call, the model is considered to have predicted the response correctly.

First concerns

Already when discussing the quality assessment method, we raised a concern that it was incorrect, for two reasons.

First, the target variables differ between the model-building stage and the quality-assessment stage. The model is built to predict whether a customer applies for a credit product without any communication, while its quality is checked on the task of predicting the response to a communication.

Second, customers who participated in the campaign could differ substantially from the rest of the customer base, since it is reasonable to assume that they were selected for the campaign by some criteria.

Despite these concerns, we agreed to try building a model under the current problem statement. However, we requested part of the data with the call results to use as an independent (test) sample.

Modeling

While waiting for the data with the call results, we built a model on customers who did not participate in the campaign (~200 thousand customers, about 5% of whom bought a loan product). The results were good: Gini ~0.75 on the training, validation, and test samples.

Later, the bank sent us data on part of the customers who participated in the campaign. We applied the previously built model to this data, and the results left much to be desired (Gini = 0.16).
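
For readers less familiar with the metric: in credit scoring, the Gini coefficient is usually derived from the area under the ROC curve as Gini = 2 * AUC - 1, so 0 corresponds to random ranking and 1 to perfect ranking. The article does not say how the figures were computed, so the following is only a minimal sketch in plain Python under that assumption; the function names `auc` and `gini` are illustrative.

```python
def auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney) formula, using average ranks for ties.

    labels: iterable of 0/1 target values; scores: model scores (higher = more likely 1).
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the group of tied scores pairs[i:j]
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j

    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def gini(labels, scores):
    """Gini coefficient under the assumed definition Gini = 2 * AUC - 1."""
    return 2.0 * auc(labels, scores) - 1.0
```

Computing the same function on the training population and on the campaign sample separately is what exposes a gap like 0.75 versus 0.16.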

Distributions

We started digging into the sample of customers who participated in the campaign and found that, for many variables, the distribution of the data does not match at all the distribution for customers who did not participate.

The distributions looked something like this:

The NDA does not allow us to label the axes.

Hence the explanation for the poor results. We then tried to build a model on the customers who participated in the campaign (about 5 thousand customers, response rate 8%). The result was also poor (Gini ~0.3): there was too little data to reach good quality metrics.
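
One common way to quantify this kind of mismatch for a single numeric variable is the Population Stability Index (PSI). The article does not say which check was actually used, so this is a sketch under assumptions: the function name `psi`, the choice of 10 quantile bins taken from the reference sample, and the small `eps` floor for empty bins are all illustrative.

```python
import bisect
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index of `actual` relative to `expected`.

    expected: values of one variable in the reference (training) sample.
    actual:   values of the same variable in the sample being checked.
    """
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")

    ref = sorted(expected)
    # Bin edges are quantiles of the reference sample; duplicates are merged
    edges = sorted({ref[int(len(ref) * k / n_bins)] for k in range(1, n_bins)})

    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[bisect.bisect_right(edges, v)] += 1
        # Floor each share at eps so the logarithm is defined for empty bins
        return [max(c / len(values), eps) for c in counts]

    p = shares(expected)
    q = shares(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))
```

A commonly cited rule of thumb treats PSI below 0.1 as a stable distribution and above 0.25 as a significant shift. Running such a check per variable on the training and test samples before modeling would flag the problem early.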

Problems

As a result, we put forward several hypotheses to explain the modest modeling results:

Different target variables (recall that we train the model to estimate the propensity to buy a credit product, but use it to predict the response to a communication).
The sample of customers participating in the campaign was not formed at random, so the distribution of predictors in it may differ from the distribution in the general population of all bank customers.
- The sample of customers who did not participate in the campaign includes customers who cannot apply for a loan.
- Customers who participated in the campaign have almost no credit products: only 2% have records in the loan payment history, versus 19% of customers who did not participate.
There is not enough data on the campaign results to build a model on them.

Solving the problems

The correct criteria for evaluating the result must always be defined at the very start.
- The target variables must be the same.
- The data proposed for training and the data proposed for testing must come from the same general population.
The scope of the project must be discussed in advance, and the same scope must apply to both the training and the test samples.
If there is not enough data, either change the task so that the available data suffices, or wait for new communications.

Outcome

We presented these arguments to our colleagues at the bank, and it was decided to reformulate the task.

In the new problem statement, we were asked to predict the response to a regular campaign. This time, however, we had data on earlier communications from the same campaign. The result was a successful project: the response rate more than doubled.

Conclusions

As a result, we return to the basics of modeling:

It is always necessary to understand whether what we are modeling is the same as what the customer wants from us. In this case, predicting the response to communications required data on communications.
Data must come from the same population. If the model is trained on certain patterns and meets different patterns in the test sample, there is little chance of getting a good quality metric on that sample.
