White noise draws a black square
Every analyst starts out with the hated chore of identifying distribution parameters. With experience, getting the residual scatter to check out comes to mean that one stage of a Big Data analysis is finished and you can move on. There is no longer any need to test hundreds of models against various regression equations, to search for segments with transients, or to build compositions of models, and no need to torment yourself with the doubt: "Maybe some other model fits better?"
I thought: "What if we approach this from the opposite direction and see what white noise itself can do? Can white noise produce something that our attention would match to a meaningful object from our experience?"

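Fig. White noise (file taken from the web, size 448x235).
On this question I reasoned as follows:
- What is the probability that horizontal and vertical lines of noticeable length will appear?
- If they can appear, what is the probability that they start at the same coordinate and together form a rectangular figure?
Later in the text I explain how these questions relate to Big Data analysis.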
In G. Székely's book "Paradoxes in Probability Theory and Mathematical Statistics" (p. 43) I found a reference to the Erdős-Rényi theorem, which reads as follows:
When a coin is tossed n times, a run of heads of length $\log_2 n$ is observed with probability tending to 1 as n tends to infinity.
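The statement can be probed empirically. The following is a minimal sketch, assuming a fair coin and taking the run length as log2(n) rounded down; the function names, the number of trials and the seed are choices made for this example only. Since the theorem is asymptotic, the share observed for finite n may differ noticeably from 1.

```python
import math
import random


def longest_run(bits, value=1):
    """Length of the longest run of consecutive `value` items in `bits`."""
    best = current = 0
    for b in bits:
        current = current + 1 if b == value else 0
        best = max(best, current)
    return best


def simulate(n, trials=200, seed=0):
    """Share of trials in which n fair coin tosses contain a run of heads
    of length at least floor(log2(n)). Returns (target_length, share)."""
    rng = random.Random(seed)
    target = math.floor(math.log2(n))
    hits = 0
    for _ in range(trials):
        tosses = [rng.randint(0, 1) for _ in range(n)]  # 1 = heads
        if longest_run(tosses) >= target:
            hits += 1
    return target, hits / trials


if __name__ == "__main__":
    for n in (448, 235, 10_000):
        target, share = simulate(n)
        print(f"n={n}: run of length >= {target} seen in {share:.0%} of trials")
```

Increasing `trials` gives a steadier estimate; changing `value` to 0 counts runs of tails instead.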
For our picture this means that in each of the 235 rows (each 448 pixels long), with probability tending to 1, there is a run of length

$$\log_2 448 \approx 8.81,$$

that is, rounding down to an integer, 8 black pixels in a row horizontally.
And in each of the 448 columns (each 235 pixels high), with probability tending to 1, there is a run of length

$$\log_2 235 \approx 7.88,$$

that is, rounding down to an integer, 7 black pixels in a row vertically.
From this we obtain the probability that a black rectangle of 8x7 pixels forms in the "white noise" of this picture:

where 1 is the probability of the first run of black pixels in a row, anywhere in the two-dimensional space.
I do not dispute that this probability is very small, but it is not zero.
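One rough way to estimate the order of magnitude, under assumptions made only for this sketch: take the first horizontal run of 8 black pixels as given (probability 1), and require the 6 rows directly below it to be black in the same 8 columns, with every pixel black independently with probability 1/2.

```latex
P \approx 1 \cdot \left(\frac{1}{2}\right)^{8 \cdot 6} = 2^{-48} \approx 3.6 \cdot 10^{-15}
```

This ignores the many positions at which such a rectangle could start, so it should be read as an illustration of "very small, but not zero" and not as an exact value.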
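Going further, we can concatenate all the rows into a single line of 448 × 235 = 105,280 characters. Then, by the Erdős-Rényi theorem, with probability tending to 1, there is a run of length

$$\lfloor \log_2 105280 \rfloor = \lfloor 16.68 \rfloor = 16.$$

And for a series of 1 million records:

$$\lfloor \log_2 10^6 \rfloor = \lfloor 19.93 \rfloor = 19.$$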
As you can see, the connection between the Erdős-Rényi theorem and Big Data is clear.
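The run lengths used in this argument can be recomputed in a few lines. This is a small sketch, assuming the run length is log2(n) rounded down; the helper name `expected_run_length` is introduced here for illustration. It covers one row, one column, all rows of the 448x235 picture concatenated, and a series of one million records.

```python
def expected_run_length(n):
    """Integer part of log2(n), computed exactly with integer arithmetic."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n.bit_length() - 1


if __name__ == "__main__":
    width, height = 448, 235  # picture size from the figure caption
    cases = {
        "one row": width,
        "one column": height,
        "all rows concatenated": width * height,
        "one million records": 1_000_000,
    }
    for label, n in cases.items():
        print(f"{label}: n={n}, run length {expected_run_length(n)}")
```

Using `int.bit_length` avoids floating-point rounding when n is at or near a power of two.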
Note. What follows is my own analysis of this result, since I could not find the theorem and its proof anywhere in the form in which they are presented in G. Székely's book.
It follows that the Erdős-Rényi theorem can be used as a test of data homogeneity.
It is applicable to distributions that have a first-order moment, the expectation (MX).
It can only be applied to single-channel sequential random processes.
How to apply it
Any distribution that has an expectation can be viewed as a sequence of deviations from the center: left or right, up or down. In other words, as a coin toss: heads or tails.
Accordingly, by this theorem, we should find an interval in which $\log_2 n$ consecutive values lie above or below MX (Y(x_i)).
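The reduction to a coin toss can be sketched as follows. This is a minimal illustration under stated assumptions: the sample mean stands in for the expectation MX, values equal to the mean are counted as "below", the reference run length is log2(n) rounded down, and all function names are hypothetical.

```python
import math
from statistics import mean


def sign_sequence(values):
    """Map each value to 1 if it is above the sample mean, else 0.

    The sample mean stands in for the expectation MX (an assumption of
    this sketch); values equal to the mean are counted as 'below'.
    """
    center = mean(values)
    return [1 if v > center else 0 for v in values]


def longest_runs(values):
    """Longest run above the mean and longest run at or below it."""
    best = {0: 0, 1: 0}
    current_sign, current_len = None, 0
    for s in sign_sequence(values):
        if s == current_sign:
            current_len += 1
        else:
            current_sign, current_len = s, 1
        best[s] = max(best[s], current_len)
    return {"above": best[1], "below": best[0]}


def run_report(values):
    """Compare the observed longest runs with floor(log2(n))."""
    n = len(values)
    expected = math.floor(math.log2(n))
    return {"n": n, "expected": expected, **longest_runs(values)}
```

Calling `run_report(series)` on a list of numbers returns the series length, the reference run length, and the longest observed runs on each side of the mean, which can then be compared by eye.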
Note. This is where I wanted to see the proof of the theorem, to understand whether there is only one such run (only above or only below) or two (one above and one below). By my reasoning, the symmetry of the two outcomes should produce two runs. On the other hand, judging by proofs of similar results, and given that these mathematicians worked with graphs, I suspect that they built the proof on finding a maximum, which leaves room for a proof based on minimizing an objective function. Questions also arose about what the Erdős-Rényi theorem looks like for asymmetric probabilities and for more than two outcomes.
The first practical consequence: finding only one such run in the data set under study lets us assume that all the data are homogeneous.
The second conclusion. If, processing the data according to the Erdős-Rényi theorem, we find a run that is longer than it should be, then the situation shown in the figure is likely.

For the purposes of the example, the series shown in the figure is built as a composition of two functions.
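A series of this kind is easy to imitate. The sketch below is an illustration under its own assumptions, not a reconstruction of the figure: uniform noise on [-1, 1] is combined with a level shift on a short segment, and the longest run above the sample mean is compared with log2(n) rounded down. The segment position, shift size and seed are arbitrary.

```python
import math
import random


def longest_run_above_mean(values):
    """Longest run of consecutive values strictly above the sample mean."""
    center = sum(values) / len(values)
    best = current = 0
    for v in values:
        current = current + 1 if v > center else 0
        best = max(best, current)
    return best


def make_series(n=1000, shift_start=400, shift_len=50, shift=5.0, seed=1):
    """Uniform noise on [-1, 1] with a constant shift added on one segment,
    i.e. a composition of two functions (parameters are hypothetical)."""
    rng = random.Random(seed)
    series = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for i in range(shift_start, shift_start + shift_len):
        series[i] += shift
    return series


if __name__ == "__main__":
    series = make_series()
    expected = math.floor(math.log2(len(series)))
    observed = longest_run_above_mean(series)
    print(f"reference run length {expected}, observed {observed}")
```

With these parameters every shifted value lies above the overall mean, so the observed run is at least as long as the shifted segment and far exceeds the reference length.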
The third conclusion. Suppose that, processing the data (1 million records) by the Erdős-Rényi theorem, we find not a single run of length 19 but, say, three runs of length 17. We can then assume that the data set is a composition of three functions and use the positions of these runs to determine the intervals in which transients may occur.
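Locating such runs requires their positions as well as their lengths. A possible sketch, assuming again that the sample mean stands in for MX and that values equal to the mean count as "below"; `find_runs` and its return format are hypothetical.

```python
from itertools import groupby
from statistics import mean


def find_runs(values, min_length):
    """Return (start_index, length, side) for every run of consecutive
    values on the same side of the sample mean with length >= min_length."""
    center = mean(values)
    runs = []
    position = 0
    # groupby yields maximal blocks of consecutive items with the same key
    for above, group in groupby(values, key=lambda v: v > center):
        length = sum(1 for _ in group)
        if length >= min_length:
            runs.append((position, length, "above" if above else "below"))
        position += length
    return runs
```

For a series of one million records, `find_runs(data, 17)` would list every run of 17 or more values together with its starting index, which gives candidate locations for the transition intervals.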
While working on this material, I noticed the following. All established methods of data analysis were designed for situations in which the parameters of a much larger population must be determined from a small number of field observations, for example inferring the properties of a population of 1 million or more from 100 observations. For modern tasks, where a huge database has to be decomposed, the tools developed by statistics are very laborious.
Continuation: Part 2, Part 3.