"Mom, he counted me!", or where pedestrian traffic analysis comes from
Recently there was news that a program had been launched in the US in which billboards collect data on passing pedestrians in order to determine their target audience. A discussion immediately began about the ethics of collecting GSM data and about which private data would leak. I would like to step outside that discussion and say a little about the history of the question, and about how information on the street crowd, gathered one way or another, can be used for the benefit of the city and its inhabitants.
Let's start from the very beginning. Methods for analyzing pedestrian traffic appeared in the last century. Instead of GSM tower data, they relied on the free labor of students who, as part of their term papers, filled thousands of squared, lined, and plain A4 sheets with notes. The trouble is that people are all different, and everyone on the street is busy with private, obscure affairs of their own. Such disparate information is inconvenient to process. Where do you look for significant patterns, and will they be the same from case to case?
How do you unify data on an unorganized mass of people? The easiest option is to discard the very “private data” that is so dear to its owners. For example, you can use the methodology of the Soviet scientist A. V. Krasheninnikov: focus on the density of people in the territory (horizontal axis) and the intensity of their movement (vertical axis). The result is a “diagram of environmental behavior” that has a place for any kind of human activity:
These four small squares, for example, are unrelated to one another. Real situations produce more complex and recognizable pictures:
Surprisingly, even with only two parameters you can make a “portrait” of a place and understand what people want from it, whether the environment satisfies them, and what is worth changing. The shape and size of a space and any obstacles to visibility or passage affect the diagram very strongly: different places attract different people and stimulate different activities. Where possible, such sketches are used when renovating old quarters, and reference spatial patterns derived from them are used when designing new ones.
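To make the two-parameter diagram described above more concrete, here is a minimal Python sketch that sorts observations into density and intensity cells and counts them. The bin edges, the units (people per square metre, metres per second), and the function names are assumptions made for illustration; the article does not give the scales Krasheninnikov actually used.

```python
from collections import Counter

# Hypothetical bin edges; the real scales of the methodology are not given here.
DENSITY_EDGES = [0.05, 0.2, 0.5]    # people per square metre (assumed unit)
INTENSITY_EDGES = [0.3, 1.0, 2.0]   # metres per second (assumed unit)


def bin_index(value, edges):
    """Return the index of the first bin whose upper edge exceeds value."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)  # value lies beyond the last edge


def behavior_diagram(observations):
    """Count observations per (density bin, intensity bin) cell."""
    cells = Counter()
    for density, intensity in observations:
        cell = (bin_index(density, DENSITY_EDGES),
                bin_index(intensity, INTENSITY_EDGES))
        cells[cell] += 1
    return cells


def render(cells):
    """Text grid: density grows to the right, intensity grows upward."""
    rows = []
    for y in range(len(INTENSITY_EDGES), -1, -1):
        row = " ".join(f"{cells.get((x, y), 0):3d}"
                       for x in range(len(DENSITY_EDGES) + 1))
        rows.append(row)
    return "\n".join(rows)


# Made-up sample: two people walking through, three standing in a dense group.
observations = [(0.02, 1.4), (0.03, 1.3), (0.4, 0.1), (0.45, 0.0), (0.6, 0.2)]
diagram = behavior_diagram(observations)
print(render(diagram))
```

Even with this toy sample, the walkers end up in the sparse, fast corner of the grid and the standing group in the dense, slow one, which is the kind of separation the diagram relies on.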
At the current level of information processing, you can easily add two more parameters: gender and age. We encode gender with the shape of the mark on the diagram, and age with its color. A portrait of an ordinary courtyard might look something like this:
Most likely, the result of “surveillance” by billboards will be roughly such “portraits” of the urban environment around them: a kind of quickly readable code, "nothing personal", that is easy to process and to search for patterns. And unlike the ubiquitous QR codes, this one can be read with the naked eye.
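The gender-as-shape, age-as-color encoding mentioned above can be sketched in a few lines of Python. The marker symbols, age bands, and colors below are hypothetical, chosen only to show the idea; the article does not list the actual legend.

```python
# Hypothetical legend; the actual shapes, bands, and colors are not given.
MARKERS = {"f": "o", "m": "s"}  # circle / square
AGE_COLORS = [(12, "green"), (25, "blue"), (60, "orange")]  # (upper age bound, color)
ELDERLY_COLOR = "gray"


def glyph(gender, age):
    """Return (marker, color) for one anonymous person on the diagram."""
    marker = MARKERS[gender]
    for upper_bound, color in AGE_COLORS:
        if age < upper_bound:
            return marker, color
    return marker, ELDERLY_COLOR


print(glyph("f", 8))
print(glyph("m", 70))
```

Nothing in the resulting pair identifies a person: only the shape and the color band reach the diagram.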
If you add a fifth parameter, time, you can see that the same places attract different people at different hours of the day, days of the week, and seasons. This is the subject of rhythm analysis, a promising field of science. There are examples of its practical use: Santa Cruz, California, introduced a program that plans routes for patrol cars based on street crime statistics, taking into account the day of the week, the time of day, football matches on TV, and so on. There are also examples built on cellular data. In any case, crime forecasting is now a very popular area.
Santa Cruz Crime Forecast Map.
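As a rough illustration of adding time as a parameter, the sketch below counts sightings at one place per weekday and hour. It is not the Santa Cruz system or any published rhythm analysis method: the input format (ISO 8601 timestamp strings) and the function names are assumptions for the example.

```python
from collections import defaultdict
from datetime import datetime


def rhythm_profile(sightings):
    """Count sightings per (weekday, hour) slot; Monday is weekday 0."""
    profile = defaultdict(int)
    for stamp in sightings:
        moment = datetime.fromisoformat(stamp)
        profile[(moment.weekday(), moment.hour)] += 1
    return dict(profile)


def busiest_slot(profile):
    """Return the (weekday, hour) slot with the most sightings."""
    return max(profile, key=profile.get)


# Made-up sightings at a single place.
sightings = [
    "2024-01-01T08:15:00",  # Monday morning
    "2024-01-01T08:40:00",  # Monday morning
    "2024-01-06T15:05:00",  # Saturday afternoon
]
profile = rhythm_profile(sightings)
print(profile)
print(busiest_slot(profile))
```

Comparing such profiles between places, or between seasons for the same place, is one simple way to see that the same spot works differently at different times.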
It is interesting that one of the original applications of Krasheninnikov’s methodology was precisely to improve the crime situation in residential areas: the algorithm he developed made it possible to find places that were attractive to antisocial citizens and to “recode” them. However, in an era when computers were large and programs were small, the program he proposed was never written. The analysis can be done by hand, but it is long and tedious. In addition, the data that can be read off a map is often not enough: people readily “repurpose” uncomfortable spaces, using them for something other than intended, and finding that out takes long-term direct observation. But surely there is no need to keep tormenting poor students? At the current stage of technology development, it has become possible to automate this part of the work, for example,
A manually constructed map of the zones of social control in one large quarter, on a 25x25 m grid. A beauty you could hang on the wall. Piet Mondrian, the classic of abstract art, would approve.
If GSM tower data is available, you can follow dynamic “tracks” rather than static “slices”, and the results become more interesting. For example, an IBM Research laboratory in Dublin wrote an algorithm for analyzing the movements of public transport passengers. It was tested on Abidjan, a city of 4.5 million, and made it possible to improve the transport situation, reducing waiting and travel time by an average of 10% for all residents. The data, covering December 2011 to April 2012, was collected and provided for scientific research by the operator Orange. The database contains 2.5 billion records and has been cleared of any personal information.
Top figure: waiting time at stops. Bottom figure: route congestion.
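The step from static “slices” to dynamic “tracks” can be sketched as follows. This is not the IBM Research algorithm, and the record layout (anonymous id, timestamp, tower id) is an assumption rather than the actual schema of the Orange dataset. The sketch only orders each anonymous user's records by time and counts moves between consecutive towers.

```python
from collections import Counter, defaultdict


def tower_flows(records):
    """Count moves between towers.

    records: iterable of (anonymous_id, timestamp, tower_id) tuples
    (hypothetical layout for this example).
    """
    by_user = defaultdict(list)
    for user, timestamp, tower in records:
        by_user[user].append((timestamp, tower))

    flows = Counter()
    for events in by_user.values():
        events.sort()  # chronological order within one user
        for (_, origin), (_, destination) in zip(events, events[1:]):
            if origin != destination:  # ignore repeated pings from the same tower
                flows[(origin, destination)] += 1
    return flows


# Made-up records; timestamps are plain integers for brevity.
records = [
    ("a", 1, "T1"), ("a", 5, "T2"), ("a", 3, "T1"),
    ("b", 2, "T1"), ("b", 4, "T2"), ("b", 9, "T3"),
]
flows = tower_flows(records)
print(flows.most_common(1))
```

The user identifier is needed only to stitch records into a track; the output contains nothing but tower pairs and counts.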
In general, the devil is not as black as he is painted: total surveillance really does have significant positive results. Of course, advertising companies are now bidding for the role of world evil and are already approaching the critical level described by E. Griffith in “Listen, Listen,” H. Kuttner in “The Day Does Not Count,” F. Pohl in “Merchants of Venus,” or R. Russell in “Room”. Of course, geomarketing is not going to disappear from our lives, but there is a chance that the data collected by billboards will become public, as happened with Abidjan. Then anyone who comes up with a new analysis algorithm will be able to try it out, learn something new about the face of the city, and even change its expression for the better.