Thoughts aloud or “what stops progress?”
I am not a very smart person. I have no achievements in the field of web design or in the field of marketing. Even my work is not very original. I am sure that most of you are talented people who have achieved much more in life than your humble servant. But for some reason, even I, a simple resident of this planet, sometimes see things as they could be.
“Why haven't they done it yet?” I ask myself. I have no answer ...
I’m not flattering myself: I think many of you have already thought of what I’m about to say. The ideas are simple. But why no one has implemented them yet is not clear.
For example, I think many of you have seen a video in the category “The Future is Here,” in which an actor points a multi-function PDA at a building and receives information about that building, a 3D model, and more. At the time I thought: these guys know how to look for ideas. Indeed, few people could come up with such a service. And it really will become very convenient and popular. I intentionally write "will become", and not "may become." It is obvious.
But this service has not yet been implemented, although the implementation is not as complicated as it seems at first glance.
Let's look at what we need in order to translate this into reality.
First: we need the ability to recognize a building from the phone. This stage seems to be the most difficult, but given the current state of technology and a little common sense, we will refute that. The main thing is not to be afraid and to think in stages.
Let's try to imagine the structure of what we saw as if it worked:
1) Reading data (photo, streaming video).
2) Sending data to the server for processing.
3) The processing itself.
4) Receiving data from the server.
We all know that in this century of marketing tricks, almost every mobile phone has a camera, as well as access to the Internet. Many cities are covered by 3G, or even WiMAX. Almost everywhere there is access to the Internet via GPRS or EDGE. Combining these two facts, we can conclude that sending a photo that is not too “heavy” to the server will work almost everywhere. And under certain conditions you can even send streaming video. Or, instead of streaming video, a sequence of images.
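To put rough numbers on this, here is a back-of-the-envelope sketch. The uplink speeds and the photo size are my own illustrative assumptions, not measurements; real throughput depends heavily on the network and the device.

```python
# Rough upload-time estimate for one photo over different mobile links.
# All numbers below are illustrative assumptions, not measurements.
ASSUMED_UPLINK_KBIT_S = {
    "GPRS": 20,
    "EDGE": 100,
    "3G": 300,
}


def upload_seconds(photo_kilobytes: float, uplink_kbit_s: float) -> float:
    """Time to send a photo, ignoring latency and protocol overhead."""
    return photo_kilobytes * 8 / uplink_kbit_s


if __name__ == "__main__":
    photo_kb = 100  # assumed size of a downscaled JPEG
    for link, speed in ASSUMED_UPLINK_KBIT_S.items():
        print(f"{link}: about {upload_seconds(photo_kb, speed):.0f} s")
```

Under these assumptions a single downscaled photo goes through in well under a minute even on the slowest link, while continuous video would clearly need one of the faster networks.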
Suppose the first and second points of our very simple plan are fulfilled: we can transfer information to the server. But why does the server need the photo? I think the sharpest of you understood this from the very beginning and I am only distracting them with my rambling, but bear with me a little.
All of you know Google, a giant of the IT industry. Many of you use its search engine as your main one, attracted by its rich functionality and convenient search, as well as by the continuous development of the project and the steady addition of new features. And this is awesome.
So. Google has a great service called Similar Images. I think it is familiar to many of you, and you have even used it repeatedly. For the rest I’ll explain: this service looks for images similar to a given one. You can read about the service in more detail on its own page. For now, though, let me get back to the subject.
Let's move on to point three. Suppose we send photos, and then what? We can photograph a building from various positions and upload the photos to the server, where each photo is assigned a label, for example “The Leaning Tower of Pisa”. Simple enough: each building has its own label. You may ask: “What is the use of this? After all, it is impossible to take pictures from every angle.” And you will be right, it really is impossible. But we already know that various image recognition mechanisms exist. With a recognition engine and a large number of labeled photos, we can recognize a new photograph of a building, taken with any device, as a photograph of that building. What remains is the question of filling the database.
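A minimal sketch of the labeling idea, assuming each photo has already been reduced to a short feature vector. The three-number vectors, the second label and the function names are made up for illustration; this is a plain nearest-neighbor lookup, not how Google's service or any real recognition engine works.

```python
import math

# Toy reference index: (feature vector, label). The vectors are invented
# stand-ins for whatever a real recognition engine would extract.
REFERENCE_PHOTOS = [
    ((0.9, 0.1, 0.2), "The Leaning Tower of Pisa"),
    ((0.8, 0.2, 0.1), "The Leaning Tower of Pisa"),
    ((0.1, 0.9, 0.3), "Some Other Building"),  # hypothetical second label
    ((0.2, 0.8, 0.4), "Some Other Building"),
]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recognize(query_features):
    """Return (label, similarity) of the most similar reference photo."""
    vector, label = max(
        REFERENCE_PHOTOS,
        key=lambda ref: cosine_similarity(query_features, ref[0]),
    )
    return label, cosine_similarity(query_features, vector)


if __name__ == "__main__":
    # A new photo, taken from an angle that is not in the index.
    print(recognize((0.85, 0.15, 0.15)))
```

The point of the sketch is that the query photo does not have to match any stored photo exactly: it only has to be closer to the photos of the right building than to the photos of any other one.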
Filling a database of buildings (let it be only buildings, for starters) seems like a very large and complex task in itself. After all, you would have to hire a huge number of operators to populate the database where information about the objects is stored.
Indeed, it is very difficult. And expensive. Especially if you forget that quite a large number of people live on our planet who would be ready to help with this project for free if they saw its prospects.
Tell me honestly, would you photograph a monument in your city, or your own home? After all, it is neither difficult nor expensive to take two or three photos and send them to the server with minimal comments. And if every tenth of the billions of Internet users takes two or three photographs of some architectural monument, or just of their own house, we will have a fairly large collection of tagged images. Not bad, it seems.
Naturally, at the beginning the organizer will have to add a lot of especially large and popular buildings to spark the enthusiasm of ordinary users, but then ordinary people will come into action!
The final scheme looks like this: a person points the camera at a building and takes a photo (or a series of photos), and the program sends them to the server for comparison. The server finds similar images, sorts them by relevance and selects the most similar ones. It then reads the tags from those photos and looks up the information in the database, after which the data is sent back to the client. Everything is very simple. And very usable.
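The server side of this scheme could be sketched roughly as follows. Everything here is an assumption for illustration: the function names, the shape of the response, the tiny in-memory database, and the choice to let the top matches vote with their similarity as weight.

```python
from collections import defaultdict

# Hypothetical building database keyed by tag; a real one would be far larger.
BUILDING_INFO = {
    "The Leaning Tower of Pisa": {"city": "Pisa", "kind": "bell tower"},
}


def pick_tag(matches, top_k=3):
    """Choose a tag from (tag, similarity) pairs produced by the comparison.

    Sort by relevance, keep the top_k most similar photos, and let them
    vote, weighting each vote by its similarity score.
    """
    ranked = sorted(matches, key=lambda m: m[1], reverse=True)[:top_k]
    votes = defaultdict(float)
    for tag, similarity in ranked:
        votes[tag] += similarity
    return max(votes, key=votes.get) if votes else None


def handle_request(matches):
    """Build the response that would be sent back to the phone."""
    tag = pick_tag(matches)
    if tag is None:
        return {"status": "unknown"}
    return {"status": "ok", "tag": tag, "info": BUILDING_INFO.get(tag, {})}


if __name__ == "__main__":
    matches = [
        ("Some Other Building", 0.95),
        ("The Leaning Tower of Pisa", 0.90),
        ("The Leaning Tower of Pisa", 0.80),
        ("Some Other Building", 0.50),
    ]
    print(handle_request(matches))
```

Voting over several of the most similar photos, rather than trusting the single best one, is one way to keep a lone misleading match from deciding the answer; other selection rules would fit the same scheme.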
Of course, all of this costs money. There is the necessary software for mobile phones, which has to call the camera, get a picture, send it to the server and receive an answer. That is the initial cost. Then there is the server capacity for processing the information. But what a prospect! After all, this is a real, useful function, and not another gimmick that marketing people have conjured out of thin air.
There are many prospects. For example, the promotion of a mobile OS in which this function is conveniently implemented.
But it’s very difficult to understand why this has not yet been implemented. After all, the idea is not complicated. Neither is the implementation, at least for a major player like Google. Maybe someone is deliberately holding it back, to release it when the marketers run out of other tricks? But if you stand aside, couldn't someone else build it first? What do you think?
PS I am not an editor or a proofreader. I am not even a journalist. So if you find any grammatical or stylistic errors in my text, be sure to let me know via PM. Thanks in advance =)