Sound Indexing Appears on Google Labs Page
A new project has appeared on Google Labs, Google's page of promising projects: GAudi (Google Audio Indexing). It is a technology for recognizing and indexing English-language speech extracted from multimedia files, including video.
Google effectively began rolling out this technology two months ago on a small number of YouTube videos (see the news item “Full-text video search has appeared on YouTube”). At the time it was something of a “black box”: we could see how the new feature worked, but we did not know what really stood behind it. Now a separate interface for searching videos has been published (in the future, any video content from the Internet may be added to this index), along with a FAQ.
From the FAQ we learned that the speech recognition engine was created from scratch by a dedicated working group of Google employees. Although many companies have been doing research in this area for decades, GAudi is an entirely independent Google development.
At the moment only English is supported, and the system, of course, makes a lot of mistakes. For example, in this video the word “Czechoslovakia” is recognized as “tech also but there” and the word “free” as “forty”, and there are quite a few such errors.
The project page says that the speech recognition engine will gradually be “fed” not only election clips but also other thematic YouTube channels, and that in the future it will probably index video content from other sites as well.