Yahoo! Placemaker - geoprocessing in action


    Quite recently, on May 20, Yahoo! announced the release of a new product that currently has no analogues: Yahoo! Placemaker, a geoparsing service whose main purpose is to extract geo-relevant information from documents of various formats. The presentation took place, as they say, “before a large crowd” and ended with “applause turning into a standing ovation”. The audience clearly liked both the idea and the implementation, and for some time afterwards Twitter carried a steady stream of positive comments on the topic. But the presentation is over, everyone has gone home, and the questions begin: so what is it?

    What is Placemaker


    As already mentioned, Placemaker is a web service that extracts geo-relevant information from a document. Several points here need clarification. First, what kinds of documents can information be extracted from? And second, what is this geo-relevant information, and, most importantly, who needs it and why?

    With documents, everything is quite simple. At the moment, Placemaker supports plain text (text is text, wherever you go), HTML documents (although things are not so simple there), and XML-based news syndication formats: RSS and Atom. In addition, Placemaker "understands" GeoRSS, the geographic extension to RSS and Atom, and can also extract additional information from microformats embedded in an HTML document. The information itself deserves a more detailed discussion.

    What Placemaker can do


    In brief, everything Placemaker does falls into three groups, each answering one question:
    • Which (geographical) places are mentioned in the document, and how important is each?
    • Among all the places that share a name, which one does the document actually mean?
    • Which place is the document as a whole about?



    Let us illustrate these ideas with an example. Take a typical news article. Without even reading it closely, one can tell, first, that the article is about Pakistan and “something going on there”, and second, after quickly scanning the text, one can pick out specific geographical names: Pakistan, Islamabad, the USA (where would we be without it!), and several others. This is precisely the essence of Placemaker: to say what an article is “about” in a geographical sense, and to list the geographical names in it, ranked by importance if necessary.
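
    To make the idea concrete, here is a toy sketch in Python. It is not how Placemaker works internally (nothing here describes that); it simply counts mentions of names taken from a tiny hand-made list. The list KNOWN_PLACES, the function rank_places and the sample text are all invented for this illustration.

```python
import re
from collections import Counter

# Hypothetical mini-gazetteer, invented for this illustration only.
KNOWN_PLACES = ["Pakistan", "Islamabad", "USA"]

def rank_places(text, places=KNOWN_PLACES):
    """Return (place, mention_count) pairs, most mentioned first."""
    counts = Counter()
    for place in places:
        # Whole-word, case-sensitive match so "USA" does not match "USAGE".
        hits = re.findall(r"\b" + re.escape(place) + r"\b", text)
        if hits:
            counts[place] = len(hits)
    return counts.most_common()

sample = (
    "Officials in Islamabad said Pakistan would continue talks with the USA. "
    "Pakistan has asked for more aid, and Pakistan's parliament will vote soon."
)
ranking = rank_places(sample)
print(ranking)
```

    The most frequently mentioned name comes first, which is a crude stand-in for "what the document is about". A real geoparser has to do far more than count strings, starting with the same-name problem.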

    It should also be noted that the problem of places sharing a name is probably the hardest one the developers solved when building Placemaker. For example, did you know that there are 11 places called Islamabad? Or 23 Londons? Or 47 Yorks? There are also 29 places named "Moscow", 8 named "Samara", and 234 San Joses. Of course, some of them are better known than others, but the task is to choose not the most popular one but the right one!
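
    The "right one, not the popular one" principle can be shown with a small sketch. This is an assumption-laden toy, not Placemaker's algorithm: the candidate records, their context words, the popularity ranks and the identifiers are all made up (the identifiers are not real WOEIDs).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    place_id: int        # made-up identifier, NOT a real WOEID
    name: str
    context: frozenset   # words that hint at this particular place
    popularity: int      # illustrative rank, higher = better known

# Hypothetical candidates for one ambiguous name.
CANDIDATES = [
    Candidate(1, "Moscow", frozenset({"russia", "kremlin"}), 2),
    Candidate(2, "Moscow", frozenset({"idaho", "usa"}), 1),
]

def disambiguate(name, text, candidates=CANDIDATES):
    """Pick the candidate whose context words overlap the text the most;
    fall back to popularity only when the text gives no hint."""
    words = {w.strip(".,!?\"'()").lower() for w in text.split()}
    same_name = [c for c in candidates if c.name == name]
    if not same_name:
        return None
    return max(same_name, key=lambda c: (len(c.context & words), c.popularity))

print(disambiguate("Moscow", "The University of Idaho is located in Moscow."))
print(disambiguate("Moscow", "Flights to Moscow were delayed."))
```

    In the first call the word "Idaho" outweighs popularity, so the less famous Moscow wins. Only in the second call, where the text offers no hint, does the better-known candidate get chosen.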

    How to use Placemaker


    How exactly to use it is up to the user of the service, and there are plenty of options: from geographical categorization of news articles and searching a large set of documents for mentions of one particular Springfield, to rather esoteric ones such as plotting on a map the intensity of Twitter messages about a specific place. And that is just plain text. If we look at RSS / Atom as well, one task Placemaker solves in almost a single step is turning RSS into GeoRSS, that is, once again, adding geographic information to a feed. For example, one of the groups at Yahoo! built an excellent demo application that collected RSS feeds from many sources around the world, "passed" them through Placemaker, and then showed the results on a map.
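
    What "turning RSS into GeoRSS" means in practice can be sketched with the Python standard library. This is only an illustration of the output format: the PLACE_COORDS lookup is a hypothetical stand-in for the answer a geoparser would give, and its coordinates are approximate and hard-coded.

```python
import xml.etree.ElementTree as ET

GEORSS_NS = "http://www.georss.org/georss"  # GeoRSS-Simple namespace
ET.register_namespace("georss", GEORSS_NS)

# Hypothetical lookup standing in for the geoparser's answer.
# Coordinates are approximate and hard-coded for illustration.
PLACE_COORDS = {"Islamabad": (33.7, 73.1)}

def add_georss_points(rss_xml, coords=PLACE_COORDS):
    """Append a <georss:point> to every <item> whose title names a known place."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        for place, (lat, lon) in coords.items():
            if place in title:
                point = ET.SubElement(item, f"{{{GEORSS_NS}}}point")
                # GeoRSS-Simple order: latitude first, then longitude.
                point.text = f"{lat} {lon}"
                break
    return ET.tostring(root, encoding="unicode")

rss = """<rss version="2.0"><channel><title>Demo feed</title>
<item><title>Talks resume in Islamabad</title></item>
<item><title>Weather update</title></item>
</channel></rss>"""

geo_rss = add_georss_points(rss)
print(geo_rss)
```

    Only the first item gets a point, because only its title contains a name from the lookup. A map application can then read the georss:point elements directly instead of guessing locations from the text.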

    Reference information


    Some facts about Placemaker.
    • Placemaker uses WOEIDs (Where On Earth IDs) to identify locations. Such an identifier makes it possible to say unambiguously which place is meant (unlike a name or coordinates). The full set of WOEIDs is currently available for developers to download (and will be updated).
    • Placemaker runs on the same platform as Yahoo! GeoPlanet (which, by the way, you can play around with) and Fire Eagle.
    • Placemaker is a web service that accepts REST-style requests over HTTP POST and returns results either as XML conforming to a predefined schema or in GeoRSS format.
    • To use Placemaker, a developer only needs to get an Application ID on the Yahoo! Developer Network.
    • Placemaker is free, as is GeoPlanet Data - a database of WOEIDs and related information
    • Placemaker supports 27 languages, which, to my great regret, do not yet include Russian (however, I have not given up hope that this will change).
    • If you want to play with Placemaker, you can use the minimalistic demo service that Rasmus Lerdorf (the author of PHP) wrote, I believe, overnight: no more than half a day passed between the announcement of the public API and the appearance of the service. In addition, links to new products that use Placemaker will be published on the forum (for obvious reasons, there are very few of them so far :))
    • The best place to start learning about Placemaker is the official page on YDN, where you can read the user manual (highly recommended reading) and ask questions on a forum monitored by the developers.
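
    For a feel of what a call to such a service looks like, here is a minimal sketch that builds (but does not send) a form-encoded POST request. Only the general shape comes from the facts above: a POST request, an Application ID, a document, and a choice of XML or GeoRSS output. The endpoint is a deliberate placeholder, and the field names are assumptions made for this sketch; take the real ones from the user manual.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request

# Placeholder endpoint: substitute the real URL from the user manual.
ENDPOINT = "https://placemaker.example.invalid/v1/document"

def build_request(text, app_id, output="xml"):
    """Build (but do not send) a form-encoded POST request.

    The field names below are assumptions made for this sketch.
    """
    if output not in ("xml", "rss"):
        raise ValueError("output must be 'xml' or 'rss'")
    fields = {
        "appid": app_id,               # Application ID obtained on YDN
        "documentContent": text,       # the document to geoparse
        "documentType": "text/plain",  # other inputs: HTML, RSS, Atom
        "outputType": output,          # XML or GeoRSS, per the list above
    }
    body = urlencode(fields).encode("utf-8")
    return Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )

req = build_request("Talks resume in Islamabad", app_id="YOUR_APP_ID")
print(req.get_method(), req.full_url)
print(parse_qs(req.data.decode("utf-8")))
```

    Once the real endpoint and field names are filled in, the request object can be passed to urllib.request.urlopen, and the response parsed as XML or GeoRSS depending on the output type requested.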

    And finally


    Placemaker is in beta. And not an “eternal beta”, as is often the case at other companies, but a beta that lasts until the developers catch and fix enough bugs (no, not all of them) for the “beta” label to be removed. So if you think you have found a bug, do not keep it to yourself! :) Write to the developers on the forum; they will thank you and fix it as soon as they can. The same goes for functionality that is desperately needed but for some strange reason did not make it into this release: write, and you shall have it!

    Update: I will also add a few links to demos and mashups that use Placemaker:
    The list will be expanded!
