alizar August 30, 2014 at 11:42

Internet Archive uploads over 14 million free historical images to Flickr

An Internet Archive employee has developed a program that automatically extracts illustrations from millions of books during the OCR scanning that the Internet Archive already performs. Kalev Leetaru reused the existing text recognition module, which first detects the boundaries of illustrations so that they can be discarded before OCR. But why let that material go to waste?

All extracted illustrations were aligned, cropped, cleaned up, and uploaded to the Flickr photo hosting service together with the accompanying text from the book. As a result, the Internet Archive Book Images collection, whose illustrations are in the public domain, supports full-text search.
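
The post does not show the Internet Archive's actual code, so the following is only a minimal sketch of the idea described above, under stated assumptions: a layout step has already labeled page regions as text or image, each illustration is cropped by its bounding box, and it is stored with the page text so it can be found by keyword. The names `Region`, `crop`, `extract_illustrations`, and `search` are hypothetical, and the page is modeled as a plain grid of pixel values rather than a real scan.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A labeled area of a page (hypothetical layout-analysis output)."""
    kind: str        # "text" or "image"
    top: int
    left: int
    bottom: int      # exclusive
    right: int       # exclusive
    text: str = ""   # recognized text, empty for image regions


def crop(page, region):
    """Cut the region's bounding box out of a 2D grid of pixels."""
    return [row[region.left:region.right]
            for row in page[region.top:region.bottom]]


def extract_illustrations(page, regions):
    """Keep image regions instead of discarding them.

    Returns a list of (cropped_pixels, accompanying_text) pairs, where the
    accompanying text is all text recognized on the same page.
    """
    page_text = " ".join(r.text for r in regions if r.kind == "text")
    return [(crop(page, r), page_text) for r in regions if r.kind == "image"]


def search(illustrations, term):
    """Case-insensitive substring search over the accompanying text."""
    term = term.lower()
    return [pixels for pixels, text in illustrations if term in text.lower()]


# A toy 4x6 "page" and two regions: one line of text, one illustration.
page = [[r * 10 + c for c in range(6)] for r in range(4)]
regions = [
    Region("text", 0, 0, 1, 6, "An early telephone"),
    Region("image", 1, 2, 3, 5),
]
illustrations = extract_illustrations(page, regions)
hits = search(illustrations, "telephone")
```

A real pipeline would work on scanned images and would also deskew and clean each crop; the sketch only shows why keeping the detected boundaries is enough to build a searchable collection of illustrations.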

A total of 14 million images will be uploaded to Flickr; 2.6 million have been uploaded so far.

Browsing this gallery of pictures from old books is fascinating. It contains landscapes, culinary illustrations, sheet music, pictures from medical guides, and old maps. The catalog invites you on a kind of "time travel": enter a term such as "phone" or "plane" and you will see how that thing used to look.

Many pictures show strange, obscure objects from the past. Without a description, you would not be able to tell what they are.

Wikipedia editors will surely find suitable illustrations here for many historical articles.

Each illustration is labeled with the title of the book, its year of publication, and the page on which the illustration appears. There is also a link to read the book online (all of the books are published on the Internet Archive website). As part of this project, 600 million pages have already been digitized.

Anyone may use these images for any purpose, commercial or non-commercial, including republishing and editing them.
