
How I digitized film, and more
A short foreword
At first I had no intention of writing an article; everything seemed ordinary and uninteresting. But while tidying up photo albums over the New Year holidays, I was surprised to find that I was not the only one who had decided to spend the time off on this useful task. A related article, “Experience in creating a catalog and indexing a family photo archive. Indexing and digitization of films”, appeared on Habr. A little later another one appeared, “Metadata for organizing the storage of a photo archive”. So I decided to share my own experience; perhaps some part of it will come in handy for someone.

The idea of scanning and organizing old photographs had, of course, been brewing for a long time, but it is not easy to commit to that much work: more than a hundred rolls of film and thousands of prints. Ever since childhood I had wanted digitized copies of the old photographs of my great-great-grandmothers and great-grandfathers, and finally, after 20 years, I decided to get on with it.
Scanner
The first question, naturally, was the scanner. About 7 years ago I had already tried to digitize negatives and decided to get a film scanner. There wasn't much money, so I chose the cheapest option, which turned out to be the Microtek Filmscan 35.

Compared with the monsters of scanning it cost pennies, but the result was terrifying. I used SilverFast, the most advanced software for it at the time (and perhaps still). I don't know why, but on different passes this wonder would produce a blue photo one time and a green one the next, and then everything would hang. It was unpredictable and very sad: I had to pore over each frame for 10-15 minutes, straightening histograms and performing other dances with a tambourine. That experience put me off scanning film for several years, and the scanner is still lying around somewhere.
This time, after weighing the pros and cons, I decided as follows.
There were a few points to consider:
- for the most part it will be my parents doing the scanning, not me, since they have the time now
- both film and prints need to be scanned
- there is a lot to scan
- the budget is far from fabulous
On top of all that, I understood that film is no longer a current medium, so most likely everything will need to be scanned only once, even if that takes a long time.
So dedicated film scanners were ruled out, for two reasons.
First, previous experience had shown that a decent unit cannot be had cheaply, and as for the cheap ones, I could not bear that a second time.
Second, buying one scanner for prints and another for film is also expensive and impractical.
Besides, I told myself, if something really good turns up, I will take it to a professional lab; for a dozen frames I can afford to splurge.
Looking at what was on sale that could scan film as well as paper, I found the choice was small: either sky-high prices again, or just a couple of options. After combing through all the shops right after the holidays, I ended up with the following acceptable options:
- Epson Perfection V330 Photo (A4, 4800x9600 dpi, CCD, film adapter, USB 2.0)
- Epson Perfection V370 Photo (A4, 4800x9600 dpi, CCD, USB 2.0)
- Canon CanoScan LiDE 700F (A4, 9600x9600 dpi, 48-bit, CIS, slide adapter, USB 2.0)
- Canon CanoScan 5600F (A4, 4800x9600 dpi, 48-bit, slide adapter, USB 2.0)
Everything else was either too expensive, from 10,000 and up, or, conversely, could not do what I needed. Unfortunately, the CanoScan 5600F dropped out because it was not on sale anywhere at the time, although on paper it looked very good. The others were about the same according to reviews, but the deciding factor was that Epson had Linux drivers, and since I want to work under more than just Windows, the Epson Perfection V330 Photo won in the end. I could not find out anywhere how the V330 differs from the V370, but since Linux drivers were mentioned only for the V330, I settled on it, "just in case".

The drivers can be downloaded from the AVASYS website.
Unfortunately, I have not had time to try the scanner under Linux yet, but I liked the defect removal function in the Windows software: it works brilliantly on old black-and-white photos. It needs to be used with care, though, because sometimes it mistakes a meaningful detail for a defect.
Some reviews of the scanner mention a problem with stripes appearing when scanning film, but I have not seen it yet. Still, here is something I think is useful on the subject, found in one of the reviews on Yandex.Market: “After two years I can report on the outcome of the investigation: there is a calibration window in the frame of the scanner where the white balance is set. If dust particles get in there, you get “broken pixels” which, as the carriage travels, produce stripes. This is most likely a design flaw of the new LED backlight (but who will admit it...). So, gentlemen, if you have such a scanner, remove the dust.”
What resolution to scan at was far from the least important question. The scanner offers a maximum of 4800x9600, but when I tried to set that for a 9x13 cm photo, the software started complaining about the scale, and I had to reduce it.
The criterion for choosing a resolution is simple: prints are normally made at 300 dpi, so to reproduce a photo at its original size you need to scan it at no less than 300 dpi. Since the photos are old, there is no point in pushing this figure too high: the physical resolution of the print will not let you get quality out of nothing. Besides, it is unlikely that anyone will ever want to print a poster of their great-grandfather at A1, or even A4. If someone writes a book, a picture is unlikely to take up more than a page. So I decided that twice the minimum would do for the very old photos and three times for the better and later ones, i.e. 600 dpi and 900 dpi respectively. Then I picked the nearest value offered by the software that came with the scanner.
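The arithmetic behind this rule can be sketched in a few lines of Python. This is only an illustration: the function names are made up for the example, and the 300 dpi print resolution is the figure assumed in the text above.

```python
# A minimal sketch of the resolution rule; all names here are illustrative.
PRINT_DPI = 300       # print resolution assumed in the text
CM_PER_INCH = 2.54


def scan_dpi(oversampling, print_dpi=PRINT_DPI):
    """Scan resolution as a whole multiple of the print resolution."""
    return oversampling * print_dpi


def pixel_size(width_cm, height_cm, dpi):
    """Pixel dimensions of a scan of a print with the given physical size."""
    return (round(width_cm / CM_PER_INCH * dpi),
            round(height_cm / CM_PER_INCH * dpi))


def max_print_cm(pixels, print_dpi=PRINT_DPI):
    """Longest side (in cm) that `pixels` can fill when printed at print_dpi."""
    return pixels / print_dpi * CM_PER_INCH


# A 9x13 cm print scanned at double and triple the print resolution.
for factor in (2, 3):
    dpi = scan_dpi(factor)
    print(dpi, "dpi ->", pixel_size(9, 13, dpi), "pixels")
```

At 600 dpi the long side of a 9x13 cm print comes to about 3071 pixels, enough for a 26 cm print at 300 dpi. At the scanner's 4800 dpi maximum the same print would be roughly 17,000 x 24,600 pixels, which may well be what the software objected to.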
For the negatives I decided to use the maximum; I bought a scanner with this resolution for a reason... 4800x4800 dpi is most likely overkill, but the images can always be downscaled later, and the main thing is that I will not have to rescan with other settings and can sleep peacefully.
The scans are, of course, never saved as JPEG, to avoid compression losses. Everything is TIFF only. It does take more disk space, but you scan once and then have no problems: you can do whatever you want with the result. I did not come to this right away either, but practice shows that if I economize now I will regret it and have to come back to the question, whereas if everything is at the maximum there will be nothing to regret.
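To get a feel for how much space "everything at the maximum" takes, here is a rough estimate in Python. It rests on assumptions that are not in the article: a standard 36x24 mm frame, three colour channels, and an uncompressed TIFF with headers ignored. Real files will differ, especially if the software applies lossless compression.

```python
# Rough size estimate for an uncompressed scan; names are illustrative.
MM_PER_INCH = 25.4


def frame_pixels(width_mm, height_mm, dpi):
    """Pixel dimensions of a film frame scanned at the given resolution."""
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))


def uncompressed_bytes(width_px, height_px, channels=3, bits_per_channel=8):
    """Raw pixel data size: one sample per channel per pixel, no compression."""
    return width_px * height_px * channels * bits_per_channel // 8


def mebibytes(n_bytes):
    """Convert a byte count to MiB for easier reading."""
    return n_bytes / 2**20


# Assumed 36x24 mm frame at 4800 dpi, in 24-bit and 48-bit colour.
w, h = frame_pixels(36, 24, 4800)
for bits in (8, 16):
    size = uncompressed_bytes(w, h, bits_per_channel=bits)
    print(f"{w}x{h}, {bits * 3}-bit: {mebibytes(size):.0f} MiB")
```

Under these assumptions one frame is on the order of 90 MiB at 24-bit colour and twice that at 48-bit, so a hundred rolls of film call for planning disk space and backups in advance.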
Cataloging
Naturally, after digitization everything has to be sorted out somehow. The main task was to caption the great-great-relatives, because I wanted to preserve the family history for the future, and without proper comments no one will ever figure out who is who.
Processing the photos right away and uploading them to a website did not suit me, for two reasons: first, everything would have to be processed at once, which takes time, and my parents do not understand any of it; second, technologies change, and who knows what a website will look like in a couple of decades, if it still exists at all.
A smart cataloging program was ruled out for the same weighty reason: there is no guarantee that the software will still be alive in a few decades, and then no one will understand what is stored where and how in its smart, unique format.
One idea was to store the description in a plain text file with the same name as the photo. Text is text wherever you go: anyone will surely be able to read it decades from now, even if some kind of super-Unicode is invented, and it is much more reliable than special software. But as a programmer I looked at this option with horror: it is ugly, and that's that. It is inconvenient to work with, too.
My parents said they would like it done as in Word: here is the photo, here is the caption, and everything is clear. That suggestion made my hair stand on end, because, again, Word is here today and gone tomorrow.
Another option is to store the captions in EXIF. What worried me was that many image-processing programs simply ignore EXIF, so the precious captions could be lost irretrievably.
In the end, after weighing it all up, I made a decision: we scan the photo, write the caption into EXIF, and then make all the captioned pictures read-only, so that there is no temptation to change anything and the information stays safe. If you want to change something, make a copy and go ahead. And backups, of course. After all, we are programmers: we can knock together a small script that exports all the EXIF to a text file, "just in case" :)
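A minimal Python sketch of those two steps, freezing the captioned scans and dumping their metadata to text, might look like this. It is not the script from the article: the function names and the `.txt` sidecar naming are invented for the example, and the export assumes the exiftool command-line tool is installed and on PATH (its `-j` option prints JSON).

```python
import json
import os
import stat
import subprocess
from pathlib import Path

WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(path):
    """Clear all write bits on `path`, leaving the other permission bits alone."""
    os.chmod(path, os.stat(path).st_mode & ~WRITE_BITS)


def freeze_archive(root, suffixes=(".tif", ".tiff")):
    """Make every scan under `root` read-only and return the files touched."""
    frozen = []
    for p in sorted(Path(root).rglob("*")):
        if p.is_file() and p.suffix.lower() in suffixes:
            make_read_only(p)
            frozen.append(p)
    return frozen


def run_exiftool_json(image):
    """Return exiftool's JSON output for one file (assumes exiftool on PATH)."""
    return subprocess.run(["exiftool", "-j", str(image)],
                          capture_output=True, text=True, check=True).stdout


def export_metadata(image, reader=run_exiftool_json):
    """Write every tag of `image` as 'Name: value' lines to '<image>.txt'."""
    tags = json.loads(reader(image))[0]   # -j yields a list, one dict per file
    out = Path(str(image) + ".txt")       # hypothetical sidecar naming
    lines = [f"{name}: {value}" for name, value in tags.items()]
    out.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out
```

Typical use would be to call `export_metadata()` for each captioned file and then `freeze_archive()` on the archive directory. The `reader` parameter exists only so the export can be tried without exiftool installed.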
There are plenty of command-line tools for working with EXIF on Linux, but they are not practical for comfortably captioning a large number of pictures. For the record, they include exif, exiftool, and exiv2; a search will turn up more. I later used exiftool for batch processing, but more on that below.
Now for the GUI options. Having studied what the open-source community has to offer, I settled on DigiKam: “digiKam is an advanced digital photo management application for Linux, Windows, and Mac-OSX”, as their website puts it.
For editing I decided to use GIMP, the GNU Image Manipulation Program, an open-source counterpart to Photoshop. So I did not need photo editing in the cataloging software itself, but several things about its cataloging won me over.

First, DigiKam edits EXIF, which is exactly what I need.
Second, all the photos are on screen at once; you type the caption in the box alongside and move straight on to the next one: quick, simple, and convenient.
Third, I noticed that EXIF has several similar tags for comments: Comment, UserComment, ImageComment. DigiKam writes to all of them at once, so the chance that other software will read the information is quite high.
In addition, reading the reviews, I was pleased to learn that besides plain EXIF the software can maintain a catalog without copying anything anywhere, unlike many others, simply processing everything in place. This was a huge plus: I had not been looking for this feature, but it turned out to be very welcome. What I also liked is that, besides writing information to EXIF, it records it in its own database, which makes it convenient to sort and search photos by tags, labels, descriptions, and so on. And even if the software disappears at some point, and the database with it, a copy of the data will remain in EXIF, which is exactly what I need.
Some interesting cataloging ideas are described in the already mentioned article “Experience in creating a catalog and indexing a family photo archive. Indexing and digitization of films”. All or almost all of that data can also be stored in EXIF and, if necessary, exported to whatever format is convenient.
Another plus of DigiKam is that any photo can be chosen as an album cover, and I liked the idea of using a photo of the paper album itself as the cover; thanks to the author for that.
Another non-obvious point I ran into with DigiKam: if it has no write permission on the photo file, the software silently writes only to its own database, giving no hint that anything is wrong. I spent a long time trying to work out why the caption was in the program but not in the file, especially since the “save to file” option was enabled in the settings. So keep this in mind and check the access rights, otherwise you can be cursing for a long time.
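One way to catch this before a captioning session is to list the image files the current user cannot write to. The following Python sketch is only an illustration: the function names and the list of extensions are assumptions, and `os.access()` reports what the operating system would allow, which on some setups (for example when running as root) is more than the permission bits suggest.

```python
import os
from pathlib import Path

IMAGE_SUFFIXES = (".tif", ".tiff", ".jpg", ".jpeg")  # assumed archive formats


def is_writable(path):
    """True if the current user is allowed to write to `path`."""
    return os.access(path, os.W_OK)


def unwritable_files(root, suffixes=IMAGE_SUFFIXES, can_write=is_writable):
    """Return image files under `root` that captions could not be saved into."""
    return [p for p in sorted(Path(root).rglob("*"))
            if p.is_file()
            and p.suffix.lower() in suffixes
            and not can_write(p)]
```

An empty result from `unwritable_files("/path/to/archive")` means captions should reach the files themselves and not only the program's database.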
Publishing to the site
So the main tasks, scanning and cataloging, are solved. Now it is time to show off to the relatives and let them see familiar photos. Naturally, by uploading the photos to the site. Not long ago I wrote a small program for exactly this: I put the required photos into a directory, ran it, and that was it, the album was done. I wrote about it on Habr last time, in “Simple automation: a photo album”. Now, using DigiKam, I decided that a photo could be marked directly in its EXIF tags as to whether it belongs in the album or not, since there were all sorts of images that should not be uploaded to the site. The comments can now be taken from EXIF as well.
Everything seemed fine, but not quite.
Everything on the site is handled in PHP, and it has what seemed to me a wonderful function for working with EXIF, read_exif_data(), but, as practice showed, this half-function returns only part of the data and stays completely silent about the rest. I dug through everything I could, and the dream of an easy life sank into oblivion: I had to extract EXIF from the files at the album-generation stage, since command-line tools exist for that. So I rewrote the script and, recalling the caustic comment on my previous article, “PHP file generator in Perl... Monsieur knows a lot...”, chuckled to myself that I had been right not to rely entirely on PHP: otherwise I would be stuck now, whereas this way it took a couple of minutes and the problem was solved.
So, when processing photos in DigiKam, we mark a photo with a flag (there it is called PickLabel). The flag is written to the file in EXIF. When we process the files in the directory, we pull the flag out with exiftool:
$flagPickLabel = `exiftool -b -PickLabel '$fname_in'`;
After that, everything depends on the flag: if it is set, we process the photo; if not, we skip it. Everything is configured on the command line, for convenience. In fact, a great deal more can be processed here; it is a matter of taste and need.
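The same selection step can be sketched in Python. This is not the code from the script in the article, which is written in Perl. The function names are invented, exiftool is assumed to be on PATH, and treating an empty value or "0" as "flag not set" is an assumption about how DigiKam stores the label.

```python
import subprocess


def read_pick_label(path):
    """Return the PickLabel value exiftool reports for `path` ('' if absent)."""
    result = subprocess.run(["exiftool", "-b", "-PickLabel", str(path)],
                            capture_output=True, text=True)
    return result.stdout.strip()


def select_for_album(paths, read_label=read_pick_label, unset=("", "0")):
    """Keep only the files whose PickLabel is set.

    `unset` lists the values treated as "no flag"; this is an assumption,
    so adjust it to whatever values the archive actually contains.
    """
    selected = []
    for path in paths:
        label = read_label(path) or ""
        if label not in unset:
            selected.append(path)
    return selected
```

Passing the arguments to `subprocess.run()` as a list avoids going through a shell, so file names containing quotes or spaces need no escaping.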

Link to the sources, in case someone wants to look closely or even use them: photo_album-r143.tar.gz. How to use them is described in the previous article, so I will not repeat it.
Thank you for reading; if this came in handy for someone, I am immensely happy.
Criticism is welcome.
UPD: I happened to find an article on Habr about scanning negatives; I am surprised I had not noticed it before. Let it be listed here too, for completeness.