Once again about the electronic library for PocketBook

Published on May 14, 2012

Greetings, Habr!

This article is really just a detailed comment on the text “Electronic library for PocketBook: automatic processing” by dsd_corp. Being a disembodied (read-only) spirit here, I cannot leave ordinary comments (for now). I will not discuss Habr's policy, though, and will get straight to business.

First, I would like to thank the author for his article. I learned a lot of useful things about the PocketBook 360°, which I have used for a long time, and brought the mess on it into relative order. The script, as the author promised, turned out to be portable (I run Linux), apart from a slight inconvenience with encodings.

My comments follow.

  1. The directory where the books themselves are stored is best given a name that starts with a period. The device follows the ancient Unix tradition of treating dot-files as hidden and does not show them. That is, in the script $storagename should be changed to, for example, '.zipstorage'.
  2. I found that the names of the generated files are written in the cp1251 encoding (has Windows really still not switched to Unicode?). I found this very inconvenient, so I tried to edit the script to get proper utf8. At first I did it naively with

    sed -i 's/windows-1251/utf8/g' *php *.inc

    and so on, but that turned out badly, since the script does quite non-trivial things: in particular, it detects the source encoding of the input fb2 files and transcodes them correctly. Besides, this is almost the first time in my life that I have seen PHP code. So I reverted everything and simply ran

    convmv -f cp1251 -t utf8 -r --notest out_dir/dest/

    each time after the script had finished.
  3. The author links are laid out too deeply for my taste ('Letters AZ' / first letter / first three letters / author). At least when there are not very many books, this is not justified. I figured out how to do better: balance the tree by limiting the number of elements at the top levels so that they fit on one page (10 in my settings, for example). The path will then look like 'Ave-Arts / Aksakov', that is, roughly the way the volumes of an encyclopedia are named. Depending on the number of books, additional levels may be needed; their number should be roughly the logarithm of the number of authors. I believe the scientific name for such a thing is a prefix tree. I have not implemented this yet and simply shortened the original layout to 'first letter / author'.
  4. Sometimes (probably on broken fb2 files) the script misbehaves: it eats up all the memory and starts to swap. I did not try to debug it; it would probably be easier for me to rewrite the functionality in a more familiar language than to make sense of PHP. Admittedly, I am not sure I will even get around to that, especially since I have already processed almost my entire library and so far I am satisfied with the result.
  5. While experimenting with how link files work, I did not immediately understand how the device names its file systems. Namely, the external SD card is mounted at /mnt/ext2, and the internal memory at /mnt/ext1. (When I experimented, I wrote directly to the internal memory, and links with '/mnt/ext2' did not work for me.) This naming is rather counterintuitive: ext2 is unmistakably associated with the file system of that name (although the card is formatted as vfat anyway). So when using the script, keep in mind that it prepares files specifically for the first partition of the external card. Otherwise, you need to finish with something like this:

    find out_dir/dest/Библиотека -name '*.flk' -exec sed -i 's/ext2/ext1/' '{}' \;

    I also tried using relative rather than absolute flk links, but they do not seem to work at all. If anyone knows anything about relative links, please share.
  6. As for copying efficiency: I did not run into any particular problems (it is only half a gigabyte), but I have some thoughts on optimization. First, the idea of zipping the files seems dubious: the device has a very slow processor and relatively fast storage, so compression may actually make files slower to open. Tests are needed here, though. Second, to speed up filling the card, you can prepare a partition image and then copy it to the card in one go with the dd utility. I even roughly remember how to create a partition image in a file: you need the losetup utility, which turns a file into a device on which you can run mkfs and mount.
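
The balancing idea from comment 3 could be sketched roughly as follows. This is only a single-level illustration and not part of dsd_corp's script: bucket_authors is a made-up name, the default bucket size of 10 follows the setting mentioned above, and the three-letter labels assume ASCII names (multibyte names need a UTF-8-aware awk).

```shell
#!/bin/sh
# Hypothetical helper: read author names on stdin, sort them, split them into
# buckets of at most SIZE names, and print "Fir-Las/Author" paths, where the
# label is built from the first three letters of the first and last name
# in each bucket.
bucket_authors() {
    size=${1:-10}
    LC_ALL=C sort | awk -v size="$size" '
        # Print every name collected so far under a common bucket label.
        function emit(   i, label) {
            if (n == 0) return
            label = substr(names[1], 1, 3) "-" substr(names[n], 1, 3)
            for (i = 1; i <= n; i++) print label "/" names[i]
            n = 0
        }
        { names[++n] = $0; if (n == size) emit() }
        END { emit() }
    '
}
```

For example, `ls authors_dir | bucket_authors 10` (with a hypothetical authors_dir) would print one relative path per author. For deeper trees the same grouping would have to be applied again to the bucket labels.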


UPD

How to copy an image


As a result of the cataloguer's work, a cloud of small files appears, which then take a long time to copy onto the flash drive's file system. You can try to speed this up by building a copy of the flash drive's partition image on disk (or even on tmpfs, if it fits) and then copying the whole image at once. The benefit is not obvious, because you end up copying more bytes than when copying file by file. It may also cause unnecessary wear on the flash drive. Nevertheless, here is a recipe that works on any (not too ancient) Linux. (Commands prefixed with a hash sign require root privileges; the rest can be run as a regular user.)

  1. First, find out the exact size of the target partition on the flash drive. Let's say the device shows up as /dev/sdc:

    # fdisk -l /dev/sdc
    ...
    Disk /dev/sdc: 4 GB, 4127195136 bytes
    64 heads, 32 sectors/track, 3936 cylinders
    Units = cylinders of 2048 * 512 = 1048576 bytes

       Device Boot      Start         End      Blocks   Id  System
    /dev/sdc1               1        3934     4028400   83  Linux

    Here the partition size is 4028400 blocks. Note that fdisk reports the Blocks column in units of 1024 bytes, not 512: 4028400 * 1024 = 4125081600 bytes, which is consistent with the 4127195136-byte disk.

  2. Next, create a file on disk of exactly the same size. Since the dd utility has the bs (block size) option, we do not even need to multiply 4028400 by 1024 ourselves. The block size itself does not matter to dd here; what matters is that bs * count equals the exact size in bytes.

    $ dd if=/dev/zero of=image bs=1024 count=4028400
  3. Create a file system on the resulting file:
    $ /sbin/mkdosfs image
  4. Mount the image file as a loop device (this needs the loop kernel module, which all modern distributions ship by default), copy the files, and unmount:
    $ mkdir mnt
    # mount -o loop image mnt
    # cp -r ...files... mnt
    # umount mnt
    $ rmdir mnt

  5. Now copy the image file onto the partition (be careful not to get the device name wrong):
    # dd if=image of=/dev/sdc1

Done! You can mount the USB flash drive and make sure that its contents are as expected.
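
As a variation on steps 1 and 2, the size can also be taken in bytes, which avoids any block-unit arithmetic. The sketch below is an assumption-laden illustration, not part of the original recipe: make_image is a made-up helper name, and it relies on GNU coreutils (truncate, stat -c). It creates a sparse file instead of writing zeros with dd.

```shell
#!/bin/sh
# Hypothetical helper: create an image file of exactly BYTES bytes
# and verify the resulting size.
make_image() {
    image=$1
    bytes=$2
    # truncate produces a sparse file, so no zeros are physically written.
    truncate -s "$bytes" "$image" || return 1
    actual=$(stat -c %s "$image") || return 1
    if [ "$actual" -ne "$bytes" ]; then
        echo "size mismatch: $actual != $bytes" >&2
        return 1
    fi
}
```

The byte count of the partition can be obtained as root with `blockdev --getsize64 /dev/sdc1`, so a call might look like `make_image image "$(blockdev --getsize64 /dev/sdc1)"`, followed by steps 3 to 5 as above.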

If the device has bad blocks, everything gets more complicated. In that case, instead of creating the image from scratch (steps 1-3), you can copy the partition image into a file:

    # dd if=/dev/sdc1 of=image conv=noerror,sync

(The sync option pads unreadable blocks with zeros, so the data after them keeps its offsets.)

In principle, the file system stores meta-information about its bad blocks, so we can copy data into the image file without fear that it will land on a bad block. However, I have not verified this and could be wrong.

The real master class would be to keep the current image of the flash drive on your home computer and have a utility that economically copies only the difference between the new image and the old one to the flash drive.
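
Such a utility could look roughly like the sketch below. It is only an illustration under stated assumptions: sync_image is a made-up name, the 1 MiB default chunk size is arbitrary, GNU cmp (with the -i and -n options) and GNU stat are assumed, and the old image is trusted to match what is currently on the target. I have not tried it on a real flash drive.

```shell
#!/bin/sh
# Hypothetical helper: compare NEW and OLD images chunk by chunk and write
# only the chunks that differ to TARGET at the same offsets.
# Prints the number of chunks written.
sync_image() {
    new=$1
    old=$2
    target=$3
    chunk=${4:-1048576}
    size=$(stat -c %s "$new") || return 1
    total=$(( (size + chunk - 1) / chunk ))
    i=0
    copied=0
    while [ "$i" -lt "$total" ]; do
        offset=$(( i * chunk ))
        # cmp -i skips OFFSET bytes in both files, -n limits the comparison
        # to one chunk; a non-zero status means the chunk differs.
        if ! cmp -s -i "$offset" -n "$chunk" "$new" "$old"; then
            # conv=notrunc keeps the rest of TARGET intact.
            dd if="$new" of="$target" bs="$chunk" skip="$i" seek="$i" \
               count=1 conv=notrunc 2>/dev/null || return 1
            copied=$(( copied + 1 ))
        fi
        i=$(( i + 1 ))
    done
    echo "$copied"
}
```

A call such as `sync_image image.new image.old /dev/sdc1` (as root, with hypothetical file names) would then be followed by replacing image.old with image.new, so that the stored copy keeps matching the card.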