Neural networks have learned to judge a book by its cover.

    The old saying “do not judge a book by its cover” warns against judging anything or anyone by appearance alone. But when a reader picks up a book, it happens anyway: acquaintance usually begins with the cover. The cover gives the first impression of the content and starts to sketch a story in the reader's mind. Good covers are designed to be judged.

    People are very good at identifying a genre just by looking at a book's visual design. Telling a cookbook from a biography or a guide by the cover alone is fairly easy. This raises an interesting question: can artificial intelligence judge a book by its cover as successfully as a person?

    Researchers from Kyushu University in Japan tried to find out. They gave a convolutional neural network (CNN) the task of studying book covers and determining the category each belongs to. The training method was quite simple: the researchers downloaded more than 13.5 thousand covers from Amazon.com along with the title, author's name and genre of each book. Beyond genre classification, this data set may later be useful for teaching neural networks to recognize and analyze fonts and to solve other design-related problems. In their experiment, the researchers used only the genres and discarded all other data from the set. The network distinguished 20 possible genres. If a book was listed in several categories at once, the researchers simply used the first one.
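
    The labelling rule described above (keep only the genre, and use the first category when a book is listed under several) could be sketched roughly as follows. The record layout and the field names `cover`, `title`, `author` and `genres` are assumptions made for illustration, not the format of the researchers' actual data set.

```python
def to_training_pair(record):
    """Reduce a book record to (cover, genre), keeping only the first genre.

    `record` is a hypothetical dict; its field names are assumed for this sketch.
    """
    genres = record["genres"]
    if not genres:
        raise ValueError("book has no genre listed")
    # Title and author are discarded; only the first listed genre is kept.
    return record["cover"], genres[0]


# Made-up example records, not taken from the real data set.
books = [
    {"cover": "cover_001.jpg", "title": "A", "author": "X",
     "genres": ["Cookbooks", "Health"]},
    {"cover": "cover_002.jpg", "title": "B", "author": "Y",
     "genres": ["Travel"]},
]
pairs = [to_training_pair(book) for book in books]
print(pairs)
```

    Keeping a single label per cover makes this an ordinary one-of-20 classification problem, at the cost of ignoring any secondary genres.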



    The research team then used 80% of the data set to train the neural network to recognize the genre from the cover image. The network used in the experiment consisted of four layers, each with 512 neurons, which together learned the correlation between cover design and genre. Another 10% of the data set was used to validate the network. At the final stage, the remaining 10% was used to test how well the network classifies images it has not seen before.
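
    An 80/10/10 split of this kind can be sketched in a few lines of Python. This is only an illustration: the function name `split_dataset`, the fixed random seed and the placeholder file names are assumptions, and the article does not say how the researchers actually shuffled or partitioned their data.

```python
import random


def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle items and split them into train, validation and test lists."""
    shuffled = list(items)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # everything left over
    return train, val, test


covers = [f"cover_{i:04d}.jpg" for i in range(1000)]  # placeholder file names
train, val, test = split_dataset(covers)
print(len(train), len(val), len(test))
```

    The test portion is kept apart until the very end, so the final score reflects covers the network never saw during training or validation.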

    The result was quite interesting. In about 40% of cases, the correct genre was among the algorithm's top three guesses, and its first guess was exactly right about 20% of the time. This is much better than chance: with 20 genres, random guessing would be right only about 5% of the time. The fact that the network works reasonably well shows that classifying books by their covers is a feasible, albeit difficult, task.
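
    Figures like these are commonly reported as top-1 and top-3 accuracy. The sketch below assumes that reading, namely that the 40% figure counts a prediction as correct when the true genre is among the network's three highest-scoring guesses. The function name `top_k_accuracy` and the toy scores are made up for illustration and are not taken from the paper.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, true_label in zip(scores, labels):
        # Genre indices ordered from highest score to lowest for this cover.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if true_label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy scores for 4 covers over 5 genres (made-up numbers).
scores = [
    [0.10, 0.60, 0.10, 0.10, 0.10],  # true genre 1: first guess is right
    [0.40, 0.30, 0.20, 0.05, 0.05],  # true genre 2: third guess is right
    [0.50, 0.20, 0.15, 0.10, 0.05],  # true genre 4: not in the top three
    [0.30, 0.10, 0.35, 0.20, 0.05],  # true genre 0: second guess is right
]
labels = [1, 2, 4, 0]

top1 = top_k_accuracy(scores, labels, k=1)
top3 = top_k_accuracy(scores, labels, k=3)
print(top1, top3)  # prints 0.25 0.75

# Chance baselines for uniform random guessing over 20 genres.
num_genres = 20
chance_top1 = 1 / num_genres
chance_top3 = 3 / num_genres
```

    With 20 genres, uniform random guessing gives about 5% top-1 and 15% top-3 accuracy, which is the baseline the reported figures should be compared against.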

    Some genres proved easier to recognize than others. For example, travel books and books about computers and technology are relatively easy to identify, since designers usually use similar images for their covers. The researchers also found that the network easily recognized cookbooks when their covers used photographs of food.



    However, the network became unsure when a cookbook's cover showed only a photograph of a chef or other objects indirectly related to cooking.

    Biographies and memoirs also gave the network trouble: it very often assigned such books to the history category. Interestingly, for many of these books the secondary genre on Amazon.com was in fact history, so it cannot be said that the algorithm was entirely wrong.



    The CNN confused children's books with comics and graphic novels, and medical books with math textbooks. This is not surprising, given the similarities between these categories. The network also mixed up books on law and religion, which differ in substance but are similar in design: their covers are usually either a single solid color without any drawings or carry abstract images.

    The work has one major drawback: the researchers did not compare their network's performance with a person's ability to determine genres from covers. That would have been an interesting experiment, and an easy one to organize using online crowdsourcing platforms. Until it is carried out, we will not know whether artificial intelligence does the task better than a person. Still, however well people can identify genres by the cover, machines will one day be able to do it faster. It is only a matter of time.

    Nevertheless, the result of this study deserves attention. It can help designers improve their skills when it comes to book covers. One could go even further and teach machines to design covers without human intervention. In the future, this may mean that cover design by a person becomes yet another task consigned to history.

    Graphic design became a subject for machine learning relatively recently. The best-known practical use of neural networks in this area is recognizing the artistic style of famous painters and transferring it to other images. The researchers from Kyushu University pursued a similar goal but went a little further: they tried to reveal the meaning hidden behind the style. As for classification, there have already been attempts to teach neural networks to sort music, pictures and texts by genre.

    The paper is published on arXiv.org (arXiv:1610.09204 [cs.CV]).
