Custom Python cover server for internet radio


    I am a perfectionist who loves everything to be in order. I am happiest when things work exactly the way they should (in my understanding, of course). For a long time I have run my own personal Internet radio based on IceCast-KH + LiquidSoap, and for many years one fact kept me from sleeping peacefully: streaming servers cannot serve cover art for the tracks playing in a stream. And not only in the stream: they cannot do it at all. I switched to IceCast-KH (a fork of IceCast2) only because of one of its killer features: it can pass mp3 tags inside the flv stream (this is needed to display the current track when playing online on the site through a Flash player). Now it is time to close the last open question, serving the covers of the tracks being played, and calm down. Since there were no ready-made solutions, I came up with nothing better than to write my own cover server for .mp3 files. How? Read on.

    Background


    I usually listen to the radio in the car, on a 2-DIN head unit running Android 4.4 KitKat (and at home on a tablet running the same Android). After a long and thoughtful search among existing apps I chose XiiaLive, mainly because it lets you add custom radio stations (a seemingly trivial feature, yet most streaming radio players do not support it: here is the ShoutCast / Uber Stations directory, pick from it and listen to what you are given), and because it can download and display the covers of playing tracks. Not all of them, of course, but it can. The music played, some of the covers showed up, and for a while the inner perfectionist calmed down, but, as it turned out, not for long.

    After a while an extremely unpleasant bug surfaced in the application, caused by incorrect Unicode handling: if the track title and artist name were not in Latin script, the wrong album cover was displayed. Moreover, it was always the same one. And I will tell you even more: for some reason it was always Nyusha. This I could not endure.

    [Image: a screenshot illustrating how XiiaLive encroached on the sacred.]

    I could have waited for the developers to fix the bug, but, judging sensibly, they are unlikely to have covers for everything in rotation at my particular station (they definitely will not have covers for Ishome, Interior Disposition, tmtnsft and, even more so, MΣ $ † ΛMN ΣKCПØNΛ †), so it seemed more correct to write my own API for covers, one that works directly on the local collection of music files and, if possible, is not tied to a specific broadcast server.

    Exploring the issue


    I could not find a description of a standard protocol for serving covers (I assume there is no single standard at all), so I decided to work backwards and see how the big players implement it, in particular the same XiiaLive. Armed with Packet Capture on Android, we capture packets and look at where the application goes and what for:

    GET /songart.php?partner_token=7144969234&title=Umbrella&artist=The+Baseballs&res=hi HTTP/1.1
    User-Agent: Dalvik/1.6.0 (Linux; U; Android 4.4.2; QuadCore-R16 Build/KVT49L)
    Host: api.dar.fm
    Connection: Keep-Alive
    Accept-Encoding: gzip

    HTTP/1.1 200 OK
    Server: Apache/2.2.15 (CentOS)
    X-Powered-By: PHP/5.3.3
    Set-Cookie: PHPSESSID=u5sgs13h1315k9184nvvutaf33; expires=Fri, 03-Aug-2018 18:39:08 GMT; path=/; domain=.dar.fm
    Expires: Thu, 19 Nov 1981 08:52:00 GMT
    Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
    Pragma: no-cache
    X-Train: wreck="mofas16"
    Content-Type: application/xml; charset=UTF-8
    Content-Length: 57
    Accept-Ranges: bytes
    Date: Thu, 03 Aug 2017 18:39:08 GMT
    Via: 1.1 varnish
    Age: 0
    Connection: keep-alive
    X-Served-By: cache-ams4143-AMS
    X-Cache: MISS
    X-Cache-Hits: 0
    X-Timer: S1501785548.973935,VS0,VE390
    

    It turned out that a regular GET request was sent with four variables:

    • partner_token - authorization token; a request without it, or with a wrong token, gets a 403 in return.
    • title - title of the track
    • artist - artist name
    • res - the desired resolution of the picture. Simple trial and error gave the following set of served resolutions (covers are square, so the resolution is described by one number):
      * hi - 1080 px
      * low - 250 px
      * in all other cases - 400 px
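
    As a rough illustration, the four parameters can be pulled out of such a query string with the Python standard library alone. This is a minimal Python 3 sketch, not the actual server code: the function name `parse_cover_request` and the keys of the returned dictionary are made up for the example, and the pixel sizes are the ones listed above.

```python
from urllib.parse import parse_qs

# Pixel sizes observed for the `res` parameter (see the list above)
RES_TO_PIXELS = {"hi": 1080, "low": 250}
DEFAULT_PIXELS = 400


def parse_cover_request(query_string):
    """Split a songart-style query string into its four parameters."""
    params = parse_qs(query_string)  # also decodes '+' and %XX escapes

    def first(name):
        # parse_qs returns a list per key; take the first value or None
        values = params.get(name)
        return values[0] if values else None

    return {
        "token": first("partner_token"),
        "artist": first("artist"),
        "title": first("title"),
        "size": RES_TO_PIXELS.get(first("res"), DEFAULT_PIXELS),
    }


request = parse_cover_request(
    "partner_token=7144969234&title=Umbrella&artist=The+Baseballs&res=hi"
)
print(request["artist"], request["title"], request["size"])
# The Baseballs Umbrella 1080
```

    Any `res` value other than `hi` or `low`, including a missing one, falls back to 400 px.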

    In response, the application expects XML carrying the cover URL, the artist, the track title and the picture size:

    arturl: http://coverartarchive.org/release/c8b16143-e87e-440d-bbb2-5c96615bed2b/2098621288-500.jpg
    artist: The Baseballs
    title:  Tik Tok
    size:   1080

    With its next request the application, as expected, goes to the static server for the picture. If nothing is found for the combination “Artist” + “Track Name”, an empty XML is returned.
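
    For illustration, a reply of this kind can be assembled with `xml.etree.ElementTree`. The element names `arturl`, `artist` and `title` are the ones that the web script later in this article reads. The wrapper elements `songs` / `song` and the `size` element are assumptions made for this sketch, as is the idea that an "empty" reply is just the bare root element.

```python
import xml.etree.ElementTree as ET


def build_response(arturl=None, artist=None, title=None, size=None):
    """Return the XML reply as a string; without a cover URL the reply is empty."""
    root = ET.Element("songs")  # wrapper name is an assumption
    if arturl:
        song = ET.SubElement(root, "song")  # assumption as well
        ET.SubElement(song, "arturl").text = arturl
        ET.SubElement(song, "artist").text = artist
        ET.SubElement(song, "title").text = title
        ET.SubElement(song, "size").text = str(size)  # hypothetical element name
    return ET.tostring(root, encoding="unicode")


print(build_response("http://static.example/covers/1.jpg", "The Baseballs", "Umbrella", 1080))
print(build_response())
```

    ElementTree escapes special characters such as `&` in artist names, which hand-built strings would not.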


    Design


    Okay, the inputs and outputs of the black box are defined; what remains is to build its internal logic. And, most importantly, to decide where we will get the cover for the requested track.

    Building a separate database of cover pictures, linking it to the tracks somehow and keeping it up to date is more than I am willing to do. Besides, there is no need to multiply entities and set up extra databases and links: the ID3v2 tag format has supported storing covers inside the mp3 files themselves for many years, so we will go into the file for the cover (if there is one, of course). And if the file is not found (or has no cover in it), then instead of an empty XML we had better return one of the radio station's default covers, so that the user does not stare at an empty square.

    In general, I prefer to script everything and do less by hand. For example, here is how adding a file to the rotation works now: I upload the file via ftp / scp to the inbox directory and forget about it. A minute later a service script comes along, finds the file, renames it as needed and moves it to the radio station directory. Once every 10 minutes LiquidSoap re-reads the directory, finds the new file and adds it to the playlist. When a cover request arrives, the script finds the file and extracts the cover from it.

    A good system administrator celebrates even Sysadmin Day automatically.
    By cron.

    True, during implementation and testing the logic became somewhat more complicated. Quite often there is also a cover.jpg in the album directory (for artists who are in rotation with entire albums). There are also numerous artists from SoundCloud / PromoDJ, or simply from vk, who rarely collect tracks into albums or even bother with cover art for a track. For these artists (there are not that many of them) we will create a separate directory on the static server with default covers named after the artist.

    The last question: how do we find the file on disk that corresponds to the requested tags, given that at the start of the search we have only the artist name and the track title? You can keep the mapping “artist, track -> file on disk” in a database; you can walk through the files and compare the mp3 tags in each with the request (but that takes a long time); or, following the principle of not multiplying entities, you can simply store the files on disk under names of the form "%artist% - %title%.mp3". I did just that. At first I used TagScanner by Sergey Serkov for this, in my opinion the best program for the purpose, and then switched to a Python script that renames files into the required format automatically.
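
    With this naming convention the lookup reduces to a file name comparison. The sketch below assumes the music may sit in subdirectories of one root (albums, for example) and that names match exactly, including case; `find_track` and its parameters are hypothetical names, not taken from the real script.

```python
import os


def find_track(music_dir, artist, title):
    """Return the path of '<artist> - <title>.mp3' under music_dir, or None."""
    wanted = "{} - {}.mp3".format(artist, title)
    # Walk the whole tree: tracks may live in per-album subdirectories
    for folder, _subdirs, files in os.walk(music_dir):
        if wanted in files:
            return os.path.join(folder, wanted)
    return None
```

    Because the requested name is only compared against names that really exist on disk, a request cannot point the script outside the music directory.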

    The final logic of the work is as follows:

    • A GET request is received.
    • If the request is empty (contains no GET parameters), empty XML is returned.
    • If token authorization is enabled (the tokens list in the configuration file is not empty), the incoming token is checked. If the token is wrong, 401 Unauthorized is returned.
    • If the request contains the artist and title variables, the local directory of mp3 files is searched. If the file is not found, empty XML is returned. If the file is found, the following checks run in order, and the first one that succeeds gives the answer:
      * Is there already a ready-made cover for this file in the covers directory? If so, return a link to it.
      * If the file contains a cover, extract it into the covers directory and return the link.
      * If the directory with the .mp3 file contains an album cover (a cover.jpg file), copy it to the covers directory and return a link to it.
      * If the artists directory contains a cover named `artist`, return a link to it.
      * If nothing is found at all, return a random picture from the directory of the radio station's default covers.
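
    The fallback chain for a found file can be sketched as a single function. This is a simplified illustration under stated assumptions: the directory layout, the `.jpg` naming and the `extract` callback are hypothetical, and the album cover is returned in place rather than copied to the covers directory.

```python
import os
import random


def pick_cover(track_file, artist, covers_dir, artists_dir, defaults_dir, extract):
    """Return the path of the cover to serve, trying each source in order.

    `extract(track_file, target)` is a hypothetical callback that writes the
    embedded cover to `target` and returns True on success.
    """
    base = os.path.splitext(os.path.basename(track_file))[0]
    cached = os.path.join(covers_dir, base + ".jpg")
    # 1. A cover extracted on an earlier request
    if os.path.isfile(cached):
        return cached
    # 2. A cover embedded in the mp3 file itself
    if extract(track_file, cached):
        return cached
    # 3. cover.jpg lying next to the mp3 file (album cover)
    album_cover = os.path.join(os.path.dirname(track_file), "cover.jpg")
    if os.path.isfile(album_cover):
        return album_cover
    # 4. A default picture named after the artist
    artist_cover = os.path.join(artists_dir, artist + ".jpg")
    if os.path.isfile(artist_cover):
        return artist_cover
    # 5. Any of the station's default covers, or None if there are none
    defaults = sorted(os.listdir(defaults_dir))
    return os.path.join(defaults_dir, random.choice(defaults)) if defaults else None
```

    Passing the extractor in as a callback keeps the ordering logic testable without real mp3 files.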

    Well, now that the logic is defined, all that remains is to express it as functions.

    The code


    To extract covers from mp3 files we use the mutagen module. This function extracts the cover from an mp3 file and writes it to a .jpg file:

    import logging
    import os

    import mutagen
    import mutagen.mp3


    def extract_cover(local_file, covers_dir, cover_name):
        """
        Extracts cover art from mp3 file
        :param local_file: file name (with path)
        :param covers_dir: path to store cover art files
        :param cover_name: name for extracted cover art file
        :rtype: bool
        :return:
            False - file not found or contains no cover art
            True - all ok, cover extracted
        """
        try:
            tags = mutagen.mp3.Open(local_file)
            data = b""
            # ID3v2 keeps pictures in APIC frames; take the first one found
            for key in tags:
                if key.startswith("APIC"):
                    data = tags[key].data
                    break
            if not data:
                return False
            # Picture data is binary, so the file is opened in "wb" mode
            with open(os.path.join(covers_dir, cover_name), "wb") as cover:
                cover.write(data)
            return True
        except (mutagen.MutagenError, IOError, OSError) as err:
            logging.error('extract_cover: cannot extract cover from "%s" into %s: %s',
                          local_file, covers_dir, err)
            return False

    If the file contains a cover and we have extracted it successfully, we resize it to the desired size while preserving the proportions of the picture (because the covers in files are not always standard squares). The Python Imaging Library (PIL), which also knows how to do antialiasing, does a great job of this:

    from PIL import Image


    def resize_image(image_file, new_size):
        """
        Resizes image keeping aspect ratio
        :param image_file: file name (with full path)
        :param new_size: new size of the longer side, in pixels
        :rtype: bool
        :return:
            False - resize unsuccessful or file not found
            True - otherwise
        """
        try:
            with Image.open(image_file) as img:
                if max(img.size) == new_size:
                    return True
                k = float(new_size) / max(img.size)
                # round() avoids losing a pixel to floating-point error
                target = tuple(max(1, int(round(k * x))) for x in img.size)
                # LANCZOS is the antialiasing filter formerly named ANTIALIAS
                new_img = img.resize(target, Image.LANCZOS)
            new_img.save(image_file)
        except (IOError, OSError):
            return False
        return True
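
    The scaling arithmetic can be looked at in isolation, without any image library. A small sketch (the helper name `fit_size` is made up for this example): the scale factor is the target size divided by the longer side, and both sides are multiplied by it.

```python
def fit_size(size, max_side):
    """Scale (width, height) so that the longer side equals max_side."""
    k = float(max_side) / max(size)
    # round() avoids losing a pixel to floating-point error; keep at least 1 px
    return tuple(max(1, int(round(k * side))) for side in size)


print(fit_size((3508, 3508), 1080))  # (1080, 1080)
print(fit_size((1200, 800), 400))    # (400, 267)
```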

    Despite the fact that almost all modern programs can adjust the cover size to the screen size themselves, I would highly recommend doing it yourself, on the server side.

    In my practice there was a case when half of a 15-megabyte .mp3 file (7.62 MB) was taken up by a 3508x3508 cover, with a non-standard color profile to boot. This file reliably hung TagScanner, which I use to edit tags. I do not know how long such a file would take to transfer over a 3G connection, or what would happen to Android when it tried to fit the picture to the screen.

    Since XiiaLive has no setting for choosing a cover server, I had to repoint the api.dar.fm address it uses to my own server. On rooted Android this is simple:

    /etc/hosts
    		api.dar.fm

    And we explain to Nginx that all incoming requests, regardless of where they arrive and what they ask for, are served by our script. At the same time we bring up a virtual host for static content, from which the pictures will be served. Of course, everything could be done within a single host, but it is still better to keep the two apart: the API in one place, the static files in another.

    upstream fcgiwrap_factory {
       server                        unix:/run/fcgiwrap.socket;
       keepalive                     32;
    }
    server {
       listen                        80;
       server_name                   api. api.dar.fm;
       root                          /var/wwws//api;
       access_log                    /var/log/nginx/api.access.log main;
       error_log                     /var/log/nginx/api.error.log;
       location / {
           try_files                 $uri    /api.py?$args;
       }
       location ~ api.py {
           fastcgi_pass              fcgiwrap_factory;
           include                   /etc/nginx/fastcgi_params;
           fastcgi_param             SCRIPT_FILENAME   $document_root$fastcgi_script_name;
       }
    }
    server {
        listen                        80;
        server_name                   static.
        root                          /var/wwws//static;
        access_log                    /var/log/nginx/static.access.log main;
        error_log                     /var/log/nginx/static.error.log;
        index                         index.html;
        location / {
        }
    }
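
    On the other side of fcgiwrap sits an ordinary CGI script: it reads the request from environment variables such as QUERY_STRING (passed on by the standard fastcgi_params) and writes headers, a blank line and the body to standard output. A minimal sketch of such an entry point, covering only the empty-request and token checks; the `TOKENS` list, the function names and the `<songs />` placeholder body are assumptions for the example.

```python
import os
import sys
from urllib.parse import parse_qs

TOKENS = ["7144969234"]   # hypothetical config value; an empty list disables auth
EMPTY_XML = "<songs />"   # placeholder body; the root element name is an assumption


def cgi_reply(body, status="200 OK"):
    """Format a CGI reply: headers, a blank line, then the body."""
    return (
        "Status: {}\r\n"
        "Content-Type: application/xml; charset=UTF-8\r\n"
        "\r\n"
        "{}".format(status, body)
    )


def handle(environ):
    """Serve one request described by CGI environment variables."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    if not params:
        return cgi_reply(EMPTY_XML)
    token = params.get("partner_token", [""])[0]
    if TOKENS and token not in TOKENS:
        return cgi_reply("", status="401 Unauthorized")
    return cgi_reply(EMPTY_XML)  # a real handler would look up the cover here


if __name__ == "__main__":
    sys.stdout.write(handle(os.environ))
```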
    

    After the bugs were fixed and the rough edges polished, it worked. Music plays, pictures are extracted from the mp3 files and placed in the static host directory to be served over the web. In theory, after some time all the covers would migrate from the depths of the mp3 files to the static directory. But, firstly, extracting a cover takes only about 100 ms on average, and secondly, hosting space is not unlimited, so the pictures are later deleted by a very simple Bash one-liner that sits in cron and removes files last accessed more than a week ago:

    find /var/wwws//static/covers/ -maxdepth 1 -type f -iname '*.jpg' -atime +7 -exec rm {} \;

    Of course, for this to work, the partition holding these files must not be mounted with noatime, otherwise access times are never updated.
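
    For those who would rather keep the cleanup in the same language as the server, roughly the same thing can be sketched in Python. The function name and parameters are hypothetical, and the age test is a plain "older than N days" comparison, which only approximates the rounding rules of `find -atime`.

```python
import os
import time


def purge_stale_covers(covers_dir, max_age_days=7, now=None):
    """Delete *.jpg files in covers_dir last accessed over max_age_days ago."""
    now = time.time() if now is None else now
    limit = now - max_age_days * 86400
    removed = []
    for name in sorted(os.listdir(covers_dir)):  # top level only, like -maxdepth 1
        path = os.path.join(covers_dir, name)
        if not (os.path.isfile(path) and name.lower().endswith(".jpg")):
            continue
        if os.stat(path).st_atime < limit:  # st_atime is the last access time
            os.remove(path)
            removed.append(name)
    return removed
```

    The function returns the names it removed, which is convenient for logging from a cron job.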

    [Image: well, that is how it worked.]

    Completion


    A week later I analyzed the server logs and found something interesting: right after launch, the application sends a request of the form:

    GET /songart.php?partner_token=7144969234&res=hi HTTP/1.1" 200 334 "-" "Dalvik/1.6.0 (Linux; U; Android 4.4.2; QuadCore-R16 Build/KVT49L)

    And only some time later:

    GET /songart.php?partner_token=7144969234&title=Summer+Nights&artist=John+Travolta+and+Olivia+Newton-John&res=hi HTTP/1.1" 200 334 "-" "Dalvik/1.6.0 (Linux; U; Android 4.4.2; QuadCore-R16 Build/KVT49L)

    Accordingly, there is no cover on the screen between these two requests, darkness and gloom.

    The mess.

    The reason is clear: at launch the application has not yet managed to extract the tags from the stream and does not know what is playing. Why not help it? We add one more condition to the first step of the program logic:

    • If a GET request arrives with an authorization token but without the artist and track title, return the picture for the track currently playing: from the requested broadcast stream if the stream variable is present, otherwise from the stream we consider the main one.
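
    This extra condition can be sketched as a small resolver. It is an illustration only: `resolve_track` is a made-up name, `params` is assumed to be a plain dictionary of GET parameters, and `now_playing(stream)` stands for any callable that returns an "Artist - Title" string or False.

```python
def resolve_track(params, now_playing, default_stream):
    """Return (artist, title) for a request, falling back to the current track."""
    artist, title = params.get("artist"), params.get("title")
    if artist and title:
        return artist, title
    # No artist/title in the request: ask what is on air right now
    stream = params.get("stream", default_stream)
    current = now_playing(stream)
    if not current or " - " not in current:
        return None
    # Split on the first " - " only, so titles may contain it too
    artist, title = current.split(" - ", 1)
    return artist, title
```

    Splitting on the first separator is a simplification: an artist name that itself contains " - " would be cut in the wrong place.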

    But where do we get the name of the current track? Surely not by parsing server logs. Fortunately, Icecast can report the state of its mount points in XML or JSON format. JSON is more native to Python, so we will use it. Since Icecast-KH does not provide such statistics out of the box, we will use the xsl file from an article by the respected namikiri, slightly modified by me:

    
           {
           
               "":
               {
               "name" : "",
               "listeners" : "",
               "listener_peak" : "",
               "description" : "",
               "title" : "",
               "genre" : "",
               "url" : ""
               }
               ,
           }
       

    We put the file into the web directory of Icecast-KH (on Ubuntu it is /usr/local/share/icecast/web/ by default), and when accessing it over http we get something like this in response:

    {
                "/256":
                {
                "name" : "Radio /256kbps",
                "listeners" : "2",
                "listener_peak" : "5",
                "description" : "mp3, 265kbit",
                "title" : "The Kelly Family - Fell In Love With An Alien",
                "genre" : "Various",
                "url" : ""
                },
                "/128":
                {
                "name" : "Radio /128kbps",
                "listeners" : "0",
                "listener_peak" : "1",
                "description" : "mp3, 128kbit",
                "title" : "The Kelly Family - Fell In Love With An Alien",
                "genre" : "Various",
                "url" : ""
                },
                "/64":
                {
                "name" : "Radio /64kbps",
                "listeners" : "0",
                "listener_peak" : "2",
                "description" : "mp3, 64kbit",
                "title" : "The Kelly Family - Fell In Love With An Alien",
                "genre" : "Various",
                "url" : ""
                }
    }

    As you can see, the radio has three mount points (actually several more) broadcasting the same stream, but with different quality. Well, then everything is very simple:

    import json
    import logging

    try:
        from urllib.request import urlopen  # Python 3
    except ImportError:
        from urllib2 import urlopen  # Python 2


    def get_now_playing(stats_url, stats_stream):
        """
        Returns the currently playing song - artist and title
        :param stats_url: URL of the Icecast stats page (JSON format)
        :param stats_stream: mount point to take the info from
        :return: UTF-8 encoded "Artist - Title", or False on error
        """
        try:
            stats = json.loads(urlopen(stats_url).read())
        except (IOError, ValueError):
            # IOError: stats page unreachable; ValueError: reply is not valid JSON
            logging.error('get_now_playing: Can not open stats url "%s"', stats_url)
            return False
        if stats_stream not in stats:
            logging.error('get_now_playing: Can not find stream "%s" in stats data', stats_stream)
            return False
        return stats[stats_stream]['title'].encode("utf-8")

    The function goes to the specified statistics address and returns the artist and title of the current song from the desired stream. The stream either comes in the request or is taken from the settings as the default.
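
    The lookup itself can be tried offline against an abridged copy of the statistics shown above. In this sketch `title_from_stats` and its parameters are illustrative names, and returning None for an unknown mount point is an assumption about the desired behavior.

```python
import json

# Abridged copy of the stats output shown above
SAMPLE_STATS = """
{
    "/256": {"title": "The Kelly Family - Fell In Love With An Alien"},
    "/128": {"title": "The Kelly Family - Fell In Love With An Alien"}
}
"""


def title_from_stats(stats_text, stream, default_stream):
    """Return the 'title' of `stream`, or of `default_stream` if none was requested."""
    stats = json.loads(stats_text)
    mount = stream or default_stream
    if mount not in stats:
        return None
    return stats[mount]["title"]


print(title_from_stats(SAMPLE_STATS, None, "/256"))
# The Kelly Family - Fell In Love With An Alien
```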

    Web


    Now it is time to take care of the site. For online playback I have long used the free Flash player from uppod with minimal settings, which reads the /flv stream and displays the current track during playback. It looks like this:

    image

    To display the current track when the player is minimized or inactive, I, like many others who have faced this problem, until recently used a .php script on the server that went to Icecast for statistics and returned a line with the name of the track being played. It is time to get rid of the intermediate steps, and I would also like to show covers on the site during online playback, now that I can serve them.

    The problem is solved in two steps:

    Add a custom header to the Nginx configuration for api that allows you to access it through jQuery from another host:

    add_header                      Access-Control-Allow-Origin *;

    And we put the following script in the body of the web page of the radio station:

    var now_playing = '';
    setInterval(function () {
       jQuery.ajax(
           {
               type: "GET",
               url: "http://api./?partner_token=&stream=/",
               dataType: "xml",
               success: xmlParser
           })
    }, 5000);
    function xmlParser(xml) {
       var parsedXml = jQuery(xml);
       var title = parsedXml.find('title').text();
       var artist = parsedXml.find('artist').text();
       var arturl = parsedXml.find('arturl').text();
       var song = artist.concat(" — ").concat(title);
       if (now_playing !== song) {
           jQuery('div.now_playing').html(song);
           jQuery('div.cover_art').html(arturl);
           now_playing = song;
       }
    };

    As you can see, once every five seconds the script goes to the same place as the application, authenticates there, receives the XML and takes from it the currently playing track and the link to the cover. If they have changed since the last check, it writes them into the corresponding divs of the radio station's web page. I ask the front-end developers in advance not to swear at the possible clumsiness of the script: this is the first (well, the second) time in my life that I have seen jQuery. The script may be ugly, but it works great.

    [Image: under the player, another div has been added in which the covers change dynamically.]

    Conclusion


    With this, all the tasks I set out are solved. The radio broadcasts as it has for many years, but now it also displays the covers of the playing tracks, and does it right. The little perfectionist inside my head is asleep, snoring contentedly, and no longer distracts me from work.

    I understand that the topic is rather narrow and may interest only a small circle of people, but I think my experience will still be useful to someone. So the full text of all the code described above, plus sample Nginx settings and installation instructions, is available on GitHub.

    All the music!
