Getting your favorite audio recordings from pandora.com

    For those who don't know, pandora.com is an Internet radio service that picks songs according to the user's preferences. Recently, a friend of mine wanted to download the list of their favorite audio recordings, but Pandora itself offers no such option. So I had to dig into its guts...


    So, from Pandora we will get a list of song titles and artist names, and then download the songs using the VK (VKontakte) API.

    Step 1. We go to Pandora and see what happens when we request the list of favorite songs. We observe the following request:
    Request URL:http://www.pandora.com/content/tracklikes?likeStartIndex=0&thumbStartIndex=5&webname=evgeny.vyalyy&cachebuster=1367100054190
    Request Method:GET
    Status Code:200 OK
    Request Headers
    Accept:*/*
    Accept-Charset:windows-1251,utf-8;q=0.7,*;q=0.3
    Accept-Encoding:gzip,deflate,sdch
    Accept-Language:ru,en-US;q=0.8,en;q=0.6
    Cookie:at=wNCFSbEDa7LTetjSbEwrXhkSGCSClV6j9vdiwaygcF8uwpsRlRg7usr3YsGsoHBfLJI3/y+zfNsMtHtvG5AA2Qg%3D%3D; v3ad=1:20:1:48206::5:0:0:0:505:011:MI:26163:0:1:0:0; __utma=118078728.1866197791.1367091864.1367091864.1367098565.2; __utmb=118078728.4.10.1367098565; __utmc=118078728; __utmz=118078728.1367091864.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); v2regbstage=true; atn=AT-1367099945481-858
    Host:www.pandora.com
    Proxy-Connection:keep-alive
    Referer:http://www.pandora.com/profile/likes/evgeny.vyalyy
    User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.22 (KHTML, like Gecko) Ubuntu Chromium/25.0.1364.160 Chrome/25.0.1364.160 Safari/537.22
    X-Requested-With:XMLHttpRequest
    Query String Parameters
    likeStartIndex:0
    thumbStartIndex:5
    webname:evgeny.vyalyy
    cachebuster:1367100054190
    


    Let's try to reproduce this request. We use the combination of Python requests + BeautifulSoup:

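    import requests
    import BeautifulSoup

    # Replay the observed request, passing the browser's cookies along.
    resp = requests.get("http://www.pandora.com/content/tracklikes?likeStartIndex=0&thumbStartIndex=5&webname=evgeny.vyalyy&cachebuster=1367100054190",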
        headers={"Cookie":"at=wNCFSbEDa7LTetjSbEwrXhkSGCSClV6j9vdiwaygcF8uwpsRlRg7usr3YsGsoHBfLJI3/y+zfNsMtHtvG5AA2Qg%3D%3D; v3ad=1:20:1:48206::5:0:0:0:505:011:MI:26163:0:1:0:0; __utma=118078728.1866197791.1367091864.1367091864.1367098565.2; __utmb=118078728.4.10.1367098565; __utmc=118078728; __utmz=118078728.1367091864.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); v2regbstage=true; atn=AT-1367099945481-858"})
    soup = BeautifulSoup.BeautifulSoup(resp.text)
    print soup
    


    We get a lot of not very informative HTML.

    But our request contains a suspicious number of parameters. Let's try to trim it down a little:

    # Same request without the cachebuster parameter and with only the at cookie.
    resp = requests.get("http://www.pandora.com/content/tracklikes?likeStartIndex=0&thumbStartIndex=5&webname=evgeny.vyalyy",
        headers={"Cookie":"at=wNCFSbEDa7LTetjSbEwrXhkSGCSClV6j9vdiwaygcF8uwpsRlRg7usr3YsGsoHBfLJI3/y+zfNsMtHtvG5AA2Qg%3D%3D;"})
    soup = BeautifulSoup.BeautifulSoup(resp.text)
    print soup
    


    Hooray, the response has not changed!
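    Since only three query parameters and the at cookie turned out to matter, the request can be assembled from its parts. Here is a minimal sketch that uses only the Python 3 standard library instead of requests. The helper name build_likes_request and its arguments are made up for illustration, and since the meaning of likeStartIndex and thumbStartIndex is not documented, the defaults are simply the values observed in the browser.

```python
from urllib.parse import urlencode
from urllib.request import Request

LIKES_URL = "http://www.pandora.com/content/tracklikes"


def build_likes_request(webname, at_cookie, like_start=0, thumb_start=5):
    """Build (but do not send) the trimmed-down request for the likes page."""
    query = urlencode({
        "likeStartIndex": like_start,
        "thumbStartIndex": thumb_start,
        "webname": webname,
    })
    # Only the "at" cookie is passed, as in the shortened request.
    return Request(LIKES_URL + "?" + query, headers={"Cookie": "at=" + at_cookie})


if __name__ == "__main__":
    # The cookie value here is a placeholder, not a real session cookie.
    req = build_likes_request("evgeny.vyalyy", "PASTE_AT_COOKIE_HERE")
    print(req.full_url)
```

    Passing the returned object to urllib.request.urlopen would actually send the request; the sketch stops short of that so it can be run without a valid cookie.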
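    Now, digging into the response, we find that all the information is stored in div elements with the infobox-body class. Judging by the code that follows, each such div contains a link with the song title and, inside a p element, a link with the artist name.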



    So, now we can pull out all the information we are interested in:

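    import re
    # Grabs the text between a tag's ">" and the next "<", i.e. the link text.
    PATT = re.compile(">(.*?)<")
    for x in soup.findAll(attrs={"class":"infobox-body"}):
        # x.a is the first link in the div, x.p.a is the link inside its <p>.
        print [PATT.findall(str(x.a))[0], PATT.findall(str(x.p.a))[0]]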
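    For comparison, the same extraction can be sketched without BeautifulSoup and without the regular expression, using html.parser from the Python 3 standard library. This is a different technique from the one the script uses. The sample markup is an assumption inferred from the x.a and x.p.a accesses (title link first, artist link inside a p), and the class name LikesParser is made up for illustration.

```python
from html.parser import HTMLParser


class LikesParser(HTMLParser):
    """Collect (title, artist) pairs from div elements of class infobox-body."""

    def __init__(self):
        super().__init__()
        self.tracks = []      # resulting list of (title, artist) tuples
        self._div_depth = 0   # > 0 while we are inside an infobox-body div
        self._in_p = False    # True while inside a <p> of that div
        self._target = None   # "title" or "artist" while inside a relevant <a>
        self._current = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self._div_depth:
                self._div_depth += 1  # nested div inside the box
            elif "infobox-body" in (attrs.get("class") or "").split():
                self._div_depth = 1
                self._current = {}
        elif self._div_depth:
            if tag == "p":
                self._in_p = True
            elif tag == "a":
                # Assumption: a link outside <p> is the title, inside <p> the artist.
                key = "artist" if self._in_p else "title"
                if key not in self._current:
                    self._current[key] = ""
                    self._target = key

    def handle_data(self, data):
        if self._target:
            self._current[self._target] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._target = None
        elif tag == "p":
            self._in_p = False
        elif tag == "div" and self._div_depth:
            self._div_depth -= 1
            if self._div_depth == 0 and "title" in self._current:
                self.tracks.append((self._current["title"].strip(),
                                    self._current.get("artist", "").strip()))


def parse_likes(html):
    parser = LikesParser()
    parser.feed(html)
    return parser.tracks
```

    Calling parse_likes(resp.text) on the response would then give a list of (title, artist) tuples, provided the real markup matches the assumed shape.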
    


    The first step is over! =)

    Step 2. Searching for and downloading the recordings from vk.com

    Go to vk.com/editapp?act=create and create a new application. Now we need to get an access_token. To avoid unnecessary pain, I decided to get the access_token manually and simply paste it into the body of the script. So, go to

    oauth.vk.com/authorize?client_id=3608669&scope=audio&redirect_uri=https://oauth.vk.com/blank&display=wap&response_type=token

    You will be redirected to a new page:

    oauth.vk.com/blank.html#access_token=***&expires_in=86400&user_id=17738938

    We pull the access_token we need out of the URL fragment (the part after #). We will use it for requests to the VK API. Now we write a small audio search function:

    import requests
    ACCESS_TOKEN = "***"  # paste the token obtained above

    def audio_search(string):
        # params= lets requests URL-encode the query string properly
        resp = requests.get("https://api.vk.com/method/audio.search",
            params={"q": string, "sort": 2, "access_token": ACCESS_TOKEN})
        return resp.json()
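    The access_token itself was copied by hand from the redirect URL, but that step can also be done in code. A minimal sketch with the Python 3 standard library follows. The function name extract_access_token is made up for illustration, and the only assumption about the URL is the one visible in the redirect: the token travels in the fragment as access_token=...

```python
from urllib.parse import parse_qs, urlsplit


def extract_access_token(redirect_url):
    """Return the access_token carried in the fragment of the redirect URL."""
    # Everything after "#" is the fragment; it is formatted like a query string.
    fragment = urlsplit(redirect_url).fragment
    params = parse_qs(fragment)
    tokens = params.get("access_token")
    if not tokens:
        raise ValueError("no access_token in the URL fragment")
    return tokens[0]


if __name__ == "__main__":
    url = "https://oauth.vk.com/blank.html#access_token=abc123&expires_in=86400&user_id=17738938"
    print(extract_access_token(url))
```

    The expires_in value in the same fragment suggests the token is short-lived, so it will need to be obtained again for later runs.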
    


    The audio_search function returns the most popular search result for the given string (among audio recordings).
    A sample response:

    >>> audio_search("My little horse")
    {u'response': [1, {u'album': u'27504721', u'artist': u'\u041d\u0435\u0438\u0437\u0432\u0435\u0441\u0442\u0435\u043d', u'url': u'http://cs521522.vk.me/u3391535/audios/746ddef4902c.mp3', u'title': u'my little horse', u'duration': 208, u'aid': 159749117, u'owner_id': 3391535}]}
    


    Now we know the URL to download from. The file can be fetched with the standard function urllib.urlretrieve.
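    The two steps, picking the track out of the reply and saving the file, might look roughly like this. This is a sketch under assumptions, not the code of the linked script. It is written for Python 3, where urlretrieve lives in urllib.request. The helper names first_track and download_track are made up, and the reply layout (a number followed by track dictionaries) is taken from the sample response.

```python
import os
from urllib.request import urlretrieve


def first_track(search_result):
    """Return the first track dict from an audio.search reply, or None."""
    items = search_result.get("response", [])
    # In the sample reply the first element is a number; the tracks follow it.
    tracks = [item for item in items if isinstance(item, dict)]
    return tracks[0] if tracks else None


def download_track(track, folder="audio"):
    """Save the track's mp3 into folder and return the local path."""
    os.makedirs(folder, exist_ok=True)
    # Naming the file after the numeric aid sidesteps special characters in titles.
    path = os.path.join(folder, "%d.mp3" % track["aid"])
    urlretrieve(track["url"], path)
    return path
```

    A typical call would be download_track(first_track(audio_search("My little horse"))), after checking that first_track did not return None.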

    So we got the following script:

    yadi.sk/d/7bP26GIQ4POa6

    How to work with it:

    1) The script requires the requests and BeautifulSoup packages (sudo pip install requests BeautifulSoup)
    2) You need to get the value of the at=... cookie from pandora.com (see above)
    3) You need to get ACCESS_TOKEN as described above
    4) You need to set the COUNT_OF_SONGS parameter: the number of songs you want to download (None if you need to download everything)
    5) DOWNLOAD_FOLDER_NAME = "audio" is the directory where the downloaded music will be saved
    6) LOGIN is your login on pandora.com

    The corresponding parameters should be set in the body of the script.
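    The linked script is not reproduced here, so how it applies these settings internally is unknown. As an illustration only, here is one way a COUNT_OF_SONGS value of None can mean "everything": in Python a slice bound of None leaves the list uncut. The helper select_songs is hypothetical.

```python
COUNT_OF_SONGS = None           # None means "download everything"
DOWNLOAD_FOLDER_NAME = "audio"  # directory for the downloaded music


def select_songs(songs, count=COUNT_OF_SONGS):
    """Keep the first `count` songs; a slice bound of None keeps them all."""
    return songs[:count]


if __name__ == "__main__":
    liked = [("Song A", "Artist A"), ("Song B", "Artist B"), ("Song C", "Artist C")]
    print(len(select_songs(liked)))     # no limit set
    print(len(select_songs(liked, 2)))  # limited to two songs
```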
    Listen to your favorite music, and remember that piracy is a sin =)

    UPD. I accidentally forgot to update the login code. My apologies.
    UPD2. At the request of user DenimTornado, here is the same script for lastfm:

    yadi.sk/d/U7kAZFZh4P5Yz

    Parameters for setting:
    • COUNT_OF_SONGS = None - the number of songs. By default, download everything (but not more than 1000)
    • ACCESS_TOKEN = "" - see above how to get it
    • LOGIN = "sallyruthstruik" - your login on lastfm


    UPD3. From user Setti:

    A modified version of the lastfm script:
    yadi.sk/d/tagClpSf4VsqQ

    + BeautifulSoup is now bundled in the folder with the script, so it no longer needs to be installed
    + In the old version, the search used only the track name. Now the artist name is used as well; otherwise VK simply returns anything.
    + Fixed naming of downloaded files: special characters are removed
    + File names that are too long are truncated
    + The lastfm request settings limit and page are exposed separately. Now you can load tracks page by page in batches of 10, 50, 100, 500, etc. If you have too many tracks, or you want to check the download result on a sample slice, set the page and limit parameters accordingly
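    The two file-name fixes from this list could be sketched as follows. This is not Setti's code, only an illustration in Python 3; the function name safe_filename, the set of allowed characters, and the 100-character limit are all assumptions.

```python
import re


def safe_filename(artist, title, max_length=100, extension=".mp3"):
    """Build 'artist - title.mp3' without special characters, capped in length."""
    name = "%s - %s" % (artist, title)
    # Drop everything except letters, digits, underscores, spaces, dots and hyphens.
    name = re.sub(r"[^\w .\-]", "", name)
    # Collapse runs of whitespace left behind by the removals.
    name = re.sub(r"\s+", " ", name).strip()
    # Truncate the stem so that stem + extension fits into max_length.
    stem = name[:max_length - len(extension)].rstrip()
    return stem + extension


if __name__ == "__main__":
    print(safe_filename("AC/DC", "T.N.T?"))
```

    Because \w is Unicode-aware for Python 3 strings, Cyrillic artist names such as the one in the sample VK response survive the cleanup.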
