sallyruthstruik April 28, 2013 at 02:31

Getting your favorite audio recordings from pandora.com

For those who don't know, pandora.com is an Internet radio service that selects songs according to the user's preferences. Recently, a friend of mine wanted to download a list of their favorite audio recordings, but Pandora itself offers no such option. So I had to dig into its internals...

So, we will get a list of song and artist names from Pandora, then download the songs using the VK (vk.com) API.

Step 1. We go to Pandora and see what happens when we request the list of favorite songs. We observe the following request:

Request URL:http://www.pandora.com/content/tracklikes?likeStartIndex=0&thumbStartIndex=5&webname=evgeny.vyalyy&cachebuster=1367100054190
Request Method:GET
Status Code:200 OK
Request Headers
Accept:*/*
Accept-Charset:windows-1251,utf-8;q=0.7,*;q=0.3
Accept-Encoding:gzip,deflate,sdch
Accept-Language:ru,en-US;q=0.8,en;q=0.6
Cookie:at=wNCFSbEDa7LTetjSbEwrXhkSGCSClV6j9vdiwaygcF8uwpsRlRg7usr3YsGsoHBfLJI3/y+zfNsMtHtvG5AA2Qg%3D%3D; v3ad=1:20:1:48206::5:0:0:0:505:011:MI:26163:0:1:0:0; __utma=118078728.1866197791.1367091864.1367091864.1367098565.2; __utmb=118078728.4.10.1367098565; __utmc=118078728; __utmz=118078728.1367091864.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); v2regbstage=true; atn=AT-1367099945481-858
Host:www.pandora.com
Proxy-Connection:keep-alive
Referer:http://www.pandora.com/profile/likes/evgeny.vyalyy
User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.22 (KHTML, like Gecko) Ubuntu Chromium/25.0.1364.160 Chrome/25.0.1364.160 Safari/537.22
X-Requested-With:XMLHttpRequest
Query String Parameters
likeStartIndex:0
thumbStartIndex:5
webname:evgeny.vyalyy
cachebuster:1367100054190

Let's try to reproduce this request. We use the combination of Python's requests and BeautifulSoup:

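import requests
import BeautifulSoup  # BeautifulSoup 3, the package installed by "pip install BeautifulSoup"

# Replay the request observed in the browser, passing the same cookies
resp = requests.get("http://www.pandora.com/content/tracklikes?likeStartIndex=0&thumbStartIndex=5&webname=evgeny.vyalyy&cachebuster=1367100054190", 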
    headers={"Cookie":"at=wNCFSbEDa7LTetjSbEwrXhkSGCSClV6j9vdiwaygcF8uwpsRlRg7usr3YsGsoHBfLJI3/y+zfNsMtHtvG5AA2Qg%3D%3D; v3ad=1:20:1:48206::5:0:0:0:505:011:MI:26163:0:1:0:0; __utma=118078728.1866197791.1367091864.1367091864.1367098565.2; __utmb=118078728.4.10.1367098565; __utmc=118078728; __utmz=118078728.1367091864.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); v2regbstage=true; atn=AT-1367099945481-858"})
soup = BeautifulSoup.BeautifulSoup(resp.text)
print soup

We get a large amount of not very informative HTML.

But our request contains suspiciously many parameters. Let's try to shorten it a little, dropping cachebuster and every cookie except at:

resp = requests.get("http://www.pandora.com/content/tracklikes?likeStartIndex=0&thumbStartIndex=5&webname=evgeny.vyalyy", 
    headers={"Cookie":"at=wNCFSbEDa7LTetjSbEwrXhkSGCSClV6j9vdiwaygcF8uwpsRlRg7usr3YsGsoHBfLJI3/y+zfNsMtHtvG5AA2Qg%3D%3D;"})
soup = BeautifulSoup.BeautifulSoup(resp.text)
print soup

Hooray, the response has not changed!
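The shortened request can also be assembled from its parts instead of one long string. This is only a sketch: the helper names build_tracklikes_url and build_headers are made up for illustration, and the defaults simply repeat the values seen in the captured request (what thumbStartIndex=5 means is not explained by the capture, so it is kept as-is).

```python
from urllib.parse import urlencode

# Endpoint taken from the captured request above.
BASE_URL = "http://www.pandora.com/content/tracklikes"


def build_tracklikes_url(webname, like_start=0, thumb_start=5):
    """Build the tracklikes URL without the cachebuster parameter.

    Hypothetical helper: the defaults repeat the captured values.
    """
    params = {
        "likeStartIndex": like_start,
        "thumbStartIndex": thumb_start,
        "webname": webname,
    }
    return BASE_URL + "?" + urlencode(params)


def build_headers(at_cookie):
    """Send only the 'at' cookie, which was enough in the experiment above."""
    return {"Cookie": "at=%s;" % at_cookie}
```

With requests installed, the call would then look like requests.get(build_tracklikes_url("evgeny.vyalyy"), headers=build_headers(AT_COOKIE)), where AT_COOKIE is your own cookie value.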
Now, digging into the response, we find that all the information is stored in divs with the infobox-body class. Here is the text content of one such div:

Sweet Home Alabama (Live From Freedom Hall)

						by Lynyrd Skynyrd
You liked this on The Offspring Radio.

So, now we can pull out all the information we are interested in:

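import re
# Matches the text between a tag's closing ">" and the next "<"
PATT = re.compile(">(.*?)<")
for x in soup.findAll(attrs={"class":"infobox-body"}):
    # x.a is the first link in the div (song title), x.p.a the first link in its first <p> (artist)
    print [PATT.findall(str(x.a))[0], PATT.findall(str(x.p.a))[0]]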
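For readers without BeautifulSoup, roughly the same extraction can be sketched with the standard library alone. The markup in SAMPLE is an assumption: the real tags are not shown above, so it is only guessed from the x.a and x.p.a accesses. The names LikesParser and parse_likes are made up, and this version takes the first two links in each div rather than following x.p.a exactly.

```python
from html.parser import HTMLParser

# Hypothetical markup, guessed from the x.a / x.p.a accesses above.
SAMPLE = """
<div class="infobox-body">
  <h3><a href="#">Sweet Home Alabama (Live From Freedom Hall)</a></h3>
  <p>by <a href="#">Lynyrd Skynyrd</a></p>
  <p>You liked this on <a href="#">The Offspring Radio</a>.</p>
</div>
"""


class LikesParser(HTMLParser):
    """Collect [title, artist] from the first two links of each infobox-body div."""

    def __init__(self):
        super().__init__()
        self.tracks = []       # finished [title, artist] pairs
        self._depth = 0        # div nesting depth inside an infobox-body div
        self._in_link = False  # True while inside an <a> within such a div
        self._links = []       # link texts seen in the current infobox

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self._depth or "infobox-body" in classes:
                self._depth += 1
        elif tag == "a" and self._depth:
            self._in_link = True
            self._links.append("")

    def handle_data(self, data):
        if self._in_link:
            self._links[-1] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False
        elif tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # The infobox is closed: keep its first two link texts.
                if len(self._links) >= 2:
                    self.tracks.append(
                        [self._links[0].strip(), self._links[1].strip()]
                    )
                self._links = []


def parse_likes(html):
    parser = LikesParser()
    parser.feed(html)
    return parser.tracks
```

If the real page nests its links differently, the "first two links" rule would need adjusting.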

The first step is over! =)

Step 2. Searching for and downloading recordings from vk.com

Go to vk.com/editapp?act=create and create a new application. Now we need to get an access_token. To keep things simple, I decided to get the access_token manually and just paste it into the body of the script. So, go to:

oauth.vk.com/authorize?client_id=3608669&scope=audio&redirect_uri=https://oauth.vk.com/blank&display=wap&response_type=token

We are redirected to a new page:

oauth.vk.com/blank.html#access_token=***&expires_in=86400&user_id=17738938

We pull the access_token we need out of the URL fragment (the part after #). We will use it for requests to the VK API. We write a small audio search function:

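import requests as r
ACCESS_TOKEN = "***"  # paste the token obtained above

def audio_search(string):
    # sort=2 asks VK to order results by popularity; params= URL-encodes the query
    resp = r.get("https://api.vk.com/method/audio.search",
        params={"q": string, "sort": 2, "access_token": ACCESS_TOKEN})
    return resp.json()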

It returns the most popular search result (among audio recordings) for the query string.
The function's response looks like this:

>>> audio_search("My little horse")
{u'response': [1, {u'album': u'27504721', u'artist': u'\u041d\u0435\u0438\u0437\u0432\u0435\u0441\u0442\u0435\u043d', u'url': u'http://cs521522.vk.me/u3391535/audios/746ddef4902c.mp3', u'title': u'my little horse', u'duration': 208, u'aid': 159749117, u'owner_id': 3391535}]}

Now we know the URL to download from. The file can be fetched with the standard function urllib.urlretrieve.
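A minimal sketch of that last step, written for Python 3 (where urlretrieve lives in urllib.request). The helper names first_track and download_track are made up, and the response layout is assumed to match the sample above: a list whose first element is a count, followed by track dicts.

```python
import os
from urllib.request import urlretrieve


def first_track(result):
    """Return the first track dict from an audio.search-style result, or None."""
    items = result.get("response") or []
    # In the sample response the first element is a count, so keep only dicts.
    tracks = [item for item in items if isinstance(item, dict)]
    return tracks[0] if tracks else None


def download_track(track, folder="audio"):
    """Save the track's mp3 into `folder` and return the local path."""
    os.makedirs(folder, exist_ok=True)
    # Note: artist and title may contain characters that are not valid
    # in file names; they are used unchanged here for brevity.
    filename = "%s - %s.mp3" % (track["artist"], track["title"])
    path = os.path.join(folder, filename)
    urlretrieve(track["url"], path)
    return path
```

Usage would be along the lines of track = first_track(audio_search("My little horse")), followed by download_track(track) if track is not None.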

So we end up with the following script:

yadi.sk/d/7bP26GIQ4POa6

How to work with it:

1) The script requires installed requests and BeautifulSoup packages (sudo pip install requests BeautifulSoup)
2) You need to get the value of the at=... cookie from pandora.com (see above)
3) You need to get ACCESS_TOKEN as done above
4) You need to set the parameter COUNT_OF_SONGS - the number of songs you want to download (None, if you need to download everything)
5) DOWNLOAD_FOLDER_NAME = "audio" - the directory where the downloaded music will be saved.
6) LOGIN - your login on pandora.com

The corresponding parameters should be written in the body of the script.
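For item 3, the token does not have to be copied by hand: it can be cut out of the redirect address pasted from the browser. A small sketch for Python 3 follows; the function name token_from_redirect is made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def token_from_redirect(url):
    """Return the access_token stored in the URL fragment, or None if absent."""
    # Everything after '#' is the fragment; it is formatted like a query string.
    fragment = urlsplit(url).fragment
    values = parse_qs(fragment).get("access_token")
    return values[0] if values else None
```

For example, passing the address copied from the browser's address bar after authorization returns the value to put into ACCESS_TOKEN.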
Listen to your favorite music, and remember that piracy is a sin =)

UPD. I accidentally forgot to update the login code. My apologies.
UPD2. At the request of user DenimTornado, here is the same script for lastfm:

yadi.sk/d/U7kAZFZh4P5Yz

Parameters for setting:

COUNT_OF_SONGS = None - the number of songs. By default, download everything (but not more than 1000)
ACCESS_TOKEN = "" - see above how to get it
LOGIN = "sallyruthstruik" - your login on lastfm

UPD3

From user Setti:

Modified version of the LastFM script:
yadi.sk/d/tagClpSf4VsqQ

+ Added BeautifulSoup in the folder with the script. Now it is not necessary to install it
+ In the old version, the search used only the track name. Now the artist name is used as well; otherwise VK simply returns anything.
+ Fixed naming of downloaded files: special characters are deleted
+ File names that are too long are truncated
+ Moved the lastfm request parameters limit and page into separate settings. Now you can download tracks page by page in batches of 10, 50, 100, 500, etc. If you have too many tracks, or you want to check the download result on a small slice first, set the page and limit parameters accordingly.
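The file-naming fixes from this list could look roughly like the sketch below. It is not Setti's actual code: the function name safe_filename, the set of allowed characters, and the 100-character limit are all assumptions chosen for illustration.

```python
import re


def safe_filename(artist, title, max_len=100, ext=".mp3"):
    """Build 'artist - title.mp3' with special characters removed and length capped."""
    name = "%s - %s" % (artist, title)
    # Keep letters, digits, underscores, spaces and a few harmless punctuation
    # marks; drop everything else (slashes, quotes, question marks, ...).
    name = re.sub(r"[^\w \-().,']", "", name)
    # Collapse runs of whitespace left behind by the removal.
    name = re.sub(r"\s+", " ", name).strip()
    # Truncate so that the name plus extension fits into max_len characters.
    name = name[:max_len - len(ext)].rstrip()
    return (name or "track") + ext
```

In Python 3, \w also matches non-Latin letters, so Cyrillic artist names survive the cleanup.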
