Getting your favorite audio recordings from pandora.com
For those who don't know, pandora.com is an Internet radio service that picks songs according to the user's preferences. Recently, a friend of mine wanted to download the list of their favorite recordings, but Pandora itself offers no such option. So I had to dig into its internals...
So, from Pandora we will get a list of song titles and artist names, and then download the songs using the VK (vk.com) API.
Step 1. We go to Pandora and see what happens when we request the list of favorite songs. We observe the following request:
Request URL: http://www.pandora.com/content/tracklikes?likeStartIndex=0&thumbStartIndex=5&webname=evgeny.vyalyy&cachebuster=1367100054190
Request Method: GET
Status Code: 200 OK

Request Headers
Accept: */*
Accept-Charset: windows-1251,utf-8;q=0.7,*;q=0.3
Accept-Encoding: gzip,deflate,sdch
Accept-Language: ru,en-US;q=0.8,en;q=0.6
Cookie: at=wNCFSbEDa7LTetjSbEwrXhkSGCSClV6j9vdiwaygcF8uwpsRlRg7usr3YsGsoHBfLJI3/y+zfNsMtHtvG5AA2Qg%3D%3D; v3ad=1:20:1:48206::5:0:0:0:505:011:MI:26163:0:1:0:0; __utma=118078728.1866197791.1367091864.1367091864.1367098565.2; __utmb=118078728.4.10.1367098565; __utmc=118078728; __utmz=118078728.1367091864.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); v2regbstage=true; atn=AT-1367099945481-858
Host: www.pandora.com
Proxy-Connection: keep-alive
Referer: http://www.pandora.com/profile/likes/evgeny.vyalyy
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.22 (KHTML, like Gecko) Ubuntu Chromium/25.0.1364.160 Chrome/25.0.1364.160 Safari/537.22
X-Requested-With: XMLHttpRequest

Query String Parameters
likeStartIndex: 0
thumbStartIndex: 5
webname: evgeny.vyalyy
cachebuster: 1367100054190
Let's try to reproduce this request. We use the combination of Python's requests and BeautifulSoup:
import requests
import BeautifulSoup  # BeautifulSoup 3, the package installed further below

# Replay the browser's request, passing along all of its cookies
resp = requests.get("http://www.pandora.com/content/tracklikes?likeStartIndex=0&thumbStartIndex=5&webname=evgeny.vyalyy&cachebuster=1367100054190",
    headers={"Cookie":"at=wNCFSbEDa7LTetjSbEwrXhkSGCSClV6j9vdiwaygcF8uwpsRlRg7usr3YsGsoHBfLJI3/y+zfNsMtHtvG5AA2Qg%3D%3D; v3ad=1:20:1:48206::5:0:0:0:505:011:MI:26163:0:1:0:0; __utma=118078728.1866197791.1367091864.1367091864.1367098565.2; __utmb=118078728.4.10.1367098565; __utmc=118078728; __utmz=118078728.1367091864.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); v2regbstage=true; atn=AT-1367099945481-858"})
soup = BeautifulSoup.BeautifulSoup(resp.text)
print(soup)
We get a large amount of not very informative HTML.
But our request carries a suspiciously large number of parameters. Let's try to trim it down a little:
# Same request without the cachebuster parameter, sending only the "at" cookie
resp = requests.get("http://www.pandora.com/content/tracklikes?likeStartIndex=0&thumbStartIndex=5&webname=evgeny.vyalyy",
    headers={"Cookie":"at=wNCFSbEDa7LTetjSbEwrXhkSGCSClV6j9vdiwaygcF8uwpsRlRg7usr3YsGsoHBfLJI3/y+zfNsMtHtvG5AA2Qg%3D%3D;"})
soup = BeautifulSoup.BeautifulSoup(resp.text)
print(soup)
Hooray, the response has not changed!
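To make the trimming concrete, here is a minimal Python 3 sketch that uses only the standard library. It keeps just the at cookie from a full Cookie header and rebuilds the query string without cachebuster. The helper names minimal_cookie and likes_url are made up for illustration, and the cookie value in the usage lines is a placeholder, not a real token.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.pandora.com/content/tracklikes"


def minimal_cookie(cookie_header, name="at"):
    """Return only the `name=value` pair from a full Cookie header."""
    for pair in cookie_header.split(";"):
        key, sep, value = pair.strip().partition("=")
        if sep and key == name:
            return "%s=%s" % (key, value)
    raise KeyError("cookie %r not found" % name)


def likes_url(webname, like_start=0, thumb_start=5):
    """Build the shortened request URL (no cachebuster parameter)."""
    query = urlencode([
        ("likeStartIndex", like_start),
        ("thumbStartIndex", thumb_start),
        ("webname", webname),
    ])
    return BASE_URL + "?" + query


# Placeholder cookie header, shaped like the one captured in the browser.
full = "at=PLACEHOLDER%3D%3D; v3ad=1:20:1; __utmc=118078728; atn=AT-1-858"
print(minimal_cookie(full))
print(likes_url("evgeny.vyalyy"))
```

Whether Pandora needs any cookie other than at is something only the experiment above can tell; the sketch merely automates the manual trimming.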
Now, digging into the response, we find that all the information is stored in div elements with the infobox-body class. Here's what this div looks like:
So, now we can pull out all the information we are interested in:
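import re
# Capture the text between a tag's closing ">" and the next "<"
PATT = re.compile(">(.*?)<")
for x in soup.findAll(attrs={"class":"infobox-body"}):
    # The first link and the link inside <p> hold the song and artist names
    print([PATT.findall(str(x.a))[0], PATT.findall(str(x.p.a))[0]])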
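The regular expression above may look cryptic, so here is a small standalone illustration of what it extracts. The two link strings are invented stand-ins for str(x.a) and str(x.p.a); the real Pandora markup is not reproduced here, and link_text is a hypothetical helper name.

```python
import re

PATT = re.compile(">(.*?)<")


def link_text(tag_html):
    """Return the text between the first '>' and the next '<'."""
    return PATT.findall(tag_html)[0]


# Hypothetical anchors, shaped like what str(x.a) and str(x.p.a) might return.
song_link = '<a href="/some-song">Some Song</a>'
artist_link = '<a href="/some-artist">Some Artist</a>'

print([link_text(song_link), link_text(artist_link)])

# Caveat: if the link text is wrapped in another tag, the first match is empty.
print(repr(link_text('<a href="#"><b>Some Song</b></a>')))
```

The caveat in the last line is the reason a tag's .string attribute in BeautifulSoup may be a sturdier way to read the link text than a regular expression.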
The first step is over! =)
Step 2. Search for and download the recordings from vk.com
Go to vk.com/editapp?act=create and create a new application. Now we need an access_token. To keep things simple, I decided to obtain the access_token manually and just paste it into the body of the script. So, go to
oauth.vk.com/authorize?client_id=3608669&scope=audio&redirect_uri=https://oauth.vk.com/blank&display=wap&response_type=token
We are redirected to a new page:
oauth.vk.com/blank.html#access_token=***&expires_in=86400&user_id=17738938
We pull the access_token we need out of the URL fragment (the part after #). We will use it for requests to the VK API. We write a small audio search function:
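import requests
ACCESS_TOKEN = "***"  # paste the token obtained above
def audio_search(string):
    # sort=2 sorts the results by popularity; params= URL-encodes the query
    resp = requests.get("https://api.vk.com/method/audio.search",
        params={"q": string, "sort": 2, "access_token": ACCESS_TOKEN})
    return resp.json()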
The audio_search function returns the most popular search result for the given query string (among audio recordings).
The function response is:
>>> audio_search("My little horse")
{u'response': [1, {u'album': u'27504721', u'artist': u'\u041d\u0435\u0438\u0437\u0432\u0435\u0441\u0442\u0435\u043d', u'url': u'http://cs521522.vk.me/u3391535/audios/746ddef4902c.mp3', u'title': u'my little horse', u'duration': 208, u'aid': 159749117, u'owner_id': 3391535}]}
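Judging by the sample above, the response list starts with an integer (the number of matches, as in the legacy VK API format) followed by track dictionaries. A minimal Python 3 sketch of picking the first track out of such a result follows; first_track is a hypothetical helper, and the sample data is copied from the response shown above.

```python
def first_track(result):
    """Return the first track dict from an audio.search-style result, or None."""
    items = result.get("response", [])
    # Skip the leading integer; only the dict entries describe tracks.
    tracks = [item for item in items if isinstance(item, dict)]
    return tracks[0] if tracks else None


sample = {
    "response": [
        1,
        {
            "album": "27504721",
            "artist": "\u041d\u0435\u0438\u0437\u0432\u0435\u0441\u0442\u0435\u043d",
            "url": "http://cs521522.vk.me/u3391535/audios/746ddef4902c.mp3",
            "title": "my little horse",
            "duration": 208,
            "aid": 159749117,
            "owner_id": 3391535,
        },
    ]
}

track = first_track(sample)
print(track["url"])
```

Returning None for an empty or error response lets the calling loop skip songs that VK could not find instead of crashing on them.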
Now we know the download URL. The file can be fetched with the standard function urllib.urlretrieve.
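In Python 3 the same function lives at urllib.request.urlretrieve. Below is a minimal sketch of the download step under that assumption. The helper name download_track and the "artist - title.mp3" naming scheme are illustrative choices, not taken from the script.

```python
import os
from urllib.request import urlretrieve


def download_track(url, artist, title, folder="audio"):
    """Save the file at `url` as '<folder>/<artist> - <title>.mp3'."""
    # Create the target directory on first use.
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, "%s - %s.mp3" % (artist, title))
    urlretrieve(url, path)
    return path


# Example call (performs a network request, so it is left commented out):
# download_track(track["url"], track["artist"], track["title"])
```

Artist and title strings may contain characters that are not valid in file names, so a real script would probably want to clean them first.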
We end up with the following script:
yadi.sk/d/7bP26GIQ4POa6
How to work with it:
1) The script requires the requests and BeautifulSoup packages to be installed (sudo pip install requests BeautifulSoup)
2) You need to get the value of the at=... cookie from pandora.com (see above)
3) You need to get ACCESS_TOKEN as described above
4) You need to set the COUNT_OF_SONGS parameter - the number of songs you want to download (None if you want to download everything)
5) DOWNLOAD_FOLDER_NAME = "audio" - the directory where the downloaded music will be saved
6) LOGIN - your login on pandora.com
The corresponding parameters should be written in the body of the script.
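For item 3, the token does not have to be copied out of the address bar by eye. A minimal Python 3 sketch that cuts access_token out of the redirect URL with the standard library is given below; token_from_redirect is a hypothetical helper, and TOKEN is a placeholder value.

```python
from urllib.parse import urlsplit, parse_qs


def token_from_redirect(url):
    """Extract access_token from the fragment of the OAuth redirect URL."""
    # Everything after '#' is the fragment; it is formatted like a query string.
    fragment = urlsplit(url).fragment
    params = parse_qs(fragment)
    return params["access_token"][0]


redirect = "https://oauth.vk.com/blank.html#access_token=TOKEN&expires_in=86400&user_id=17738938"
print(token_from_redirect(redirect))
```

Note that expires_in=86400 in the redirect URL is 86400 seconds, that is 24 hours, so a token pasted into the script presumably has to be refreshed after a day.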
Listen to your favorite music, and remember that piracy is a sin =)
UPD. I accidentally forgot to update the login code. My apologies.
UPD2. At the request of user DenimTornado, here is the same script for lastfm:
yadi.sk/d/U7kAZFZh4P5Yz
Parameters for setting:
- COUNT_OF_SONGS = None - the number of songs. By default, download everything (but not more than 1000)
- ACCESS_TOKEN = "" - see above how to get it
- LOGIN = "sallyruthstruik" - your login on lastfm
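How the script applies COUNT_OF_SONGS is not shown here. One plausible reading of "None means everything, but not more than 1000" is the following Python 3 sketch, in which select_songs and MAX_SONGS are hypothetical names and applying the cap to explicit counts as well is an assumption.

```python
MAX_SONGS = 1000  # upper bound mentioned in the parameter description


def select_songs(songs, count=None):
    """Return the first `count` songs, or all of them (up to MAX_SONGS) for None."""
    limit = MAX_SONGS if count is None else min(count, MAX_SONGS)
    return songs[:limit]


favorites = ["song %d" % i for i in range(1, 6)]
print(select_songs(favorites, 2))
print(len(select_songs(favorites)))
```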
UPD3. From user Setti: a modified version of the LastFM script
yadi.sk/d/tagClpSf4VsqQ
+ Added BeautifulSoup in the folder with the script. Now it is not necessary to install it
+ In the old version, the search used only the track name. Now it also uses the artist name; otherwise VK returns just about anything.
+ Fixed naming of downloaded files: special characters are deleted
+ File names that are too long are truncated
+ The lastfm request parameters limit and page are moved into separate settings. Now you can load tracks page by page in batches of 10, 50, 100, 500, etc. If you have too many tracks, or you want to check the download result on a small slice first, set the page and limit parameters accordingly
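Two of the changes listed above are easy to illustrate. The Python 3 sketch below shows one possible way to clean and truncate file names, and how page and limit map to a range of tracks. It is not Setti's actual code: safe_filename, page_bounds, and the 100-character limit are assumptions, as is the 1-based page numbering.

```python
import re

MAX_NAME_LENGTH = 100  # assumed limit; the real script's value is not stated


def safe_filename(artist, title, ext=".mp3"):
    """Drop special characters and truncate overly long names."""
    name = "%s - %s" % (artist, title)
    # Keep word characters, spaces, dots and hyphens; delete everything else.
    name = re.sub(r"[^\w .\-]", "", name).strip()
    return name[:MAX_NAME_LENGTH - len(ext)] + ext


def page_bounds(page, limit):
    """Return the (first, last) 1-based track numbers covered by a page."""
    first = (page - 1) * limit + 1
    return first, first + limit - 1


print(safe_filename("AC/DC", "Back in Black?"))
print(page_bounds(3, 50))
```

With limit = 50, page 3 would cover tracks 101 to 150, which is the kind of slice the last item suggests for checking the result before downloading everything.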