walkmanake January 10, 2012 at 15:54

Another way to update torrents

From the sandbox

On one tracker, I am an active seeder. But whenever torrents need updating, the horror begins: some torrents have different names in the torrent client and on the tracker, many torrents on the tracker share the same name, and finding a specific one is very difficult. Besides, I don't have much time for such routine work. So I needed a small script that would update torrents in the client whenever they are updated on the tracker.

What to do?

I faced a choice: find a ready-made solution or try to write the necessary script myself. There were methods on Habr that partly solved my task, but each either didn't suit me or I didn't really like it. At the same time, I had never written programs or even scripts, so the idea of writing my own script appealed to me even more. First I had to choose a tool: a language that is easy to learn and a good way into programming. Python caught my attention.

I liked Python right away. It seems to bring a certain "ease" to writing code. As my first Python textbook, I chose Mark Lutz's book Learning Python (4th Edition). Well, there is a tool, there is help in the form of a book, let's go!

Statement of the problem and its solution

So, first you need to determine that a torrent file in the client (in this case, uTorrent 2.2) is outdated and that a new one has to be downloaded. The first thing I came up with was parsing the tracker pages and comparing them with the data in the torrent file. This method worked, but it was very slow: parsing a hundred pages (that is the tracker's limit on the number of torrents) took about three minutes. In addition, every parameter of the torrent had to be compared with the result of parsing the page, which also took a lot of time. The method worked without failures, but I did not particularly like it, so I kept looking for other solutions.

Soon, after much deliberation and searching, I learned about scrape. Scrape, as Wikipedia puts it, is an additional client-to-tracker request protocol in which the tracker reports the total number of seeds and peers for a torrent. With a scrape request you can easily find out whether a torrent exists or not. Clients also send scrape requests more often than announce requests. But you need to know whether a particular tracker supports this protocol. Luckily, my tracker does. A scrape request is sent with the GET method and client headers, to an address of this form:

http://example.com/scrape.php?info_hash=aaaaaaaaaaaaaaaaaaaa

The hash is unique for each torrent, is 20 bytes long, and can be obtained from the resume.dat file. Before reading it, though, you need to know that this file, like .torrent files and settings.dat, is stored in bencode format. If you need to decode such a file quickly without digging into the encoding itself, you can download a ready-made bencode package for Python.
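For readers who would rather see what bencode looks like than install a package, here is a minimal decoder sketch. It is written for Python 3 (unlike the Python 2 script in this article), it is not the package the script imports, and the helper name _decode is made up for illustration. It only covers the four bencode types: integers, byte strings, lists and dictionaries.

```python
def bdecode(data):
    """Decode one bencoded value from a bytes object and return it."""
    value, pos = _decode(data, 0)
    if pos != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


def _decode(data, pos):
    """Return (value, next_position) for the value that starts at pos."""
    lead = data[pos:pos + 1]
    if lead == b"i":                      # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                      # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = _decode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                      # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = _decode(data, pos)
            value, pos = _decode(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():                    # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError("invalid bencode at offset {0}".format(pos))


# A small hand-made sample: a dictionary with a string, an integer and a list.
sample = b"d4:name4:demo4:sizei42e4:tagsl1:a1:bee"
decoded = bdecode(sample)
```

Keys and strings come back as bytes, so decoded[b"size"] is the integer 42 and decoded[b"tags"] is a list of two byte strings.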

Let's decode the file:

# -*- coding: utf-8 -*-
import urllib2
from urllib import urlencode
from binascii import b2a_hex as bta, a2b_hex as atb
from os import remove
from shutil import move
from lxml.html import document_fromstring as doc
from bencode import bdecode, bencode
from httplib2 import Http
http = Http()
username = 'username'
password = 'password'
ut_port = '12345'  # Port of uTorrent's web UI.
ut_username = 'utusername'
ut_password = 'utpassword'
site = 'http://example.com/'
scrape_body = site + 'scrape.php?info_hash='  # URL of the scrape request.
login_url = site + 'takelogin.php'
torrent_body = site + 'download.php?id={0}&name={0}.torrent'
announce = site + 'announce.php?'  # Tracker announce URL.
webui_url = 'http://127.0.0.1:{0}/gui/'.format(ut_port)
webui_token = webui_url + 'token.html'
# Folder with .torrent files. The path is stored in settings.dat, key dir_torrent_files.
torrent_path = 'c:/utorrent/torrent/'
# The autoload folder is set in the client's preferences.
autoload_path = 'c:/utorrent/autoload/'
# Folder with uTorrent's system files (needed to process resume.dat).
sys_torrent_path = 'c:/users/myname/appdata/utorrent/'

def authentication(username, password):
    data = {'username': username, 'password': password}
    headers = {'Content-type': 'application/x-www-form-urlencoded',
    'User-agent':'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.6'}
    resp, login = http.request(login_url, 'POST', headers=headers, body=urlencode(data))
    # Names of the cookie attributes that confirm the user is logged in.
    cookiekeys = ['uid', 'pass', 'PHPSESSID', 'pass_hash', 'session_id']
    split_resp = resp['set-cookie'].split(' ')
    lst = []
    # Keep only the attributes we need from the string received above.
    for split_res in split_resp:
        if split_res.split('=')[0] in cookiekeys:
            lst.append(split_res)
    cookie = ' '.join(lst)
    return {'Cookie': cookie}

def torrentDict(torr_path):  # torr_path is, in our case, the folder with resume.dat.
    Dict = {}
    with open(u'{0}resume.dat'.format(torr_path), 'rb') as resume:
        t = bdecode(resume.read())
    for name in t:
        if name != '.fileguard' and name != 'rec':
            for tracker in t[name]['trackers']:
                if isinstance(tracker, str) and tracker.startswith(announce):
                    Dict[name.split('\\')[-1]] = bta(t[name]['info'])
    return Dict

Now we have a dictionary with the names and hashes of the torrents. All that remains is to send scrape requests with the hash substituted in percent-encoded form and check whether a torrent with that hash still exists on the tracker or is already gone. Also, do not forget that the request must be made on behalf of the client, otherwise the tracker will refuse access.

uthead = {'User-Agent':'uTorrent/2210(21304)'}  # Imitate uTorrent's headers.
main_dict = torrentDict(sys_torrent_path)
for key in main_dict:
    lst = []
    for i in range(0, len(main_dict[key]), 2):
        lst.append('%{0}'.format(main_dict[key][i:i+2].upper()))
    scrp_str = ''.join(lst)  # String with the hash rewritten for the request.
    resp, scrp = http.request('{0}{1}'.format(scrape_body, scrp_str), 'GET', headers=uthead)
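
The loop above turns the 40-character hex hash into a string of %XX escapes, one per byte. The same step can be sketched as a pair of small functions. This is a Python 3 illustration, not part of the original script, and the function names are made up. The second function shows the standard-library route, which leaves letters and digits unescaped; both strings describe the same 20 bytes.

```python
from binascii import unhexlify
from urllib.parse import quote_from_bytes


def encode_info_hash(hex_hash):
    """Turn a 40-character hex hash into %XX form, one escape per byte."""
    if len(hex_hash) != 40:
        raise ValueError("expected 40 hex characters (20 bytes)")
    unhexlify(hex_hash)  # raises an error if the string is not valid hex
    return "".join("%" + hex_hash[i:i + 2].upper() for i in range(0, 40, 2))


def compact_info_hash(hex_hash):
    """Same bytes, but escaped only where URL syntax requires it."""
    return quote_from_bytes(unhexlify(hex_hash), safe="")


def scrape_url(scrape_body, hex_hash):
    """Append the encoded hash to the scrape address."""
    return scrape_body + encode_info_hash(hex_hash)


url = scrape_url("http://example.com/scrape.php?info_hash=", "0aff" * 10)
```

Checking the length and the hex digits first means a damaged hash fails loudly instead of producing a request the tracker silently answers with "no such torrent".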

A typical response to a request looks like this:

d5:filesd20:aaaaaaaaaaaaaaaaaaaad8:completei5e10:downloadedi50e10:incompletei10eeee

The 20 "a" characters are the torrent's hash; 5 is the number of seeders, 10 the number of leechers, and 50 the number of completed downloads.
If the torrent does not exist, the response looks like this:

d5:filesdee

The response is also in bencode format, but we don't need to decode it: we can simply compare the received string with the string that is returned when the tracker has no torrent with that hash.
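
The comparison can be sketched as follows (Python 3, with a made-up function name). One assumption is added here that the script itself does not make: anything that is neither the "empty" response nor a response that starts like a file listing is treated as unknown, so that an error page from the tracker is not mistaken for a deleted torrent.

```python
NO_TORRENT = b"d5:filesdee"        # bencoded {"files": {}}
HAS_TORRENT_PREFIX = b"d5:filesd20:"  # a "files" dict that starts with a 20-byte key


def classify_scrape(body):
    """Return 'gone', 'alive' or 'unknown' for a raw scrape response body."""
    body = body.strip()
    if body == NO_TORRENT:
        return "gone"
    if body.startswith(HAS_TORRENT_PREFIX):
        return "alive"
    return "unknown"


alive_response = (b"d5:filesd20:" + b"a" * 20 +
                  b"d8:completei5e10:downloadedi50e10:incompletei10eeee")
states = [classify_scrape(NO_TORRENT),
          classify_scrape(alive_response),
          classify_scrape(b"<html>Service unavailable</html>")]
```

Only the "gone" state should trigger the download-and-replace step; "unknown" is a reason to retry later rather than to delete anything.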
Next, we need to download the new file from the tracker, put it in the client's autoload folder and, if possible, delete the client's entry for the obsolete torrent.
A file can't be downloaded from the tracker just like that: authorization is required. The function for it, authentication, is given above. So we log in, download the file, put it in the autoload folder and delete the old .torrent file from the folder with torrents.

    # This code sits inside the "for key in main_dict:" loop and should run only
    # when the scrape response shows that the torrent is gone from the tracker.
    with open('{0}{1}'.format(torrent_path, key), 'rb') as torrent_file:
        torrent = bdecode(torrent_file.read())
        t_id = torrent['comment'][36:]  # The torrent's unique id on the tracker.
    brhead = authentication(username, password)
    resp, torrent = http.request(torrent_body.format(t_id), headers=brhead)
    with open('{0}.torrent'.format(t_id),'wb') as torrent_file:
        torrent_file.write(torrent)
    # Remove the old .torrent file and put the new one into the autoload folder.
    remove('{0}{1}'.format(torrent_path, key))
    move('{0}.torrent'.format(t_id), '{0}{1}.torrent'.format(autoload_path, t_id))
    # Code that removes the torrent's record from the client. More on it below.
    authkey, token = uTWebUI(ut_username, ut_password)
    webuiActions(main_dict[key], 'remove', authkey, token)
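
The slice torrent['comment'][36:] works only while the address in the comment field has a prefix of exactly 36 characters. A less fragile variant might parse the address instead. The sketch below is Python 3 and rests on an assumption the article does not confirm: that the comment is a URL carrying the torrent's number in an id query parameter. The URL used here is hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def torrent_id_from_comment(comment, param="id"):
    """Return the value of the given query parameter in the comment URL, or None."""
    query = parse_qs(urlparse(comment).query)
    values = query.get(param)
    return values[0] if values else None


# Hypothetical comment value; the real tracker's URL layout may differ.
t_id = torrent_id_from_comment("http://example.com/details.php?id=12345")
```

If the comment has a different shape, the function returns None, which is easier to detect than a wrongly sliced string. In Python 3 a bencode decoder usually returns bytes, so the comment would need .decode() first.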

So that the record of a no-longer-existing .torrent file does not confuse us, it should be deleted from the client. But uTorrent is designed so that editing resume.dat, where the information about all torrents is stored, has no effect while the client is running: uTorrent restores resume.dat to the state it remembered at startup. So every time you would have to shut uTorrent down, edit resume.dat and start uTorrent again. That would do for one changed torrent per day, but what if torrents change in batches, several at once? At first, being far from programming in general, I thought I would have to work with processes directly, which is very difficult for me. But then I found out about uTorrent WebUI. WebUI has an API, documented on the official website. With the WebUI API you can remove a torrent's record from the client, and not only that. First we need to obtain the authorization data, which carries the encoded password, and a token. The token is needed if the webui.token_auth parameter is enabled in the client.

def uTWebUI(ut_name, ut_passw):
    # Get the authorization header and the token.
    passmgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
    passmgr.add_password(None, webui_token, ut_name, ut_passw)
    authhandler = urllib2.HTTPBasicAuthHandler(passmgr)
    opener = urllib2.build_opener(authhandler)
    urllib2.install_opener(opener)
    req = urllib2.Request(webui_token)
    tkp = urllib2.urlopen(req)
    page = tkp.read()
    token = doc(page).xpath('//text()')[0]
    passw = req.unredirected_hdrs['Authorization']
    return passw, token
def webuiActions(torrent_hash, action, password, token):
    head = {'Authorization': password}
    if action == 'remove':
        # Remove the client's record of the outdated torrent.
        action_req = '?token={0}&action=remove&hash={1}'.format(token, torrent_hash)
        r, act = http.request(webui_url+action_req, headers=head)

In uTorrent, authorization in the web interface works differently than on the site, so simply posting the credentials will not work. After authorizing, we get a token and use it to perform an action in the client. Of course, the client actions could be put into a class, but I figured an ordinary function is enough here.
(Note: Unfortunately, my knowledge at the time was not enough to log in to the web interface properly, so I used a method described on the Internet.)
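
For what it is worth, the Authorization value that uTWebUI pulls out of urllib2 is ordinary HTTP Basic authentication, which can be built by hand. The sketch below is Python 3 and not part of the original script. The token extraction assumes that token.html holds the token as the text of a single div element; that page layout is an assumption, and both function names are made up.

```python
import base64
import re


def basic_auth_header(username, password):
    """Build the Authorization header value for HTTP Basic authentication."""
    credentials = "{0}:{1}".format(username, password).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def extract_token(page):
    """Pull the token text out of the page, assumed to sit in a single <div>."""
    match = re.search(r"<div[^>]*>([^<]+)</div>", page)
    if match is None:
        raise ValueError("no token found in page")
    return match.group(1)


head = {"Authorization": basic_auth_header("utusername", "utpassword")}
# Assumed page layout, used only for illustration.
token = extract_token("<html><div id='token' style='display:none;'>abc123</div></html>")
```

The resulting head dictionary can be passed as request headers in the same way webuiActions does, and the token goes into the query string.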

What is the result

As a result, I got a script that meets my need, a little knowledge and a lot of pleasure: it's great fun to sit over the code until morning and then, on going to bed, realize what the catch was.

I hope this method can help someone.

UPD: I sincerely apologize for my carelessness: I tidied the code into a more readable form before publication, and as a result confused both myself and you.
The code is uploaded to GitHub. This is my first time using it, so if I did something wrong, please let me know.
