Intercepting and editing files in HTTP traffic, using torrents as an example

A couple of years ago, we came up with the idea of setting up a local BitTorrent retracker for users of our “home” city network, so that users would download faster and we would carry less external traffic. Installing the retracker itself was only the beginning: it also had to be announced somehow for the torrents being downloaded. While studying the announcement methods and mechanisms, I arrived at a fairly general and universal algorithm, which I would like to share.

So, first of all:

What to do


There are three ways to announce the existence of a tracker on the local network:
  • Use the retracker.local address added by some trackers
  • Announce the local tracker using the isp.bep22 mechanism
  • Intercept downloaded torrent files and edit them “on the fly”, adding the address of our tracker

Each of them has its drawbacks, respectively:
  • Using the .local zone contradicts the draft RFC “Multicast DNS” and breaks zeroconf services on Linux and Apple systems; besides, only a few trackers add this address
  • isp.bep22 works, as far as I know, only in the µTorrent client, where it is turned off by default
  • Traffic interception has no published success stories, apart from the experience of a single friendly network

To begin with, we implemented support for the first two options, since this took no special effort: adding a few records to the DNS (retracker.local IN A and retracker.smarthome.spb.ru SRV) is simple. In this case, the incompatibility of .local with zeroconf can be overlooked, since ideally the DNS queries of a client with zeroconf enabled should never even reach our server.

Update: an important remark from cadmi: for .local to keep working for the user alongside our retracker, the zone to create is not .local but retracker.local, which makes it possible to combine the two.

But the third option still looked the most interesting and attractive, so I went looking for information on general techniques for modifying downloaded files on the fly. The requirements for the server software were simple:
  • Runs on FreeBSD
  • Open source
  • Transparent (invisible) to the client
  • Layer 7 traffic detection and reassembly of files from the HTTP stream
  • Passing these files to an editing script and receiving them back
  • Delivering the edited files to the client


What to do it with


To my surprise, I found that there are practically no techniques or open-source programs for this kind of file interception and editing. By and large there are only two: Squid with its experimental ICAP/eCAP modules, and a filtering proxy called “MiddleMan”, whose last release dates back to 2004 but which is still maintained in ports.

I gave up on Squid almost immediately: despite having two experimental modules for working with passing traffic, the solution turned out to be extremely “crooked” and unstable even during installation and configuration, let alone in operation.

I moved on to MiddleMan. Surprisingly, the old program turned out to be more functional and more convenient than the modern Squid monster. In essence, it satisfies all the requirements except full transparency for the user: the user's source IP is replaced with the IP of the proxy server. I note that only Squid with the TPROXY module under Linux is able to preserve the source IP. Moreover, MiddleMan has a unique option: when a configurable wait timeout is exceeded, the proxy gives the user the unmodified original file.
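The timeout fallback described above can be illustrated with a short Python sketch. This is not MiddleMan's code, only a guess at the general pattern: the function name filter_or_passthrough and its parameters are hypothetical, and the external filter is assumed to read the file on stdin and write the result to stdout.

```python
import subprocess


def filter_or_passthrough(data: bytes, command, timeout_s: float) -> bytes:
    """Pipe data through an external command; return data unchanged on timeout or failure."""
    try:
        result = subprocess.run(
            command,
            input=data,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # The filter was too slow or could not be started: serve the original.
        return data
    if result.returncode != 0:
        # The filter reported an error: serve the original as well.
        return data
    return result.stdout
```

With this pattern a broken or hung editing script degrades to "no editing" instead of a failed download for the user.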

How to do it


1. Identifying the most popular torrent servers

To start, I wrote a small Perl script that listens on port 80 via pcap and collects the IP addresses of servers that return Content-Type: application/x-bittorrent. This is needed so that we intercept not all HTTP traffic, but only the traffic going to large trackers.

Then, with some simple manipulations, these IP addresses are entered into an ipfw table, which is used when redirecting traffic to our proxy:
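The classification step of such a collector can be sketched as follows. This is a minimal Python illustration rather than the original Perl script: the packet capture itself is assumed to happen elsewhere and to yield (source IP, TCP payload) pairs for packets coming from port 80, and the names is_torrent_response and tally_torrent_servers are hypothetical. It only inspects payloads that begin with an HTTP status line, that is, the first segment of a response.

```python
from collections import Counter

TORRENT_MIME = b"application/x-bittorrent"


def is_torrent_response(payload: bytes) -> bool:
    """Return True if payload starts an HTTP response with the torrent MIME type."""
    if not payload.startswith(b"HTTP/"):
        return False
    # Headers end at the first blank line; ignore any body bytes after it.
    head = payload.split(b"\r\n\r\n", 1)[0]
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-type":
            # Drop optional parameters such as "; charset=...".
            mime = value.split(b";", 1)[0].strip().lower()
            return mime == TORRENT_MIME
    return False


def tally_torrent_servers(packets):
    """Count torrent responses per server IP.

    `packets` is an iterable of (source_ip, tcp_payload) pairs; producing
    them is left to the capture layer (an assumption of this sketch).
    """
    counts = Counter()
    for src_ip, payload in packets:
        if is_torrent_response(payload):
            counts[src_ip] += 1
    return counts
```

Calling counts.most_common(20) on the result would then give a shortlist of the busiest torrent servers.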

${ipfw} add fwd ${proxy_ip},${proxy_port} tcp from $lan_customers to 'table(15)' dst-port 80 in via ${int_if}
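As a rough illustration of how the collected addresses could be loaded into the table referenced by the rule above, the following Python sketch turns a list of strings into ipfw table commands. The helper name ipfw_table_commands is hypothetical, and the sketch only prints the commands; actually running them (as root) is left out.

```python
import ipaddress


def ipfw_table_commands(addresses, table=15, ipfw="ipfw"):
    """Return one `ipfw table N add ADDR` line per unique valid IPv4 address."""
    unique = set()
    for raw in addresses:
        try:
            unique.add(ipaddress.IPv4Address(raw.strip()))
        except ipaddress.AddressValueError:
            # Skip anything that is not a well-formed IPv4 address.
            continue
    return [f"{ipfw} table {table} add {addr}" for addr in sorted(unique)]


if __name__ == "__main__":
    # Documentation-range addresses used purely as placeholders.
    for line in ipfw_table_commands(["192.0.2.20", "192.0.2.3", "192.0.2.20"]):
        print(line)
```

Validating and deduplicating before touching the firewall keeps a malformed line in the collector's output from producing a failing command.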

2. MiddleMan proxy configuration

The section of mman.xml responsible for handing files over to the editing script is called external.

3. The “external” editing script


The mypatcher.pl script receives files with the MIME type “application/x-bittorrent”, adds a local tracker entry to each of them (removing retracker.local if it is present, which solves the problems of clients with zeroconf) and passes the contents back to the proxy, while also saving the file to disk, in the following form:

#for i in `find /home/torrents/patched/ -type d`; do echo -n "$i" && ls -1 $i| wc -l; done | awk '{print $2" "$1}' | sort -rn | head -n 10

24103 /home/torrents/patched/dl.rutracker.org
7817 /home/torrents/patched/dl.torrents.ru
6184 /home/torrents/patched/tfile.ru
3744 /home/torrents/patched/kinozal.tv
2928 /home/torrents/patched/rutor.org
2872 /home/torrents/patched/torrents.thepiratebay.org
2583 /home/torrents/patched/www.tfile.ru
2582 /home/torrents/patched/www.torrentino.ru
2531 /home/torrents/patched/pornolab.net
1032 /home/torrents/patched/www.rutor.org

Work summary: on a server that performs NAT, shaping and routing for a network of 3,000 users, the load from mman is not noticeable at all. About 200-400 files per day are now edited this way. In almost a year of operation there have been no complaints; everyone is happy.

Update 1. “Saving files to disk” serves solely to collect anonymized statistics in the form shown above. I simply forgot to disable this debugging mechanism in the script. :) Cadm
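The editing step from section 3 can be illustrated with a small self-contained Python sketch. The original mypatcher.pl is a Perl script and is not reproduced here; the function names, the separate-tier placement of the new tracker, and the tracker URL in the usage note are all assumptions made for illustration. The sketch decodes the bencoded torrent, rewrites the announce-list, and re-encodes it. It assumes a canonically encoded input (sorted dictionary keys), in which case the info dictionary, and therefore the info hash, is preserved byte for byte.

```python
def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead in (b"l", b"d"):
        items, pos = [], pos + 1
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        if lead == b"l":
            return items, pos + 1
        # Dictionaries are flat key/value sequences; pair them up.
        return dict(zip(items[0::2], items[1::2])), pos + 1
    # Otherwise a byte string: <length>:<bytes>
    colon = data.index(b":", pos)
    end = colon + 1 + int(data[pos:colon])
    return data[colon + 1:end], end


def bencode(value) -> bytes:
    """Encode ints, bytes, lists and dicts (with bytes keys) as bencode."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # The bencode format requires keys in sorted (raw byte) order.
        body = b"".join(bencode(k) + bencode(value[k]) for k in sorted(value))
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def add_tracker(torrent: bytes, tracker_url: bytes,
                drop_marker: bytes = b"retracker.local") -> bytes:
    """Return the torrent with tracker_url appended as its own announce-list tier.

    URLs containing drop_marker are removed from the existing tiers.
    """
    meta, _ = bdecode(torrent)
    tiers = meta.get(b"announce-list")
    if tiers is None:
        # No multi-tracker list yet: start one from the single announce URL.
        tiers = [[meta[b"announce"]]] if b"announce" in meta else []
    cleaned = []
    for tier in tiers:
        kept = [u for u in tier if drop_marker not in u and u != tracker_url]
        if kept:
            cleaned.append(kept)
    meta[b"announce-list"] = cleaned + [[tracker_url]]
    return bencode(meta)
```

A filter built on this would read the torrent from the proxy, call something like add_tracker(data, b"http://retracker.example.net/announce") with the real local tracker URL in place of that placeholder, and write the result back. Only the outer dictionary changes, so clients still see the same torrent.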




