Parsing BEncode in JavaScript. View torrent files in Firefox

    I. Why


    There are several ways to view torrent files: in a torrent client, in BEncode Editor, in file managers with plugins, and possibly through online services (though that is a rather clumsy option).

    But calling an external program from the browser is not always convenient, such programs do not always show the full information, and their output is not always searchable. Therefore, I would like an easy way to view a torrent file right in the browser, for example:

    - find out the contents of the distribution;
    - find out the number of files in the distribution;
    - find out information about files (some trackers tolerate incomplete descriptions, and more information about the files turns up in the torrent file itself: for example, resolution, video and audio codecs, movie duration, etc.);
    - find out information about the torrent file itself (creation time, trackers, privacy flag, etc.);
    - be able to text search all the information.

    II. Strategy


    For this task you could write a browser extension, but that brings a number of additional difficulties. Therefore, we will use a simplified method.

    The Custom Buttons extension allows you to create buttons with arbitrary code. Even better, this code runs in the browser context, has access to the same components and interfaces as extensions, and can even create GUI elements of arbitrary complexity. Therefore, we simply create a new button and fill it with code (about two hundred lines are enough for everything). All the following code must be inserted into the initialization tab of the newly created button so that it is executed each time the browser starts, defining the button's behavior once for the whole session. Alternatively, you can skip the copying: the extension adds the custombutton:// protocol to the browser, and at the end of the article I will give a link that creates a ready-made button with the code in one click (you only have to drag it from the tool palette to a convenient place).

    III. Tactics


    1. User Interface

    var btn = this;
    var imgMain = "";
    var imgThrobber = "";
    function clickBtn(event) {
    	if (event.button == 0) {
    		event.preventDefault();
    		var tFileURL = prompt("Torrent File URL:");
    		if (tFileURL) {
    			getTFile(tFileURL);
    		}
    	}
    }
    function checkDrag(event) {
    	if (event.dataTransfer.types.contains("text/uri-list")) {
    		event.preventDefault();
    	}
    }
    function onDrop(event) {
    	var tFileURL = event.dataTransfer.getData("URL");
    	if (tFileURL) {
    		getTFile(tFileURL);
    	}
    	event.preventDefault();
    }
    btn.addEventListener("click", clickBtn, true);
    btn.addEventListener("dragenter", checkDrag, true);
    btn.addEventListener("dragover", checkDrag, true);
    btn.addEventListener("drop", onDrop, true);
    btn.onDestroy = function() {
    	btn.removeEventListener("click", clickBtn, true);
    	btn.removeEventListener("dragenter", checkDrag, true);
    	btn.removeEventListener("dragover", checkDrag, true);
    	btn.removeEventListener("drop", onDrop, true);
    }
    


    In this piece, we get the button object, declare two images (one is the main icon, the other is the standard loading indicator; they will alternate), define event handlers and bind them to the button.

    This gives us two ways to pass the address of a torrent file to the program. If there is a link, we simply drag it onto the button (description of the mechanisms). If the address is in the clipboard as text, we click the button and paste the address into the prompt that appears.

    At the end, we define a destructor that removes the bound handlers. There is an unpleasant bug in Custom Buttons: if you do not remove the handlers explicitly, they pile up each time the tool palette is opened and closed (even if you were configuring something else in it).

    2. Getting the torrent file

    function getTFile(tFileURL) {
    	btn.image = imgThrobber;
    	var xhr = new XMLHttpRequest();
    	xhr.mozBackgroundRequest = true;
    	var sendData;
    	if (tFileURL.indexOf("http://dl.rutracker.org/forum/dl.php") > -1) {
    		xhr.open("POST", tFileURL, true);
    		sendData = tFileURL.replace(/^.+\b(t=\d+).*$/, "$1");
    		xhr.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
    		xhr.setRequestHeader("Referer", "http://rutracker.org/forum/viewtopic.php?t=" + tFileURL.replace(/^.+\bt=(\d+).*$/, "$1"));
    	}
    	else {
    		xhr.open("GET", tFileURL, true);
    		sendData = null;
    	}
    	xhr.timeout = 10000;
    	if(!/^file:/.test(tFileURL)) {
    		xhr.channel.loadFlags |= Components.interfaces.nsIRequest.LOAD_BYPASS_CACHE;
    		xhr.channel.QueryInterface(Components.interfaces.nsIHttpChannelInternal)
    											.forceAllowThirdPartyCookie = true;
    	}
    	xhr.responseType = "arraybuffer";
    	xhr.onload = function() {
    		btn.image = imgMain;
    		processTFile(this.response);
    	}
    	xhr.ontimeout = function() {
    		btn.image = imgMain;
    		alert("Timeout");
    	}
    	xhr.onerror = function() {
    		btn.image = imgMain;
    		alert("HTTP error");
    	}
    	xhr.send(sendData);
    }
    


    To begin with, note that torrent files are written in the BEncode format (language description and format of torrent files). Its capabilities are a bit like those of JSON. But there is one hitch: keys, numbers and strings in it are, as a rule, encoded in UTF-8, yet some strings contain binary data (a plain sequence of bytes that does not obey UTF-8 rules). Therefore, you cannot simply treat the entire content as one UTF-8 string: the decoder will choke and report malformed UTF-8. Keep this in mind when parsing a file.
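
    The hitch is easy to reproduce in isolation. The sketch below is not part of the button code: bytesToText is a hypothetical helper that percent-encodes every byte and lets decodeURIComponent interpret the result as UTF-8, which throws a URIError when the bytes are not valid UTF-8. The sample bytes are made up.

```javascript
// Hypothetical helper (not from the button code):
// interpret a byte sequence as UTF-8 by percent-encoding each byte.
function bytesToText(bytes) {
	var encoded = "";
	for (var i = 0; i < bytes.length; i++) {
		// Every byte becomes %XX, e.g. 0x0a -> "%0a".
		encoded += "%" + (bytes[i] < 16 ? "0" : "") + bytes[i].toString(16);
	}
	// Throws URIError if the bytes are not valid UTF-8.
	return decodeURIComponent(encoded);
}

// A BEncode string is "<length>:<bytes>"; only the payload bytes are decoded here.
var textPayload = new Uint8Array([0x6e, 0x61, 0x6d, 0x65]);   // the ASCII text "name"
var binaryPayload = new Uint8Array([0xff, 0xfe, 0x00, 0x01]); // arbitrary non-UTF-8 bytes

var decodedText = bytesToText(textPayload);
var binaryFailed = false;
try {
	bytesToText(binaryPayload);
} catch (e) {
	binaryFailed = e instanceof URIError;
}
```

    The text payload decodes normally, while the binary one fails, which is why a parser has to decide string by string how to handle the bytes.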

    Now a few notes about the request itself and getting the file.

    We switch the button to the throbber image and set the background request flag to make things easier for the browser. Then we check the address and satisfy the protection of the most famous domestic torrent tracker, which restricts where torrent files may be downloaded from (this workaround, which lets extensions work with the tracker's torrent files, was described by the site's developers when the protection was introduced).
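
    The address check can be looked at separately from the network code. The following is only a sketch: describeRequest is a hypothetical helper that returns the method, POST body and Referer that getTFile would use, reusing the URL prefix and both regular expressions from the code above. The topic id 12345 and the example.com address are made up.

```javascript
// Hypothetical helper: decide how a torrent URL should be requested.
// The URL prefix and both regular expressions are the ones used in getTFile.
function describeRequest(tFileURL) {
	if (tFileURL.indexOf("http://dl.rutracker.org/forum/dl.php") > -1) {
		return {
			method: "POST",
			// POST body: the "t=<topic id>" pair cut out of the URL.
			body: tFileURL.replace(/^.+\b(t=\d+).*$/, "$1"),
			// Referer: the topic page the download link belongs to.
			referer: "http://rutracker.org/forum/viewtopic.php?t=" +
				tFileURL.replace(/^.+\bt=(\d+).*$/, "$1")
		};
	}
	// Any other address is fetched with a plain GET.
	return { method: "GET", body: null, referer: null };
}

var special = describeRequest("http://dl.rutracker.org/forum/dl.php?t=12345");
var plain = describeRequest("http://example.com/file.torrent");
```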

    To make Firefox send cookies with the XHR even when the user has disabled third-party cookies, we set the flag that forces them to be allowed (without it, cookies are neither accepted nor sent). Of course, the forced-cookie and cache-bypass flags are needed only for network protocols, so we do not set them for the local file: protocol.

    Previously, getting binary data out of XMLHttpRequest required some sorcery. With the introduction of new response types in XHR, things got easier. Therefore, we use the arraybuffer type and from here on work with typed arrays.
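
    For readers who have not used typed arrays, here is a minimal sketch of what the response looks like from the parser's side. The buffer is built by hand as a stand-in for xhr.response, and the sample text is a made-up tiny BEncode dictionary.

```javascript
// Stand-in for xhr.response: an ArrayBuffer holding the bytes of "d3:fooi42ee".
var sampleText = "d3:fooi42ee";
var buffer = new ArrayBuffer(sampleText.length);
var writer = new Uint8Array(buffer);
for (var i = 0; i < sampleText.length; i++) {
	writer[i] = sampleText.charCodeAt(i);
}

// A Uint8Array is a view over the buffer: nothing is copied,
// and every byte is available by index.
var byteArray = new Uint8Array(buffer);
var firstChar = String.fromCharCode(byteArray[0]);                    // dictionary marker
var lastChar = String.fromCharCode(byteArray[byteArray.length - 1]);  // end marker
```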

    Finally, we set handlers for the different outcomes of the request (each of them switches the throbber back to the regular image). On success, we proceed to parse the received file.

    3. Processing and output of the torrent file

    function processTFile(tFile) {
    	var byteArray = new Uint8Array(tFile);
    	var torrentObject = bdecode(byteArray);
    	if (torrentObject) {
    		if (torrentObject['creation date']) {
    			torrentObject['creation date'] = (new Date(torrentObject['creation date']*1000)).toLocaleString();
    		}
    		if (torrentObject.info) {
    			var files = torrentObject.info.files;
    			if (files && files instanceof Array) {
    				for (var i = 0, file; file = files[i]; i++) {
    					if (file.length) {
    						file.length = Number((file.length / 1024).toFixed(2)).toLocaleString() + " KB";
    					}
    					if (file['path.utf-8']) {
    						file['path.utf-8'] = file['path.utf-8'].join("/");
    					}
    					if (file.path) {
    						file.path = file.path.join("/");
    					}
    				}
    				if (files[0]['path.utf-8']) {
    					files = files.sort(
    						function(o1, o2) {
    							if (o1['path.utf-8'] > o2['path.utf-8']) {return 1;}
    							else if (o1['path.utf-8'] < o2['path.utf-8']) {return -1;}
    							else {return 0;}
    						}
    					);
    				}
    				else if (files[0].path) {
    					files = files.sort(
    						function(o1, o2) {
    							if (o1['path'] > o2['path']) {return 1;}
    							else if (o1['path'] < o2['path']) {return -1;}
    							else {return 0;}
    						}
    					);
    				}
    				files.unshift(files.length);
    			}
    			else {
    				if (torrentObject.info.length) {
    					torrentObject.info.length = Number((torrentObject.info.length / 1024).toFixed(2)).toLocaleString() + " KB";
    				}
    			}
    		}
    		if (gBrowser.selectedBrowser.currentURI.spec == "about:blank" && !gBrowser.selectedBrowser.webProgress.isLoadingDocument) {
    			gBrowser.selectedBrowser.loadURI(
    				"data:application/json;charset=utf-8," +
    				encodeURIComponent(JSON.stringify(torrentObject, null, '\t'))
    			);
    		}
    		else {
    			gBrowser.selectedTab = gBrowser.addTab(
    				"data:application/json;charset=utf-8," +
    				encodeURIComponent(JSON.stringify(torrentObject, null, '\t'))
    			);
    		}
    		torrentObject = null;
    	}
    	else {
    		alert("Parsing error");
    	}
    }
    


    Having created a byte array, we pass it to the BEncode decoder (more about it a bit later) and get back a regular object (associative array, hash) that mirrors the structure of the torrent file (if we do not get one, we show a parsing error message). Then we convert some data to a more readable form (creation date, file sizes and file paths), sort the file objects by path and insert the total number of files at the head of the file list. After that, we check whether the current tab shows a blank page and whether anything is loading into it. If the page is blank and nothing is loading, we output into the current tab; otherwise, we open a new tab and output there.
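
    The tidying steps can be tried on a hand-made object without any browser APIs. The object below is a made-up stand-in for what the decoder returns for a two-file torrent; size formatting is left out because its result depends on the locale.

```javascript
// Made-up stand-in for a decoded multi-file torrent.
var sampleTorrent = {
	"creation date": 1325376000, // seconds since the Unix epoch
	info: {
		files: [
			{ length: 2048, path: ["video", "b.avi"] },
			{ length: 512, path: ["a.txt"] }
		]
	}
};

// Seconds -> milliseconds, then a Date object.
var created = new Date(sampleTorrent["creation date"] * 1000);

var sampleFiles = sampleTorrent.info.files;
sampleFiles.forEach(function (file) {
	file.path = file.path.join("/"); // ["video", "b.avi"] -> "video/b.avi"
});
// Sort the file objects by the joined path.
sampleFiles.sort(function (o1, o2) {
	return o1.path > o2.path ? 1 : (o1.path < o2.path ? -1 : 0);
});
// Put the number of files at the head of the list.
sampleFiles.unshift(sampleFiles.length);
```

    Note that after the last step the first element of the list is a number, not a file object, so the list is meant for display only.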

    For simplicity and versatility, we output JSON. The output is lightly formatted, but it is best to install an extension that highlights JSON and lets you treat it as a tree, collapsing and expanding nodes (for example, JSONView). After the output, just in case, we null out the huge object (though this may be paranoia).
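
    The way the JSON is packed into a data: URL can be checked with a round trip. This is a sketch with a made-up object; the URL prefix is the same one used in processTFile.

```javascript
// Made-up object with a non-ASCII value to show that encoding is preserved.
var sampleObject = { announce: "http://tracker.example/announce", info: { name: "пример" } };

// Pretty-print with tabs, then percent-encode so the JSON can live inside a URL.
var json = JSON.stringify(sampleObject, null, "\t");
var dataURL = "data:application/json;charset=utf-8," + encodeURIComponent(json);

// Reverse the steps to check that nothing is lost on the way.
var payload = dataURL.slice(dataURL.indexOf(",") + 1);
var restored = JSON.parse(decodeURIComponent(payload));
```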

    4. BEncode parser

    function bdecode(byteArray, byteIndex, isRawBytes) {
    	if (byteIndex === undefined) {
    		byteIndex = [0];
    	}
    	var item = String.fromCharCode(byteArray[byteIndex[0]++]);
    	if(item == 'd') {
    		var dic = {};
    		item = String.fromCharCode(byteArray[byteIndex[0]++]);
    		while(item != 'e') {
    			byteIndex[0]--;
    			var key = bdecode(byteArray, byteIndex);
    			if (key == "pieces") {
    				dic[key] = bdecode(byteArray, byteIndex, true);
    			}
    			else {
    				dic[key] = bdecode(byteArray, byteIndex);
    			}
    			item = String.fromCharCode(byteArray[byteIndex[0]++]);
    		}
    		return dic;
    	}
    	if(item == 'l') {
    		var list = [];
    		item = String.fromCharCode(byteArray[byteIndex[0]++]);
    		while(item != 'e') {
    			byteIndex[0]--;
    			list.push(bdecode(byteArray, byteIndex));
    			item = String.fromCharCode(byteArray[byteIndex[0]++]);
    		}
    		return list;
    	}
    	if(item == 'i') {
    		var num = '';
    		item = String.fromCharCode(byteArray[byteIndex[0]++]);
    		while(item != 'e') {
    			num += item;
    			item = String.fromCharCode(byteArray[byteIndex[0]++]);
    		}
    		return Number(num);
    	}
    	if(/\d/.test(item)) {
    		var num = '';
    		while(/\d/.test(item)) {
    			num += item;
    			item = String.fromCharCode(byteArray[byteIndex[0]++]);
    		}
    		num = Number(num);
    		var line = '';
    		if (isRawBytes) {
    			byteIndex[0] += num;
    			return "[" + num + "]";
    		}
    		else {
    			while(num--) {
    				line += escape(String.fromCharCode(byteArray[byteIndex[0]++]));
    			}
    			try {
    				return decodeURIComponent(line);
    			}
    			catch(e) {
    				return unescape(line) + " (?!)";
    			}
    		}
    	}
    	return null;
    }
    


    As a basis, I took a parser written in Perl, seduced by its brevity and simplicity. At first I tried turning the typed array into a regular array so that I could consume it with shift, but this implementation was very slow (probably because of the constant reshuffling of a large array). Therefore, I had to introduce a steadily increasing read index, wrapped in an array (so that it could be passed by reference through the recursion).
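
    The index-in-an-array trick is easier to see on a smaller reader. In the sketch below, readInteger is a hypothetical function that handles only BEncode integers; it is called twice on the same byte array, and the second call continues where the first one stopped because both share the one-element cursor array.

```javascript
// Reads one BEncode integer ("i<digits>e") starting at cursor[0]
// and leaves cursor[0] just past the closing "e".
function readInteger(bytes, cursor) {
	cursor[0]++; // skip the leading "i"
	var digits = "";
	var ch = String.fromCharCode(bytes[cursor[0]++]);
	while (ch != "e") {
		digits += ch;
		ch = String.fromCharCode(bytes[cursor[0]++]);
	}
	return Number(digits);
}

// Two integers back to back: "i42e" followed by "i-7e".
var intText = "i42ei-7e";
var intBytes = new Uint8Array(intText.length);
for (var k = 0; k < intText.length; k++) {
	intBytes[k] = intText.charCodeAt(k);
}

var cursor = [0]; // one-element array: updates made by the callee are visible to the caller
var first = readInteger(intBytes, cursor);
var second = readInteger(intBytes, cursor);
```

    A plain number passed as an argument would be copied, so each call would start from the same position; the array is the cheapest mutable container available.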

    The main difference from the original is the string parsing block. First, we drop from the output the huge byte string containing the piece hashes (its location is well defined, so when we reach the right key while parsing a dictionary, we set the raw-bytes flag for the call that parses this key's value). Second, we convert the bytes of all other strings to UTF-8 text. Here lies a danger: sometimes the strings are not UTF-8 (for example, the popular tracker.0day.kiev.ua tracker for some reason inserts the word “Tracker” in Windows-1251 encoding under the “source” key), and decodeURIComponent throws an error. Therefore, in such cases we return the raw form of the string with a small marker appended. It would be possible to delete such strings altogether,

    IV. Prospects


    Based on parsing and obtaining the described information, more complex tasks can be implemented. For example:

    - monitor updates of the torrent file at the same address (by checking the contents or the creation time) and notify about re-uploads; examples of such checks (for web pages, but they are easy to adapt) can be found here;

    - if the file has been updated, save it automatically from the browser to a folder on disk where the torrent client will pick it up (the relevant interfaces here are nsIFile, nsILocalFile, nsIFilePicker, nsIFileOutputStream and nsIBinaryOutputStream, plus sample code);

    - since recent XHR implementations support the file:// protocol, the button can also be used to view local torrent files and even client databases (such as settings.dat or resume.dat), although in the latter case there will be many binary strings shown as gibberish. To do this, open the folder with the files in the browser and drag them onto the button.
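
    As a rough illustration of the first idea, checking the contents can be as simple as comparing two downloads byte by byte. The helper name torrentChanged and the sample bytes are made up; storing the previous copy and scheduling the checks are left out.

```javascript
// Hypothetical helper: true if two downloads of the same address differ.
function torrentChanged(oldBytes, newBytes) {
	if (oldBytes.length !== newBytes.length) {
		return true;
	}
	for (var i = 0; i < oldBytes.length; i++) {
		if (oldBytes[i] !== newBytes[i]) {
			return true;
		}
	}
	return false;
}

// Made-up samples: an empty dictionary "de" and a re-uploaded, longer file.
var previous = new Uint8Array([100, 101]);
var same = new Uint8Array([100, 101]);
var updated = new Uint8Array([100, 49, 58, 97, 101]);

var unchangedResult = torrentChanged(previous, same);
var changedResult = torrentChanged(previous, updated);
```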



    The promised link for installing the button (for those who do not want to copy the code piece by piece into the initialization tab): since the Habr parser rewrites the link to the http protocol, you need to go to this page and click the “View torrent files” link (after installing Custom Buttons, of course).

    I apologize if I have upset anyone with clumsy craftsmanship or inaccurate wording. I hope at least some of this experience will be useful to someone.
