Building your own file-hosting service

    For some reason I had always wanted to build my own file-sharing service. The various slil/zalil sites never suited me speed-wise, and ifolder has an abundance of advertising. I used a not-very-popular service instead, but that still felt wrong. So I decided to write my own. I will skip the details and routine and describe just the concept.

    Requirements for the resource:
    - uploading files via the POST method, at least 100 MB (though why not 1000?)
    - mandatory visualization of the process, i.e. an upload progress bar
    - the ability for users to resume interrupted transfers
    - no direct links to files from third-party forums and sites; downloads must go only through the service itself

    First, I'll talk about the upload progress bar.
    While receiving a large multipart/form-data request, Apache buffers everything in /tmp, and only after the upload completes (i.e. once the entire request has arrived at the web server) does it hand control to PHP, which in turn calls move_uploaded_file(). By its nature, PHP is designed so that process A cannot know at what stage of an upload process B is. This is exactly what the uploadprogress extension was invented for. Besides the input type="file", you add an input type="hidden" name="UPLOAD_IDENTIFIER" value="some number" to the form; later, using this identifier, the uploadprogress_get_info() function lets you track the upload state: bytes received, total bytes, speed, and other parameters. In my case, I polled a script via AJAX once per second and received up-to-date information.
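    A minimal sketch of such a polling endpoint, assuming the uploadprogress PECL extension is installed (the file name progress.php and the field names are my own illustration, not the original code):

```php
<?php
// progress.php -- polled from the browser via AJAX once per second.
// The upload form is assumed to contain:
//   <input type="hidden" name="UPLOAD_IDENTIFIER" value="1234567890">
//   <input type="file" name="userfile">
// and this script is requested as /progress.php?id=1234567890.
$id   = preg_replace('/\D/', '', $_GET['id']); // keep digits only
$info = uploadprogress_get_info($id);

header('Content-Type: application/json');
if ($info === null) {
    // unknown identifier: the upload has not started or already finished
    echo json_encode(array('done' => true));
} else {
    echo json_encode(array(
        'received' => (int) $info['bytes_uploaded'],
        'total'    => (int) $info['bytes_total'],
        'speed'    => (int) $info['speed_average'],
    ));
}
```

    The JavaScript side only has to divide `received` by `total` to draw the bar.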

    There are several ways to do all this:
    - the bad way
    - the normal way
    - the cool, planet-wide way

    The bad way means both accepting and serving files through PHP. This path could be improved a little with support for HTTP 206 Partial Content. Still, it is a path of patching holes and does not deserve attention.
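    For the record, minimal Range support in PHP looks roughly like this (a sketch under my own assumptions; it ignores multi-range requests and request validation):

```php
<?php
// serve a single byte range of a file with HTTP 206 (illustrative sketch)
$path  = '/xfiles/123.mp3';
$size  = filesize($path);
$start = 0;
$end   = $size - 1;

if (isset($_SERVER['HTTP_RANGE']) &&
    preg_match('/bytes=(\d+)-(\d*)/', $_SERVER['HTTP_RANGE'], $m)) {
    $start = (int) $m[1];
    if ($m[2] !== '') { $end = (int) $m[2]; }
    header('HTTP/1.1 206 Partial Content');
    header("Content-Range: bytes $start-$end/$size");
}
header('Content-Type: audio/mpeg');
header('Content-Length: ' . ($end - $start + 1));

// stream the requested range in chunks
$fp = fopen($path, 'rb');
fseek($fp, $start);
$left = $end - $start + 1;
while ($left > 0 && !feof($fp)) {
    $chunk = fread($fp, min(8192, $left));
    echo $chunk;
    $left -= strlen($chunk);
}
fclose($fp);
```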

    The cool way: file serving should be handled by Apache, or better yet by nginx. First, these servers fully support the HTTP/1.1 specification; second, Apache will always be faster than interpreted PHP. nginx has a special mechanism for serving files: X-Accel-Redirect. The idea is that nginx is installed as a transparent proxy server. nginx receives the user's request for a file and passes it on to Apache and PHP. Then, either in nginx or via mod_rewrite, with a rule like

    RewriteRule ^/somefiles/(.*)$ /get.php?path=$1 [L]

    the request for /somefiles/123.mp3 is turned into /get.php?path=123.mp3.
    In PHP, by analyzing the client's IP address, session, and so on, a decision is made to serve the file or to refuse. To serve it, PHP emits a specially crafted header:

    header("X-Accel-Redirect: /files/123.mp3")

    nginx then takes over and serves the file by its own means, doing so a million times more efficiently than Apache. However, this method has a catch: when a file is uploaded to the server, nginx first receives the entire request and only then hands it to Apache. Until nginx has received the whole file, Apache will not even know that an upload is in progress, which makes an upload progress bar impossible.
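    On the nginx side this needs an internal location matching the URI in the header; a sketch (the root path and the Apache port are my assumptions):

```nginx
# /files/... can only be reached via X-Accel-Redirect, never directly
location /files/ {
    internal;
    root /var/www;            # so /files/123.mp3 -> /var/www/files/123.mp3
}

# everything else is proxied to Apache + php
location / {
    proxy_pass http://127.0.0.1:8080;
}
```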

    Conclusion: to do it properly, uploads must go directly to Apache, while downloads are served through nginx. In my case, I could have hung nginx (or Apache) on some other port. But forcing users stuffed with firewalls to download from a port other than 80 is not comme il faut. So the more correct way is to bind nginx to a separate IP address (with that layout, more servers could be added later). But since I could not get two addresses, I had to give up on the planetary version of the file hosting.

    However, I still wanted to meet all the requirements I had set earlier.

    Okay. And here RewriteMap came to my aid.

    Having built a structure like this:

    RewriteMap ipmap txt:/home/anton/iplist.txt
    RewriteCond ${ipmap:%{REMOTE_ADDR}} =allow
    RewriteRule ^([0-9]+)/download/ /xfiles/$1 [L]
    RewriteRule ^([0-9]+)/download/ /$1/ [L,R=302]
    RewriteRule ^([0-9]+)/? /view.php?id=$1 [L]

    I got the ability to map a link like /100200300/download/ onto the file /xfiles/100200300 (where the file actually lives) if the client's IP address is present in iplist.txt, and otherwise to redirect the client to a page with a captcha. If the code from the picture is entered correctly, that page adds the client's IP to iplist.txt and redirects the client once again via Location, and this time the actual file download takes place.
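    The success branch of that captcha page might look like this (a sketch; only the iplist.txt path and the line format come from the article, the rest is my assumption):

```php
<?php
// view.php, after the captcha code has been checked successfully:
// grant this IP direct-download access and bounce the client back.
$ip     = $_SERVER['REMOTE_ADDR'];
$expire = time() + 3600;      // when the cron cleanup may drop the line

// The RewriteMap lookup takes "allow" as the value for this key;
// the "# timestamp" tail is only used by the cleanup script.
$line = sprintf("%s allow # %d\n", $ip, $expire);
file_put_contents('/home/anton/iplist.txt', $line, FILE_APPEND | LOCK_EX);

header('Location: /' . (int) $_GET['id'] . '/download/');
```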

    Each entry in iplist.txt is keyed by the client's IP address and looks like this:

    1.2.3.4 allow # 1207100906

    After the hash mark comes a timestamp that tells a special maintenance script when to delete this line. The maintenance script runs once per hour via crontab.
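    The maintenance script itself can be a few lines of PHP (my sketch; it assumes one entry per line in the `IP allow # timestamp` format described above):

```php
<?php
// prune expired lines from iplist.txt; run hourly from crontab, e.g.
//   0 * * * * php /home/anton/prune_iplist.php
$path = '/home/anton/iplist.txt';
$now  = time();
$kept = array();

foreach (file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
    // the part after '#' is the expiry timestamp
    $parts = explode('#', $line, 2);
    if (isset($parts[1]) && (int) trim($parts[1]) > $now) {
        $kept[] = $line;
    }
}

file_put_contents($path, $kept ? implode("\n", $kept) . "\n" : '', LOCK_EX);
```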

    As a result, I got a fairly correct and reliable system. Its drawback, perhaps, is inefficient use of RAM: each Apache process (with the prefork MPM) can serve only one client at a time, while the process itself consumes about 10 MB of memory. If 1000 dial-up users started downloading from the server simultaneously, 10 threads each, a collapse would occur and the memory would run out. The MaxClients directive in httpd.conf saves me from that. With nginx, such a problem could not arise in principle. Moreover, nginx has good facilities for restricting users, for example limiting the number of simultaneous connections from one address.
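    For example, a per-IP connection cap in nginx of that era could be sketched as follows (limit_zone is the old directive name; newer nginx versions use limit_conn_zone):

```nginx
http {
    # a shared 10 MB zone that counts connections per client address
    limit_zone perip $binary_remote_addr 10m;

    server {
        location /xfiles/ {
            # no more than 5 simultaneous downloads from one IP
            limit_conn perip 5;
        }
    }
}
```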

    Ways to upgrade to planetary hosting:
    - switching to nginx as the frontend
    - building a distributed system with a central server for receiving files
    - no longer storing all files in the single folder /xfiles/: creating 100 subfolders inside xfiles based on the first 2 characters of the file name
    - switching to SAS disks, because on the Internet the typical load is not an exchange of single heavy files but the constant download of hundreds of small ones (music, photos)
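    The subfolder idea fits in one small helper (a hypothetical function of my own, not the author's code):

```php
<?php
// map a numeric file id onto its sharded path, e.g.
// 100200300 -> /xfiles/10/100200300 (hypothetical helper)
function shard_path($id) {
    $id = (string) $id;
    return '/xfiles/' . substr($id, 0, 2) . '/' . $id;
}
```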

    A working version of what is described above:

    PS My first post on Habr.
