Using apr_socket_sendfile() from servlets under Tomcat
In this post I describe a small but effective way to send files to a user from a servlet over HTTP. What is used:
- Apache Tomcat
- Apache Portable Runtime Library
- Apache Tomcat Native Library
- A servlet of your own that needs to serve files to the user
Of course, serving files from a servlet is not great for performance. First, static content is best served without any application code at all, but sometimes you cannot avoid it. Second, sending the data most often boils down to something like this:
    // Classic copy loop: "in" is the file's InputStream, "out" is the response OutputStream
    long written = 0;
    byte[] buffer = new byte[BUFFER_LENGTH];
    int read;
    while ((read = in.read(buffer, 0, BUFFER_LENGTH)) != -1) {
        out.write(buffer, 0, read);
        written += read;
    }
After reading a book or two on NIO and looking very closely, you can rework this into something slightly more efficient:
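    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.nio.channels.Channels;
    import java.nio.channels.FileChannel;
    import java.nio.channels.WritableByteChannel;

    public static long transfer(File file, OutputStream out) throws IOException {
        return transfer(file, 0, file.length(), out);
    }

    public static long transfer(File file, long position, long count,
            OutputStream out) throws IOException {
        try (FileChannel in = new FileInputStream(file).getChannel()) {
            WritableByteChannel target = Channels.newChannel(out);
            long written = 0;
            // transferTo may move fewer bytes than requested, so loop until done
            while (written < count) {
                long n = in.transferTo(position + written, count - written, target);
                if (n <= 0) {
                    break; // end of file, or nothing more could be written
                }
                written += n;
            }
            return written;
        }
    }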
However, anyone who has carefully studied their server's stack traces knows that Tomcat's OutputStream does not support channel-based transfer, so everything comes down to the first example again, only this time deep inside the JVM.
The obvious disadvantages of this approach are:
- Performance. Java code in this role is bound to be slower than native code that could copy directly from the file to the output.
- Memory usage. Developers find it hard to resist wrapping every stream in a couple of Buffered(Input|Output)Stream layers. As a result, each chunk of the file passes through three or four places in RAM (remember the operating system's disk cache and, most likely, some TCP/IP buffer).
- CPU usage. The code burns processor time copying chunks of data back and forth.
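The fallback described above is easy to observe outside Tomcat. The sketch below is only an illustration: CountingOutputStream is a hypothetical stand-in for a container's OutputStream, not a Tomcat class. It pushes a temporary file through transferTo and counts how the data arrives. On the JDKs I am aware of, a plain OutputStream receives the file as a series of ordinary write(byte[], int, int) calls, which is the first example in disguise.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChannelFallbackDemo {

    /** Hypothetical stand-in for a container's OutputStream: it only counts what it receives. */
    static final class CountingOutputStream extends OutputStream {
        long bytes;
        long writeCalls;

        @Override
        public void write(int b) {
            bytes++;
            writeCalls++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            bytes += len;
            writeCalls++;
        }
    }

    /** Pushes the whole file through transferTo and returns the counting stream. */
    static CountingOutputStream copyThroughChannel(Path file) throws IOException {
        CountingOutputStream out = new CountingOutputStream();
        WritableByteChannel target = Channels.newChannel(out);
        try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = in.size();
            long done = 0;
            while (done < size) {
                long n = in.transferTo(done, size - done, target);
                if (n <= 0) {
                    break; // nothing more to transfer
                }
                done += n;
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sendfile-demo", ".bin");
        try {
            Files.write(tmp, new byte[1 << 20]); // 1 MiB of zeros
            CountingOutputStream out = copyThroughChannel(tmp);
            System.out.println(out.bytes + " bytes in " + out.writeCalls + " write() calls");
        } finally {
            Files.delete(tmp);
        }
    }
}
```

The exact number of write() calls depends on the JDK's internal buffer size, so the demo prints it rather than assuming a value.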
However, Apache Tomcat lets servlets use the apr_socket_sendfile function from the Apache Portable Runtime library through a convenient interface. The function takes a pointer to a socket, a pointer to a file, and the start and length of the data to send (so you can send part of a file, not only the whole thing). Servlets reach this functionality through request attributes (HttpServletRequest). To check whether it is available:
    private static final String TOMCAT_SENDFILE_SUPPORT = "org.apache.tomcat.sendfile.support";

    // The attribute is Boolean.TRUE when sendfile is available for this request
    final boolean sendFileSupport = Boolean.TRUE.equals(
            request.getAttribute(TOMCAT_SENDFILE_SUPPORT));
Now, if:
- sendFileSupport is true,
- the file will not be deleted right after the servlet code finishes, and
- the file is smaller than 2 GB,
then instead of transferring the file yourself you can hand the job over to Apache Tomcat:
    private static final String TOMCAT_SENDFILE_FILENAME = "org.apache.tomcat.sendfile.filename";
    private static final String TOMCAT_SENDFILE_START = "org.apache.tomcat.sendfile.start";
    private static final String TOMCAT_SENDFILE_END = "org.apache.tomcat.sendfile.end";

    // using Apache APR and/or NIO to transfer file
    response.setBufferSize(1 << 18);
    request.setAttribute(TOMCAT_SENDFILE_FILENAME, file.getCanonicalPath());
    request.setAttribute(TOMCAT_SENDFILE_START, Long.valueOf(0));
    request.setAttribute(TOMCAT_SENDFILE_END, Long.valueOf(fileLength));
The second condition is there because the file transfer starts only after the servlet has finished its work. The reason for the third is unclear to me; it may be down to the 32-bit JVM and 32-bit Gentoo on my test machine (Tomcat refused to serve a file larger than 2 GB).
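The check and the conditions can be gathered into one small helper. This is a minimal sketch, not Tomcat code: the class SendfilePlan and its plan method are hypothetical names, and it returns the attributes as a Map so it can run without a servlet container. The 2 GB ceiling is simply the limit observed on my test machine, not a documented constant.

```java
import java.io.File;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class SendfilePlan {
    static final String FILENAME = "org.apache.tomcat.sendfile.filename";
    static final String START = "org.apache.tomcat.sendfile.start";
    static final String END = "org.apache.tomcat.sendfile.end";

    // Assumption: 2 GB is the ceiling observed on a 32-bit test setup.
    static final long TWO_GB = 1L << 31;

    /**
     * Returns the request attributes to set for sendfile, or an empty map
     * when the servlet should fall back to copying the file itself.
     *
     * @param supportAttribute value of the "org.apache.tomcat.sendfile.support" request attribute
     */
    static Map<String, Object> plan(Object supportAttribute, File file) throws IOException {
        Map<String, Object> attrs = new LinkedHashMap<>();
        long length = file.length();
        boolean supported = Boolean.TRUE.equals(supportAttribute);
        if (!supported || !file.isFile() || length >= TWO_GB) {
            return attrs; // fall back to the manual copy loop
        }
        attrs.put(FILENAME, file.getCanonicalPath());
        attrs.put(START, Long.valueOf(0));
        attrs.put(END, Long.valueOf(length));
        return attrs;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("sendfile-plan", ".bin");
        tmp.deleteOnExit();
        System.out.println(plan(Boolean.TRUE, tmp)); // attributes for Tomcat
        System.out.println(plan(null, tmp));         // empty: copy the file manually
    }
}
```

In a servlet, each entry would be copied over with request.setAttribute(key, value), after which the servlet returns without writing the body. It is also worth checking the Tomcat documentation for your version to see whether the servlet must set Content-Length itself.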
As a result:
- The number of busy Java server threads dropped by a factor of two to three, since files are now transferred in separate native threads
- CPU usage went down, since APR uses operating system facilities to optimize file transfer
- Less garbage is left in the heap, which helps the garbage collector
Of course, a production system has to be able to serve not only the whole file but also parts of it, and to handle the case where the user already has the file (NotModifiedSince handling, that is, answering If-Modified-Since requests with 304 Not Modified).
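Both requirements reduce to a little header arithmetic before the sendfile attributes are set. The sketch below is an assumption-laden illustration, not code from any library: PartialRequests, parseRange and notModified are hypothetical names, only a single byte range is handled, and the end offset is treated as exclusive, in line with the whole-file example that passes the file length as the end. The -1 convention for a missing date follows HttpServletRequest.getDateHeader.

```java
import java.util.Arrays;

public class PartialRequests {

    /**
     * Parses a single-range header such as "bytes=100-199", "bytes=100-" or "bytes=-500".
     * Returns {start, endExclusive}, or null when the header is absent, malformed,
     * multi-range or unsatisfiable (the caller then sends the whole file or an error).
     */
    static long[] parseRange(String header, long fileLength) {
        if (header == null || !header.startsWith("bytes=") || header.indexOf(',') >= 0) {
            return null;
        }
        String spec = header.substring("bytes=".length()).trim();
        int dash = spec.indexOf('-');
        if (dash < 0) {
            return null;
        }
        String first = spec.substring(0, dash).trim();
        String last = spec.substring(dash + 1).trim();
        try {
            long start;
            long endInclusive;
            if (first.isEmpty()) {
                // Suffix form "bytes=-500": the last 500 bytes of the file.
                if (last.isEmpty()) {
                    return null;
                }
                long suffix = Long.parseLong(last);
                if (suffix <= 0) {
                    return null;
                }
                start = Math.max(0, fileLength - suffix);
                endInclusive = fileLength - 1;
            } else {
                start = Long.parseLong(first);
                endInclusive = last.isEmpty()
                        ? fileLength - 1
                        : Math.min(Long.parseLong(last), fileLength - 1);
            }
            if (start < 0 || start > endInclusive) {
                return null;
            }
            return new long[] { start, endInclusive + 1 };
        } catch (NumberFormatException e) {
            return null;
        }
    }

    /** True when the client's copy is still current; HTTP dates have one-second resolution. */
    static boolean notModified(long ifModifiedSinceMillis, long lastModifiedMillis) {
        if (ifModifiedSinceMillis < 0) {
            return false; // header absent
        }
        return lastModifiedMillis / 1000 <= ifModifiedSinceMillis / 1000;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseRange("bytes=100-199", 1000)));
        System.out.println(notModified(5000, 5999));
    }
}
```

The two values returned by parseRange would go into the start and end attributes in place of 0 and the file length, and a true result from notModified means the servlet can answer with 304 and skip the transfer entirely.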
For further study
- Apache Portable Runtime and Tomcat - about this and other features
- Apache Portable Runtime
- FileFieldBehaviour - a class in Arp.Site that handles file requests, including support for resumable downloads and NotModifiedSince