Writing a FUSE Wrapper with the Java Native Runtime
In this article I will show how to implement a file system in user space in Java, without a single line of kernel code. I will also show how to connect Java to native code without writing any C, while keeping performance as high as possible.
Interested? Read on.
Before you begin implementing a wrapper, you need to understand what FUSE is.
FUSE (Filesystem in Userspace) is a mechanism that lets non-privileged users create their own file systems without modifying kernel code. The file system code runs in user space, while the FUSE kernel module only provides a bridge to the kernel interfaces. FUSE was officially merged into the mainline Linux kernel in version 2.6.14.
In other words, by implementing just a few methods you can create your own file system (an example of a simple FS). The applications are endless: for example, you can quickly write a file system backed by Dropbox or GitHub.
Or consider this case: you have a business application that stores all user files in a database, and the client suddenly needs direct access to a directory on the server that contains all those files. Duplicating the files in both the database and the file system is hardly the best solution, and this is where a virtual file system comes to the rescue. You write a FUSE wrapper that fetches files from the database whenever they are accessed.
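The database-backed case can be sketched in plain Java, independently of any FUSE binding. This is only an illustration under assumptions: `BlobStore`, `InMemoryBlobStore` and `DbBackedReader` are hypothetical names, and an in-memory map stands in for the real database. A FUSE read callback could delegate to something like `read` below.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction over the database that stores user files.
interface BlobStore {
    byte[] load(String path); // returns null when the path is unknown
}

// Stand-in for a real database: an in-memory map keyed by path.
class InMemoryBlobStore implements BlobStore {
    private final Map<String, byte[]> rows = new HashMap<>();

    void put(String path, String content) {
        rows.put(path, content.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public byte[] load(String path) {
        return rows.get(path);
    }
}

// What a read callback could delegate to: serve a byte range of a stored file.
class DbBackedReader {
    private final BlobStore store;

    DbBackedReader(BlobStore store) {
        this.store = store;
    }

    // Returns the requested slice, an empty array past EOF, or null if the file does not exist.
    byte[] read(String path, long offset, int size) {
        if (offset < 0 || size < 0) {
            throw new IllegalArgumentException("offset and size must be non-negative");
        }
        byte[] data = store.load(path);
        if (data == null) {
            return null;
        }
        if (offset >= data.length) {
            return new byte[0];
        }
        int from = (int) offset;
        int to = (int) Math.min((long) data.length, offset + size);
        return Arrays.copyOfRange(data, from, to);
    }
}
```

The files exist only in the store; the file system layer merely translates a path and a byte range into a lookup, so nothing is duplicated on disk.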
Java and native code
Great, but a FUSE implementation starts with “include the header file <fuse.h>”, and your business application is written in Java. Clearly, you need some way to interact with native code.
JNI
The standard tool is JNI, but it adds a lot of complexity to a project, especially since implementing FUSE requires callbacks from native code into Java classes. “Write once, run anywhere” also suffers, although for FUSE that matters less.
In fact, if you look for FUSE wrappers built on JNI, you will find several projects, but they have not been maintained for a long time and offer a clumsy API.
JNA
Another option is the JNA library. JNA (Java Native Access) makes it fairly easy to call native code without JNI, writing only Java code. It is quite simple: you declare an interface that mirrors the native functions, obtain its implementation through “Native.loadLibrary”, and use it. Another advantage of JNA is its very detailed documentation. The project is alive and actively developed.
Moreover, there is already an excellent project, fuse-jna, that implements a FUSE wrapper on top of JNA.
However, JNA has certain performance problems. JNA is based on reflection, and crossing from native code into Java while converting every structure to Java objects is very expensive. This is barely noticeable when native calls are rare, but that is not the case for a file system. The only way to speed up fuse-jna is to read files in large chunks, and that does not always work: for example, when you have no control over the client code, or when all the files are small, such as a large number of text files.
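A back-of-the-envelope sketch shows why larger chunks help and why they do nothing for small files. It assumes, as a simplification, that every read request costs exactly one native-to-Java callback and ignores all other callbacks; `ReadCallCount` is a hypothetical helper, not part of any library.

```java
class ReadCallCount {
    // Callbacks needed to read one file of fileSize bytes in requests of chunkSize bytes.
    static long callbacks(long fileSize, long chunkSize) {
        if (fileSize < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and chunkSize > 0");
        }
        return (fileSize + chunkSize - 1) / chunkSize; // ceiling division
    }

    public static void main(String[] args) {
        long file = 16L * 1024 * 1024; // one 16 MiB file
        System.out.println("4 KiB chunks:   " + callbacks(file, 4 * 1024));
        System.out.println("128 KiB chunks: " + callbacks(file, 128 * 1024));
        // 1000 small files of 2 KiB each: one callback per file, whatever the chunk size
        System.out.println("small files:    " + 1000 * callbacks(2 * 1024, 128 * 1024));
    }
}
```

For one large file the number of expensive transitions drops in proportion to the chunk size, but with many small files each file still costs at least one callback, so the per-call overhead cannot be amortized.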
Clearly, a library was bound to appear that combines the performance of JNI with the convenience of JNA.
JNR
This is where JNR (Java Native Runtime) comes in. Like JNA, JNR is based on libffi, but it uses bytecode generation instead of reflection, which gives it a huge performance advantage.
There is not much information about JNR; the most detailed source is Charles Nutter's presentation at JVMLS 2013. Nevertheless, JNR is already a fairly large ecosystem that is actively used by JRuby. Many of its parts, such as unix-sockets and posix-api, are also actively used by third-party projects.
JNR is also the basis for JEP 191 (Foreign Function Interface), which is targeted at Java 10.
Unlike JNA, JNR has no documentation, so all the answers have to be found in the source code. That was the main reason for writing this short guide.
Writing code for Java Native Runtime
Function binding
A simple libc binding looks like this:
import jnr.ffi.*;
import jnr.ffi.types.pid_t;

/**
 * Gets the process ID of the current process, and that of its parent.
 */
public class Getpid {
    public interface LibC {
        public @pid_t long getpid();
        public @pid_t long getppid();
    }

    public static void main(String[] args) {
        LibC libc = LibraryLoader.create(LibC.class).load("c");
        System.out.println("pid=" + libc.getpid() + " parent pid=" + libc.getppid());
    }
}
LibraryLoader loads the library by name and binds it to the interface you pass in.
In the case of FUSE, you need an interface with the fuse_main_real method, which receives the FuseOperations structure containing all the callbacks.
public interface LibFuse {
    int fuse_main_real(int argc, String[] argv, FuseOperations op, int op_size, Pointer user_data);
}
Struct implementation
It is often necessary to work with structures located at a specific address, for example, the fuse_bufvec structure:
struct fuse_bufvec {
    size_t count;
    size_t idx;
    size_t off;
    struct fuse_buf buf[1];
};
To implement it in JNR, extend jnr.ffi.Struct.
import jnr.ffi.*;

public class FuseBufvec extends Struct {
    public FuseBufvec(jnr.ffi.Runtime runtime) {
        super(runtime);
    }

    // Fields are declared in the same order as in the C structure
    public final size_t count = new size_t();
    public final size_t idx = new size_t();
    public final size_t off = new size_t();
    public final FuseBuf buf = inner(new FuseBuf(getRuntime()));
}
Each structure holds a pointer to the memory where it is located. Most of the API for working with structures can be discovered by looking at the static methods of Struct.
size_t is an inner class of Struct. When a field is created, its offset within the structure is recorded, so each field knows where in memory it lives. Many inner classes are already implemented (for example Signed64, Unsigned32, and time_t), and you can always implement your own.
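The offset bookkeeping can be illustrated with a small plain-Java model. This is not the jnr.ffi.Struct API: `MiniStruct`, `UInt64` and `MiniBufvec` are hypothetical names, a `ByteBuffer` stands in for native memory, and every field is assumed to be 8 bytes wide (as size_t is on 64-bit Linux), so no padding is modelled.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Plain-Java model of the idea; NOT the jnr.ffi.Struct API.
abstract class MiniStruct {
    private int size = 0;      // running size, i.e. the offset of the next field
    private ByteBuffer memory; // the memory this struct is placed at

    // Attach the struct to a region of memory.
    void useMemory(ByteBuffer buffer) {
        this.memory = buffer.order(ByteOrder.nativeOrder());
    }

    int size() {
        return size;
    }

    // A 64-bit field that records its own offset when it is constructed.
    final class UInt64 {
        private final int offset;

        UInt64() {
            this.offset = size; // every field is 8 bytes wide here, so no padding is needed
            size += Long.BYTES;
        }

        int offset() {
            return offset;
        }

        long get() {
            return memory.getLong(offset);
        }

        void set(long value) {
            memory.putLong(offset, value);
        }
    }
}

// Mirrors the first three fields of fuse_bufvec.
class MiniBufvec extends MiniStruct {
    final UInt64 count = new UInt64();
    final UInt64 idx = new UInt64();
    final UInt64 off = new UInt64();
}
```

Because Java runs field initializers in declaration order, declaring the fields in the same order as the C structure is what makes the recorded offsets line up with the native layout.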
Callbacks
struct fuse_operations {
    int (*getattr) (const char *, struct stat *);
};
JNR provides the @Delegate annotation for working with callbacks.
public interface GetAttrCallback {
    @Delegate
    int getattr(String path, Pointer stbuf);
}

public class FuseOperations extends Struct {
    // Fully qualified to avoid a clash with java.lang.Runtime
    public FuseOperations(jnr.ffi.Runtime runtime) {
        super(runtime);
    }

    public final Func<GetAttrCallback> getattr = func(GetAttrCallback.class);
}
You can then assign the desired callback implementation to the getattr field, for example:
fuseOperations.getattr.set((path, stbuf) -> 0);
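The lambda above always returns 0. A slightly more realistic callback distinguishes existing paths from missing ones; FUSE callbacks report failure by returning a negated errno value. The sketch below is a plain-Java stand-in so that it runs without JNR: `GetAttr` and `CallbackDemo` are hypothetical names, and the stat buffer argument is omitted for brevity.

```java
import java.util.Set;

// Plain-Java stand-in for the callback interface; the stat buffer is omitted.
interface GetAttr {
    int getattr(String path);
}

class CallbackDemo {
    static final int ENOENT = 2; // errno for "No such file or directory" on Linux

    // Builds a callback that succeeds for known paths and fails for everything else.
    static GetAttr forPaths(Set<String> known) {
        return path -> known.contains(path) ? 0 : -ENOENT;
    }

    public static void main(String[] args) {
        GetAttr callback = forPaths(Set.of("/", "/hello"));
        System.out.println(callback.getattr("/hello"));
        System.out.println(callback.getattr("/nope"));
    }
}
```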
Enum
One less obvious point worth mentioning is wrapping enums: your enum needs to implement jnr.ffi.util.EnumMapper.IntegerEnum and its intValue method.
enum fuse_buf_flags {
    FUSE_BUF_IS_FD = (1 << 1),
    FUSE_BUF_FD_SEEK = (1 << 2),
    FUSE_BUF_FD_RETRY = (1 << 3),
};
public enum FuseBufFlags implements EnumMapper.IntegerEnum {
    FUSE_BUF_IS_FD(1 << 1),
    FUSE_BUF_FD_SEEK(1 << 2),
    FUSE_BUF_FD_RETRY(1 << 3);

    private final int value;

    FuseBufFlags(int value) {
        this.value = value;
    }

    @Override
    public int intValue() {
        return value;
    }
}
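Since these constants are bit flags, it is often handy to combine several of them into one int and to decode an int back into a set. The following sketch shows one way to do that; `IntValued` is a hypothetical stand-in for EnumMapper.IntegerEnum so the code compiles without JNR, and `toMask` and `fromMask` are illustrative helpers rather than part of any library.

```java
import java.util.EnumSet;
import java.util.Set;

// Stand-in for jnr.ffi.util.EnumMapper.IntegerEnum so the sketch compiles without JNR.
interface IntValued {
    int intValue();
}

enum BufFlag implements IntValued {
    FUSE_BUF_IS_FD(1 << 1),
    FUSE_BUF_FD_SEEK(1 << 2),
    FUSE_BUF_FD_RETRY(1 << 3);

    private final int value;

    BufFlag(int value) {
        this.value = value;
    }

    @Override
    public int intValue() {
        return value;
    }

    // OR the native values of all flags into one int.
    static int toMask(Set<BufFlag> flags) {
        int mask = 0;
        for (BufFlag flag : flags) {
            mask |= flag.intValue();
        }
        return mask;
    }

    // Recover the set of flags whose bit is set in the mask.
    static EnumSet<BufFlag> fromMask(int mask) {
        EnumSet<BufFlag> result = EnumSet.noneOf(BufFlag.class);
        for (BufFlag flag : values()) {
            if ((mask & flag.intValue()) != 0) {
                result.add(flag);
            }
        }
        return result;
    }
}
```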
Working with memory
- jnr.ffi.Pointer is a wrapper over a raw pointer for working with memory directly.
- Memory can be allocated with jnr.ffi.Memory.
- The entry point to the JNR API is jnr.ffi.Runtime.
This knowledge is enough to implement a simple cross-platform wrapper over a native library.
jnr-fuse
Which is exactly what I did with FUSE in my jnr-fuse project. I originally used the fuse-jna library, but it turned out to be the bottleneck in my file system implementation. When designing the API, I tried to stay as compatible as possible with both fuse-jna and the native interface (<fuse.h>).
To implement your own file system in user space, extend ru.serce.jnrfuse.FuseStubFS and implement the methods you need. fuse_operations contains many methods, but a working FS needs only a few basic ones.
It is quite simple; here are some examples of working file systems.
Currently Linux is supported (x86 and x64).
The library is available in jcenter; I will add a Maven Central mirror in the near future.
Gradle
repositories {
    jcenter()
}
dependencies {
    compile 'com.github.serceman:jnr-fuse:0.1'
}
Maven
<repositories>
    <repository>
        <id>central</id>
        <name>bintray</name>
        <url>http://jcenter.bintray.com</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>com.github.serceman</groupId>
        <artifactId>jnr-fuse</artifactId>
        <version>0.1</version>
    </dependency>
</dependencies>
Comparing fuse-jna and jnr-fuse performance
In my case the FS was read-only and I was interested specifically in throughput. Performance depends heavily on your FS implementation, so if you happen to be using fuse-jna, you can easily plug in jnr-fuse, write a test that matches your load profile, and see the difference. (Such a test is useful anyway; we all love chasing performance, right?)
To show the order of magnitude of the difference, I ported the MemoryFS implementation from fuse-jna to jnr-fuse with minimal changes and ran a read test with fio, a benchmarking tool that was covered in a good article on Habr not long ago.
Test configuration
[readtest]
blocksize=4k
directory=/tmp/mnt/
rw=randread
direct=1
buffered=0
ioengine=libaio
time_based=60
size=16M
runtime=60
fuse-jna result
serce@SerCe-FastLinux:~/git/jnr-fuse/bench$ fio read.ini
readtest: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
fio-2.1.3
Starting 1 process
readtest: Laying out IO file(s) (1 file(s) / 16MB)
Jobs: 1 (f=1): [r] [100.0% done] [24492KB/0KB/0KB /s] [6123/0/0 iops] [eta 00m:00s]
readtest: (groupid=0, jobs=1): err= 0: pid=10442: Sun Jun 21 14:49:13 2015
read: io=1580.2MB, bw=26967KB/s, iops=6741, runt= 60000msec
slat (usec): min=46, max=29997, avg=146.55, stdev=327.68
clat (usec): min=0, max=69, avg= 0.47, stdev= 0.66
lat (usec): min=47, max=30002, avg=147.26, stdev=327.88
clat percentiles (usec):
| 1.00th=[ 0], 5.00th=[ 0], 10.00th=[ 0], 20.00th=[ 0],
| 30.00th=[ 0], 40.00th=[ 0], 50.00th=[ 0], 60.00th=[ 1],
| 70.00th=[ 1], 80.00th=[ 1], 90.00th=[ 1], 95.00th=[ 1],
| 99.00th=[ 2], 99.50th=[ 2], 99.90th=[ 3], 99.95th=[ 12],
| 99.99th=[ 14]
bw (KB /s): min=17680, max=32606, per=96.09%, avg=25913.26, stdev=3156.20
lat (usec): 2=97.95%, 4=1.96%, 10=0.02%, 20=0.06%, 50=0.01%
lat (usec): 100=0.01%
cpu: usr=1.98%, sys=5.94%, ctx=405302, majf=0, minf=28
IO depths: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued: total=r=404511/w=0/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
READ: io=1580.2MB, aggrb=26967KB/s, minb=26967KB/s, maxb=26967KB/s, mint=60000msec, maxt=60000msec
jnr-fuse result
serce@SerCe-FastLinux:~/git/jnr-fuse/bench$ fio read.ini
readtest: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
fio-2.1.3
Starting 1 process
readtest: Laying out IO file(s) (1 file(s) / 16MB)
Jobs: 1 (f=1): [r] [100.0% done] [208.5MB/0KB/0KB /s] [53.4K/0/0 iops] [eta 00m:00s]
readtest: (groupid=0, jobs=1): err= 0: pid=10153: Sun Jun 21 14:45:17 2015
read: io=13826MB, bw=235955KB/s, iops=58988, runt= 60002msec
slat (usec): min=6, max=23671, avg=15.80, stdev=19.97
clat (usec): min=0, max=1028, avg= 0.37, stdev= 0.78
lat (usec): min=7, max=23688, avg=16.29, stdev=20.03
clat percentiles (usec):
| 1.00th=[ 0], 5.00th=[ 0], 10.00th=[ 0], 20.00th=[ 0],
| 30.00th=[ 0], 40.00th=[ 0], 50.00th=[ 0], 60.00th=[ 0],
| 70.00th=[ 1], 80.00th=[ 1], 90.00th=[ 1], 95.00th=[ 1],
| 99.00th=[ 1], 99.50th=[ 1], 99.90th=[ 2], 99.95th=[ 2],
| 99.99th=[ 10]
lat (usec): 2=99.88%, 4=0.10%, 10=0.01%, 20=0.01%, 50=0.01%
lat (usec): 100=0.01%, 250=0.01%
lat (msec): 2=0.01%
cpu: usr=9.33%, sys=34.01%, ctx=3543137, majf=0, minf=28
IO depths: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued: total=r=3539449/w=0/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
READ: io=13826MB, aggrb=235955KB/s, minb=235955KB/s, maxb=235955KB/s, mint=60002msec, maxt=60002msec
The test only shows the difference in file read speed between fuse-jna and jnr-fuse, but it gives an idea of the speed difference between JNA and JNR. Anyone interested can write more detailed benchmarks of native calls with JMH, taking all the subtleties into account; I would be interested to see the results myself.
As expected, and in line with Charles Nutter's presentation, the difference between JNR and JNA in both throughput and latency is roughly 10x.
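The ratio can be checked directly against the two fio reports above. The short calculation below uses the reported bandwidth (bw) and average total latency (lat avg) from those reports; `FioRatio` is just a hypothetical helper for the arithmetic. Both ratios come out at roughly 9, which is consistent with the order-of-magnitude estimate.

```java
class FioRatio {
    static double ratio(double numerator, double denominator) {
        return numerator / denominator;
    }

    public static void main(String[] args) {
        // Values copied from the fio reports above
        double jnaBandwidth = 26967;  // KB/s, fuse-jna
        double jnrBandwidth = 235955; // KB/s, jnr-fuse
        double jnaLatency = 147.26;   // usec, fuse-jna, average total latency
        double jnrLatency = 16.29;    // usec, jnr-fuse, average total latency

        System.out.printf("throughput ratio: %.2f%n", ratio(jnrBandwidth, jnaBandwidth));
        System.out.printf("latency ratio:    %.2f%n", ratio(jnaLatency, jnrLatency));
    }
}
```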
References
- FUSE on SourceForge
- JNR on GitHub
- Presentation by Charles Nutter about JNR
- JEP 191
- hello-fuse in Java / hello-fuse in C
The jnr-fuse project is hosted on GitHub. I would appreciate stars, pull requests, and suggestions for improving the project.
I will also be happy to answer all your questions about JNR and jnr-fuse.