Working with C unions in Rust FFI

This is a translation of the article "Working with C unions in Rust FFI" by Herman J. Radtke III.

Note: This article assumes that the reader is familiar with Rust FFI, byte order (endianness), and ioctl.


When writing bindings for C code, we will inevitably come across a structure that contains a union. Rust does not have built-in support for unions, so we will have to work out a strategy on our own. In C, a union is a type that stores different types of data in the same memory area. There are many reasons to prefer a union, such as converting between the binary representations of integers and floating-point numbers, implementing pseudo-polymorphism, and accessing bits directly. I will focus on pseudo-polymorphism.

As an example, let's get a MAC address based on the interface name. We list the steps necessary to obtain it:

  • Specify the type of request that will be used with ioctl. If I want to get the MAC (or hardware) address, I specify SIOCGIFHWADDR.
  • Write the name of the interface (something like eth0) in ifr_name.
  • Make a request using ioctl. As a result of a successful request, the data will be written to ifr_ifru.
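The second step above, writing the interface name into ifr_name, can be sketched in Rust as follows. This is a minimal sketch, not the author's code: the helper name ifr_name_from_str is hypothetical, and IFNAMSIZ is assumed to be 16, matching the buffer size used later in the article.

```rust
use std::os::raw::c_char;

/// Assumed size of `ifr_name`; mirrors the C constant IFNAMSIZ.
const IFNAMSIZ: usize = 16;

/// Hypothetical helper: copy an interface name such as "eth0" into a
/// fixed-size, zero-filled buffer suitable for `ifr_name`.
/// Returns None if the name does not fit (one byte is reserved for the
/// terminating NUL) or if it contains an interior NUL byte.
fn ifr_name_from_str(name: &str) -> Option<[c_char; IFNAMSIZ]> {
    let bytes = name.as_bytes();
    if bytes.len() >= IFNAMSIZ || bytes.contains(&0) {
        return None;
    }
    // Start from all zeroes so the name is always NUL-terminated.
    let mut buf = [0 as c_char; IFNAMSIZ];
    for (dst, src) in buf.iter_mut().zip(bytes) {
        *dst = *src as c_char;
    }
    Some(buf)
}
```

Returning an Option keeps over-long names from being silently truncated, which would otherwise query a different interface than the one requested.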

If you are interested in the details of obtaining a MAC address, see this guide.

We need to call the C function ioctl and pass it an ifreq structure. Looking at /usr/include/net/if.h, we see that ifreq is defined as follows:

struct  ifreq {
        char    ifr_name[IFNAMSIZ];
        union {
                struct  sockaddr ifru_addr;
                struct  sockaddr ifru_dstaddr;
                struct  sockaddr ifru_broadaddr;
                short   ifru_flags;
                int     ifru_metric;
                int     ifru_mtu;
                int     ifru_phys;
                int     ifru_media;
                int     ifru_intval;
                caddr_t ifru_data;
                struct  ifdevmtu ifru_devmtu;
                struct  ifkpi   ifru_kpi;
                u_int32_t ifru_wake_flags;
                u_int32_t ifru_route_refcnt;
                int     ifru_cap[2];
        } ifr_ifru;
};

The difficulty lies in the ifr_ifru union. Looking at its possible types, we see that they are not all the same size: short takes two bytes and u_int32_t takes four. Several structures of unknown size complicate the situation further. To write correct Rust code, it is important to know the exact size of the ifreq structure. I wrote a small C program and found that ifreq uses 16 bytes for ifr_name and 24 bytes for ifr_ifru.

Armed with the correct size of the structure, we can begin to represent it in Rust. One strategy is to create a specialized structure for each type in the union.

#[repr(C)]
pub struct IfReqShort {
    ifr_name: [c_char; 16],
    ifru_flags: c_short,
}

We can use IfReqShort for a SIOCGIFFLAGS request, which returns its result in the short ifru_flags field. This structure is smaller than the ifreq structure in C. Although we expect only 2 bytes to be written, the external ioctl interface expects the union to occupy 24 bytes. To be safe, let's add 22 bytes of padding at the end:

#[repr(C)]
pub struct IfReqShort {
    ifr_name: [c_char; 16],
    ifru_flags: c_short,
    _padding: [u8; 22],
}

Then we will have to repeat this process for each type in the union. I find this somewhat tedious, since we have to create many structures and be very careful not to make a mistake with their size. Another way to represent a union is to have a raw byte buffer. We can make a single representation of the ifreq structure in Rust as follows:

#[repr(C)]
pub struct IfReq {
    ifr_name: [c_char; 16],
    union: [u8; 24],
}
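Because the whole approach depends on the Rust structure having the same size as the C one, it may be worth checking the size in code. The following is a minimal sketch that assumes the 16 + 24 = 40 byte layout measured above; the helper ifreq_layout is hypothetical and only there to show the values.

```rust
use std::mem::{align_of, size_of};
use std::os::raw::c_char;

#[repr(C)]
pub struct IfReq {
    ifr_name: [c_char; 16],
    union: [u8; 24],
}

// Compile-time check: the build fails if the layout drifts away from
// the 40 bytes (16 for the name, 24 for the union) measured on the C side.
const _: () = assert!(size_of::<IfReq>() == 40);

/// Hypothetical helper that reports (size, alignment) at run time.
pub fn ifreq_layout() -> (usize, usize) {
    (size_of::<IfReq>(), align_of::<IfReq>())
}
```

Note that this checks the size only. The alignment of this representation is one byte, since both fields are byte arrays.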

This union buffer can hold the bytes of any of the types. Now we can define methods that convert the raw bytes to the desired type. We will avoid unsafe code by not using transmute. Let's create a method that gets the MAC address by converting the raw bytes to the C type sockaddr.

impl IfReq {
    pub fn ifr_hwaddr(&self) -> sockaddr {
        // The first two bytes hold sa_family in native byte order.
        let mut s = sockaddr {
            sa_family: u16::from_be((self.union[0] as u16) << 8 | (self.union[1] as u16)),
            sa_data: [0; 14],
        };
        // basically a memcpy of the 14 address bytes
        for (i, b) in self.union[2..16].iter().enumerate() {
            s.sa_data[i] = *b as i8;
        }
        s
    }
}

This approach leaves us with one structure and one method for converting raw bytes to the desired type. Looking again at our ifr_ifru union, we find at least two other requests that also need a sockaddr built from raw bytes. Following the DRY principle, we could implement a private IfReq method that converts raw bytes to sockaddr. However, we can do better by abstracting the details of creating sockaddr, short, int, and so on away from IfReq. All we need is to tell the union which type we want. Let's create an IfReqUnion for this:

#[repr(C)]
struct IfReqUnion {
    data: [u8; 24],
}
impl IfReqUnion {
    fn as_sockaddr(&self) -> sockaddr {
        let mut s = sockaddr {
            sa_family: u16::from_be((self.data[0] as u16) << 8 | (self.data[1] as u16)),
            sa_data: [0; 14],
        };
        // basically a memcpy
        for (i, b) in self.data[2..16].iter().enumerate() {
            s.sa_data[i] = *b as i8;
        }
        s
    }
    fn as_int(&self) -> c_int {
        c_int::from_be((self.data[0] as c_int) << 24 |
                       (self.data[1] as c_int) << 16 |
                       (self.data[2] as c_int) <<  8 |
                       (self.data[3] as c_int))
    }
    fn as_short(&self) -> c_short {
        c_short::from_be((self.data[0] as c_short) << 8 |
                         (self.data[1] as c_short))
    }
}
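The shift-and-from_be pattern above amounts to reading the bytes in native byte order. As a hedged alternative, the standard library's from_ne_bytes expresses the same conversion more directly. The sketch below redefines IfReqUnion locally so that it stands on its own; as_short_shifted is a hypothetical name for the article's original form, kept only for comparison.

```rust
use std::os::raw::{c_int, c_short};

/// Stand-in for the article's IfReqUnion, redefined so the sketch compiles alone.
#[repr(C)]
struct IfReqUnion {
    data: [u8; 24],
}

impl IfReqUnion {
    /// Native-endian read of the first two bytes.
    fn as_short(&self) -> c_short {
        c_short::from_ne_bytes([self.data[0], self.data[1]])
    }

    /// Native-endian read of the first four bytes.
    fn as_int(&self) -> c_int {
        c_int::from_ne_bytes([self.data[0], self.data[1], self.data[2], self.data[3]])
    }

    /// The shift-based form from the article, kept for comparison.
    fn as_short_shifted(&self) -> c_short {
        c_short::from_be((self.data[0] as c_short) << 8 | (self.data[1] as c_short))
    }
}
```

Both forms should return the same value on little-endian and big-endian targets; the from_ne_bytes version simply avoids the manual shifting.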

We have implemented methods for each of the types that make up the union. Now that our conversions are controlled by IfReqUnion, we can implement the IfReq methods as follows:

#[repr(C)]
pub struct IfReq {
    // IFNAMESIZE is assumed to be a constant equal to 16 (IFNAMSIZ in C)
    ifr_name: [c_char; IFNAMESIZE],
    union: IfReqUnion,
}
impl IfReq {
    pub fn ifr_hwaddr(&self) -> sockaddr {
        self.union.as_sockaddr()
    }
    pub fn ifr_dstaddr(&self) -> sockaddr {
        self.union.as_sockaddr()
    }
    pub fn ifr_broadaddr(&self) -> sockaddr {
        self.union.as_sockaddr()
    }
    pub fn ifr_ifindex(&self) -> c_int {
        self.union.as_int()
    }
    pub fn ifr_media(&self) -> c_int {
        self.union.as_int()
    }
    pub fn ifr_flags(&self) -> c_short {
        self.union.as_short()
    }
}
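The sockaddr returned by ifr_hwaddr still has to be turned into a readable MAC address. A minimal sketch follows, assuming an Ethernet interface whose hardware address occupies the first six bytes of sa_data; the helper format_mac is hypothetical and not part of the author's code.

```rust
use std::os::raw::c_char;

/// Hypothetical helper: format the first six bytes of `sa_data`
/// as a colon-separated, lowercase hexadecimal MAC address.
fn format_mac(sa_data: &[c_char; 14]) -> String {
    sa_data[..6]
        .iter()
        // c_char may be signed, so reinterpret each element as u8 first.
        .map(|b| format!("{:02x}", *b as u8))
        .collect::<Vec<_>>()
        .join(":")
}
```

A caller would pass the sa_data field of the value returned by ifr_hwaddr, for example format_mac(&req.ifr_hwaddr().sa_data), where req is an IfReq that ioctl has already filled in.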

As a result, we have two structures. The first, IfReq, mirrors the memory layout of ifreq in C; on it we implement a method for each kind of ioctl request. The second, IfReqUnion, manages the various types in the ifr_ifru union; on it we create a method for each type we need. This is less work than creating a specialized structure for each type in the union, and it provides a better interface than doing the type conversion in IfReq itself.

Here is a more complete working example. There is still some work to be done, but the tests pass, and the concept described above is implemented in the code.

Be careful: this approach is not ideal. In the case of ifreq, we are lucky that ifr_name is 16 bytes long and therefore ends on a word boundary. If ifr_name did not end on a four-byte word boundary, we would run into a problem. The type of our union is [u8; 24], which has an alignment of one byte, whereas the C union takes the alignment of its most strictly aligned member. Here is a short example illustrating the problem. Suppose we have a C structure containing the following union:

struct foo {
    short x;
    union {
        int z;
    } y;
};

This structure is 8 bytes in size. Two bytes for x, two more for alignment, and four bytes for y. Let's try to portray this in Rust:

#[repr(C)]
pub struct Foo {
    x: u16,
    y: [u8; 4],
}

The Foo structure is only 6 bytes in size: y has an alignment of one byte, so it is placed immediately after x with no padding, and its first two elements share a four-byte word with x. This subtle difference can cause problems when the structure is passed to a C function that expects an 8-byte structure.
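The size mismatch can be observed directly with std::mem::size_of. The sketch below also shows two possible ways to restore the C layout, both of which go beyond the original article: wrapping the byte buffer in a type with a forced four-byte alignment, and the native #[repr(C)] union available in Rust 1.19 and later. The names AlignedBuf, FooAligned, FooY, FooUnion and foo_sizes are hypothetical, and a four-byte, four-byte-aligned c_int is assumed.

```rust
use std::mem::size_of;
use std::os::raw::{c_int, c_short};

/// The byte-buffer representation from the text: no padding after x.
#[repr(C)]
pub struct Foo {
    x: u16,
    y: [u8; 4],
}

/// Possible fix 1: a byte buffer with its alignment forced to 4 bytes.
#[repr(C, align(4))]
pub struct AlignedBuf([u8; 4]);

#[repr(C)]
pub struct FooAligned {
    x: c_short,
    y: AlignedBuf,
}

/// Possible fix 2: a native union, which inherits the alignment of c_int.
#[repr(C)]
pub union FooY {
    z: c_int,
}

#[repr(C)]
pub struct FooUnion {
    x: c_short,
    y: FooY,
}

/// Sizes of the three representations, in declaration order.
pub fn foo_sizes() -> (usize, usize, usize) {
    (size_of::<Foo>(), size_of::<FooAligned>(), size_of::<FooUnion>())
}
```

With either fix the compiler inserts the two padding bytes after x, so the Rust structure matches the 8-byte C structure. Reading a field of a native union requires an unsafe block, which is the trade-off against the byte-buffer approach.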

Until Rust has built-in support for unions, such problems will be difficult to solve correctly. Good luck, but be careful!
