Reverse-engineering USB device drivers, using a radio-controlled car as an example

    Translation of the article DRIVE IT YOURSELF: USB CAR

    One of the arguments Windows fans use against Linux fans is the lack of hardware drivers for Linux. The situation keeps improving and is already much better than it was 10 years ago. Still, now and then you come across a device that your favorite distribution does not recognize. Usually it is some kind of USB peripheral.

    The beauty of free software is that you can fix this yourself (if you are a programmer). Of course, it all depends on how complex the hardware is. With a 3D webcam you may not succeed, but many USB devices are quite simple, and you do not have to dive into the depths of the kernel or dig into C. In this tutorial, we will use Python to build, step by step, a driver for a toy radio-controlled car.

    The process is essentially reverse engineering. First we study the device in detail, then we capture the data it exchanges with its Windows driver and try to work out what that data means. For non-trivial protocols, you may need both experience and luck.

    Introducing USB


    USB is a host-controlled bus. The host (the PC) decides which device sends data over the wire, and exactly when. Even an asynchronous event (a key press on a keyboard) is not sent to the host immediately: the device waits until the host asks for it. Since a single bus can carry up to 127 devices (hubs included), this scheme simplifies bus management.

    USB also has a layered protocol stack, much like the Internet. The lowest layer is usually implemented in silicon. The transport layer works through pipes. Stream pipes carry arbitrary data, while message pipes carry messages used to manage devices. Every device supports at least one message pipe. At the top, the application (or class) layer, there are protocols such as USB Mass Storage (flash drives) and Human Interface Devices (HID), used for human-computer interaction.

    On the wire


    A USB device can be thought of as a set of endpoints, or I/O buffers. Each has a data direction (in or out) and a transfer type. The transfer types are interrupt, isochronous, control and bulk.

    Interrupt transfers carry small amounts of data in near real time. If the user presses a key, the device waits until the host asks “have any keys been pressed?”. The host must not take too long to ask, and these events must not be lost. Isochronous transfers work in much the same way but are less strict: they can carry more data, at the price of occasionally losing some where that is not critical (webcams, for example).

    Bulk transfers are designed for large volumes. So that they do not clog the bus, they get whatever bandwidth is not currently taken by other traffic. Control transfers are used to manage devices, and they are the only type with a rigidly defined request and response format. A set of endpoints with its associated metadata is called an interface.

    Every USB device has endpoint zero: the device end of the default pipe, which is used for control data. But how does the host know how many other endpoints a device has and what type they are? For this it uses descriptors, which it fetches with special requests over the default pipe. Descriptors can be standard for all devices, specific to a device class, or proprietary.

    Descriptors form a hierarchy that can be viewed with utilities such as lsusb. At the top is the device descriptor, which contains the Vendor ID (VID) and Product ID (PID). This pair identifies the device model, and the system uses it to find the right driver. A device can have several configurations, each with its own set of interfaces (for example, the printer, scanner, and fax of a multifunction device). Usually, though, there is just one configuration with one interface. Both are described by their own descriptors. Each endpoint also has a descriptor that holds its address (number), direction (in or out), and transfer type.
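
As a rough illustration of what an endpoint descriptor encodes, the sketch below decodes the two relevant fields, bEndpointAddress and bmAttributes. The function name decode_endpoint is made up for this example; the bit layout follows the USB 2.0 specification, and the sample values 0x81 and 3 are the ones the toy car reports in the lsusb output later in this article.

```python
# Transfer type codes stored in bits 0..1 of bmAttributes (USB 2.0 specification).
TRANSFER_TYPES = {0: "control", 1: "isochronous", 2: "bulk", 3: "interrupt"}


def decode_endpoint(b_endpoint_address, bm_attributes):
    """Split the two endpoint descriptor fields into readable parts."""
    return {
        # Bits 0..3 hold the endpoint number.
        "number": b_endpoint_address & 0x0F,
        # Bit 7 is the direction as seen from the host: 1 = IN, 0 = OUT.
        "direction": "IN" if b_endpoint_address & 0x80 else "OUT",
        # Bits 0..1 of bmAttributes select the transfer type.
        "transfer_type": TRANSFER_TYPES[bm_attributes & 0x03],
    }


print(decode_endpoint(0x81, 3))
```

For the car this reports endpoint number 1, direction IN, transfer type interrupt, which matches what lsusb itself prints.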

    Class specifications define their own descriptor types. The USB HID specification expects data to be exchanged as “reports” that are sent and received through the control or interrupt endpoints. HID report descriptors define the format of a report (for example, “1 field 8 bits long”) and how it should be used (“offset in the X direction”). A HID device therefore describes itself and can be handled by a generic driver (usbhid on Linux). Otherwise, every mouse would need its own driver.

    I will not try to squeeze hundreds of pages of specifications into a few paragraphs. If you are interested, read "USB in a Nutshell", available free of charge at www.beyondlogic.org/usbnutshell. Let's move on to something more practical.

    Understanding Permissions


    By default, only root can access USB devices. To avoid running the test program as root, add a udev rule:

    SUBSYSTEM=="usb", ATTRS{idVendor}=="0a81", ATTRS{idProduct}=="0702", GROUP="INSERT_HERE", MODE="0660"
    


    Replace INSERT_HERE with the name of a group your user belongs to, and save the rule as /lib/udev/rules.d/99-usbcar.rules.

    Under the hood


    Let's see what the car looks like over USB. lsusb is a tool that enumerates devices and decodes their descriptors. It is part of the usbutils package.

    [val@y550p ~]$ lsusb
    Bus 002 Device 036: ID 0a81:0702 Chesen Electronics Corp.
    Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
    ...
    


    The car is Device 036 (to make sure, you can unplug it and run lsusb again). The ID field is the VID:PID pair. To read the descriptors, run lsusb -v:

    [val@y550p ~]$ lsusb -vd 0a81:0702
    Bus 002 Device 036: ID 0a81:0702 Chesen Electronics Corp.
    Device Descriptor:
    idVendor 0x0a81 Chesen Electronics Corp.
    idProduct 0x0702
    ...
    bNumConfigurations 1
    Configuration Descriptor:
    ...
    Interface Descriptor:
    ...
    bInterfaceClass 3 Human Interface Device
    ...
    iInterface 0
    HID Device Descriptor:
    ...
    Report Descriptors:
    ** UNAVAILABLE **
    Endpoint Descriptor:
    ...
    bEndpointAddress 0x81 EP 1 IN
    bmAttributes 3
    Transfer Type Interrupt
    ...
    


    This is the standard hierarchy. Like most devices, the car has only one configuration and one interface. Note the single interrupt-in endpoint (apart from the default endpoint 0, which is always present and therefore not listed). The bInterfaceClass field says that this is a HID device. That is good news, because the HID protocol is open. It would seem that all we have to do is read the report descriptor to learn the format and purpose of the reports, and the job is done. However, the descriptor is marked ** UNAVAILABLE **. Why? Since the car is a HID device, the usbhid driver has claimed it, even though it does not know what to do with it. We need to unbind the driver from the device.

    First, find the device's bus address. Replug the car, run dmesg | grep usb, and look at the last line starting with usb X-Y.Z:. X, Y, and Z are integers that uniquely identify the port on the host. Then run

    [root@y550p ~]# echo -n X-Y.Z:1.0 > /sys/bus/usb/drivers/usbhid/unbind
    


    1.0 is the configuration and interface that the usbhid driver should release. To bind everything back, write the same string to /sys/bus/usb/drivers/usbhid/bind.
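
The same unbind step can be scripted. The following is a minimal sketch, not part of the original program: the helper names interface_id and unbind_usbhid are invented here, and the bus and port numbers in the example call are placeholders that you would replace with the ones dmesg shows for your machine.

```python
def interface_id(bus, ports, configuration=1, interface=0):
    """Build the sysfs name of a USB interface, such as '2-1.3:1.0'."""
    port_path = ".".join(str(p) for p in ports)
    return "{}-{}:{}.{}".format(bus, port_path, configuration, interface)


def unbind_usbhid(iface, driver_dir="/sys/bus/usb/drivers/usbhid"):
    """Ask usbhid to release the interface. Needs root, like the echo command."""
    with open(driver_dir + "/unbind", "w") as f:
        f.write(iface)


# Placeholder address: bus 2, root hub port 1, then port 3 of an external hub.
print(interface_id(2, [1, 3]))  # 2-1.3:1.0
```

Calling unbind_usbhid(interface_id(2, [1, 3])) as root would then have the same effect as the shell command; writing the same string to the bind file reverses it.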

    Now the report descriptor is available:

    Report Descriptor: (length is 52)
    Item(Global): Usage Page, data= [ 0xa0 0xff ] 65440
    (null)
    Item(Local ): Usage, data= [ 0x01 ] 1
    (null)
    ...
    Item(Global): Report Size, data= [ 0x08 ] 8
    Item(Global): Report Count, data= [ 0x01 ] 1
    Item(Main ): Input, data= [ 0x02 ] 2
    ...
    Item(Global): Report Size, data= [ 0x08 ] 8
    Item(Global): Report Count, data= [ 0x01 ] 1
    Item(Main ): Output, data= [ 0x02 ] 2
    ...
    


    Two reports are defined. One is read from the device (input), the other is written to it (output). Both are one byte long. However, their purpose is not obvious. For comparison, here is the report descriptor of a mouse (not all of it, just the main lines):

    Report Descriptor: (length is 75)
    Item(Global): Usage Page, data= [ 0x01 ] 1
    Generic Desktop Controls
    Item(Local ): Usage, data= [ 0x02 ] 2
    Mouse
    Item(Local ): Usage, data= [ 0x01 ] 1
    Pointer
    Item(Global): Usage Page, data= [ 0x09 ] 9
    Buttons
    Item(Local ): Usage Minimum, data= [ 0x01 ] 1
    Button 1 (Primary)
    Item(Local ): Usage Maximum, data= [ 0x05 ] 5
    Button 5
    Item(Global): Report Count, data= [ 0x05 ] 5
    Item(Global): Report Size, data= [ 0x01 ] 1
    Item(Main ): Input, data= [ 0x02 ] 2
    


    Everything is clear here. With the car it is not, and we have to work out the meaning of the bits ourselves.
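
To see where lines such as "Item(Global): Report Size" come from, here is a small sketch of a parser for HID short items, the encoding used inside a report descriptor. It is an illustration only: the tag table covers just the items that appear in this article, long items are not handled, and the byte sequence is a hypothetical fragment rebuilt from the items lsusb printed for the car, not the device's full 52-byte descriptor.

```python
ITEM_TYPES = {0: "Main", 1: "Global", 2: "Local"}
# Only the tags that appear in this article; the HID specification defines more.
TAGS = {
    ("Main", 0x8): "Input",
    ("Main", 0x9): "Output",
    ("Global", 0x0): "Usage Page",
    ("Global", 0x7): "Report Size",
    ("Global", 0x9): "Report Count",
    ("Local", 0x0): "Usage",
    ("Local", 0x1): "Usage Minimum",
    ("Local", 0x2): "Usage Maximum",
}


def parse_short_items(raw):
    """Yield (item type, tag name, value) for each short item in the data."""
    data = bytes(raw)
    pos = 0
    while pos < len(data):
        prefix = data[pos]
        size = (0, 1, 2, 4)[prefix & 0x03]  # bits 0..1: payload size code
        kind = ITEM_TYPES.get((prefix >> 2) & 0x03, "Reserved")  # bits 2..3
        tag = prefix >> 4  # bits 4..7
        payload = data[pos + 1:pos + 1 + size]
        value = int.from_bytes(payload, "little")  # payloads are little-endian
        yield kind, TAGS.get((kind, tag), "tag 0x%x" % tag), value
        pos += 1 + size


# Hypothetical fragment: Usage Page, Usage, Report Size, Report Count, Input.
fragment = [0x06, 0xA0, 0xFF, 0x09, 0x01, 0x75, 0x08, 0x95, 0x01, 0x81, 0x02]
for item in parse_short_items(fragment):
    print(item)
```

The first item decodes to a Global Usage Page with value 65440 (0xffa0), the same number lsusb shows. Usage pages from 0xff00 upwards are vendor-defined, which is why lsusb prints (null) instead of a name.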

    Small bonus


    Most radio-controlled toys are simple and use standard receivers operating at the same frequencies. So our program can be used to control other toys besides this car.

    Detective work


    Network traffic is analyzed with a sniffer, and the same kind of tool is useful here. There are dedicated commercial USB monitors, but Wireshark is good enough for our task.

    Let's set up USB capture in Wireshark. First, enable USB monitoring in the kernel by loading the usbmon module:

    [root@y550p ~]# modprobe usbmon
    


    Mount the special debugfs file system:

    [root@y550p ~]# mount -t debugfs none /sys/kernel/debug
    


    The /sys/kernel/debug/usb/usbmon directory will appear; it can be used to record traffic with ordinary shell tools:

    [root@y550p ~]# ls /sys/kernel/debug/usb/usbmon
    0s 0u 1s 1t 1u 2s 2t 2u
    


    The files have cryptic names. The integer is the bus number (the first part of the USB bus address); 0 means all buses on the host. s stands for statistics, t for transfers, and u for URBs (USB Request Blocks, logical entities that represent ongoing transactions). To dump all transfers on bus 2, enter:

    [root@y550p ~]# cat /sys/kernel/debug/usb/usbmon/2t
    ffff88007d57cb40 296194404 S Ii:036:01 -115 1 <
    ffff88007d57cb40 296195649 C Ii:036:01 0 1 = 05
    ffff8800446d4840 298081925 S Co:036:00 s 21 09 0200 0000 0001 1 = 01
    ffff8800446d4840 298082240 C Co:036:00 0 1 >
    ffff880114fd1780 298214432 S Co:036:00 s 21 09 0200 0000 0001 1 = 00
    


    To the untrained eye, none of this makes sense. Fortunately, Wireshark can decode the data.
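
The text format is less cryptic than it looks. Below is a minimal sketch of a parser for lines shaped like the ones above; the function name parse_usbmon_line is made up for this example, and it only handles the fields visible in this capture (the newer u files add a bus number to the address word, which this sketch does not expect).

```python
def parse_usbmon_line(line):
    """Parse one line of the usbmon text output shown above."""
    fields = line.split()
    urb_tag, timestamp, event, address = fields[:4]
    pipe, device, endpoint = address.split(":")
    record = {
        "urb": urb_tag,
        "timestamp_us": int(timestamp),
        "event": {"S": "submit", "C": "complete", "E": "error"}.get(event, event),
        "transfer": {"C": "control", "I": "interrupt",
                     "B": "bulk", "Z": "isochronous"}[pipe[0]],
        "direction": "IN" if pipe[1] == "i" else "OUT",
        "device": int(device),
        "endpoint": int(endpoint),
        "setup": None,
        "status": None,
    }
    rest = fields[4:]
    if rest[0] == "s":
        # Setup packet: bmRequestType, bRequest, wValue, wIndex, wLength (hex).
        names = ("bmRequestType", "bRequest", "wValue", "wIndex", "wLength")
        record["setup"] = {n: int(v, 16) for n, v in zip(names, rest[1:6])}
        rest = rest[6:]
    else:
        # Otherwise the field is the URB status, an errno-style number.
        record["status"] = int(rest[0].split(":")[0])
        rest = rest[1:]
    record["length"] = int(rest[0])
    # Data words follow only after an '=' marker.
    has_data = len(rest) > 1 and rest[1] == "="
    record["data"] = bytes.fromhex("".join(rest[2:])) if has_data else b""
    return record


line = "ffff8800446d4840 298081925 S Co:036:00 s 21 09 0200 0000 0001 1 = 01"
print(parse_usbmon_line(line)["setup"])
```

Applied to the third line of the dump, this yields a control OUT submission to device 36 with bmRequestType 0x21, bRequest 0x09 and a single data byte 0x01, which is the request examined in the Wireshark capture below.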

    Now we need a Windows installation that runs the original driver. The best option is to install everything in VirtualBox (with the Oracle Extension Pack, since we need USB support). Make sure VirtualBox can use the device, and start KeUsbCar, the program that controls the car on Windows. Launch Wireshark to see which commands the driver sends to the device. On the first screen, select the usbmonX interface, where X is the bus the car is connected to. If you run Wireshark as a non-root user, make sure the /dev/usbmon* nodes have suitable permissions.

    [Screenshot: Wireshark capture with the SET_REPORT control packet highlighted]

    Click the Forward button in KeUsbCar. Wireshark will capture several outgoing control packets. The screenshot shows the one we need. Judging by its parameters, it is a SET_REPORT request (bmRequestType = 0x21, bRequest = 0x09), which is normally used to change the state of a device, such as the LEDs on a keyboard. In line with the report descriptor we saw earlier, the data is 1 byte long, and the report itself contains 0x01 (also highlighted).

    Pressing the Right button produces a similar request, but this time the report contains 0x02. One can guess that the value encodes the direction of movement. In the same way, we find that 0x04 is reverse right, 0x08 is reverse, and so on. The rule is simple: the direction code is a binary 1 shifted left by the position of the button in the KeUsbCar interface, counting clockwise.
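
The shift rule can be written down in a few lines. This is only a sketch: the list BUTTONS and the function direction_code are names invented here, and the last two entries follow the constants used in the driver code later in this article.

```python
# Button order in KeUsbCar, clockwise, starting from Forward.
BUTTONS = ["forward", "right", "reverse right", "reverse", "reverse left", "left"]


def direction_code(button):
    """Return the one-byte report for a button: 1 shifted left by its position."""
    return 1 << BUTTONS.index(button)


for name in BUTTONS:
    print("%-13s 0x%02x" % (name, direction_code(name)))
```

Each button therefore owns exactly one bit of the report byte, and a report of 0 (no bits set) stops the car.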

    You may also notice periodic interrupt requests to Endpoint 1 (0x81: 0x80 means it is an IN endpoint, 0x01 is its address). What are they? Besides the buttons, KeUsbCar has a charge indicator, so this may be battery information. The value stays the same (0x05) as long as the car stays in the garage. Once the car leaves, the interrupt requests stop, and they resume when we put it back. So 0x05 apparently means “charging” (the toy is simple, so no charge level is reported). Once the battery is full, the interrupt starts returning 0x85 (0x05 with bit 7 set). Apparently, bit 7 means “charged”. What bits 0 and 2, which make up 0x05, stand for is not yet clear.
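
The guesswork about the status byte can be summed up in a small sketch. The function name decode_status is invented for this example, and the interpretation of bit 7 is the article's hypothesis from the capture, not a documented fact.

```python
def decode_status(report):
    """Interpret the one-byte input report, as far as the capture explains it."""
    return {
        # Bit 7 appears to be set once the battery is fully charged.
        "charged": bool(report & 0x80),
        # The remaining bits were always 0x05; bits 0 and 2 are unexplained.
        "low_bits": report & 0x7F,
    }


print(decode_status(0x05))  # {'charged': False, 'low_bits': 5}
print(decode_status(0x85))  # {'charged': True, 'low_bits': 5}
```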

    Writing an almost real driver


    Making a program work with a previously unsupported device is nice, but sometimes the rest of the system needs to work with it too. That calls for a real driver, which means kernel-level programming (http://www.linuxvoice.com/be-a-kernel-hacker/), and you probably do not want that right now. With USB, though, we may be able to do without it.

    If you have a USB network card, you can use TUN/TAP to connect your PyUSB program to the Linux network stack. TUN/TAP interfaces look like regular network interfaces, with names like tun0 or tap1, but all packets passing through them are made available via the /dev/net/tun node. The pytun module makes working with TUN/TAP simple. Performance suffers, but you can always rewrite the program in C using libusb.

    Another candidate is a USB display. Linux has the vfb module, which provides a virtual framebuffer accessible as /dev/fbX. You can use ioctls to redirect the console to it and copy the contents of /dev/fbX to the USB device. This is not fast either, but you are not going to play 3D shooters over USB anyway.

    Writing the code


    Let's build the same program as the Windows one: six arrows and a charge indicator that blinks while the car is charging. The code is on GitHub: github.com/vsinitsyn/usbcar.py

    How do we work with USB on Linux? It can be done from user space with the libusb library. It is written in C and requires a good knowledge of USB. PyUSB is a simpler alternative. For the user interface, I used PyGame.

    Download the PyUSB sources from github.com/walac/pyusb and install them with setup.py. You will also need the libusb library. We put the code that controls the car into a class with the original name USBCar.

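    import usb.core
    import usb.util

    class USBCar(object):
      # Vendor and product IDs, as reported by lsusb
      VID = 0x0a81
      PID = 0x0702
      # One-byte direction codes recovered from the captured traffic
      FORWARD = 1
      RIGHT = 2
      REVERSE_RIGHT = 4
      REVERSE = 8
      REVERSE_LEFT = 16
      LEFT = 32
      STOP = 0
      # Timeout for battery_status() reads, in milliseconds. The value is an
      # assumption: the article uses READ_TIMEOUT but does not show its value.
      READ_TIMEOUT = 250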
    


    We import the two main PyUSB modules and define the values for controlling the car, which we worked out from the captured traffic. VID and PID are the car's IDs, taken from the lsusb output.

    def __init__(self):
      self._had_driver = False
      self._dev = usb.core.find(idVendor=USBCar.VID, idProduct=USBCar.PID)
      if self._dev is None:
        raise ValueError("Device not found")
    


    The usb.core.find() function looks up a device by its ID. See github.com/walac/pyusb/blob/master/docs/tutorial.rst for details.

      if self._dev.is_kernel_driver_active(0):
        self._dev.detach_kernel_driver(0)
        self._had_driver = True
    


    We detach the kernel driver, as we did before running lsusb. 0 is the interface number. When the program exits, the driver must be attached again in release(), if it was active. That is why we remember the initial state in self._had_driver.

      self._dev.set_configuration()
    


    Set the configuration. This call is equivalent to the following code, which PyUSB hides from the programmer:

      self._dev.set_configuration(1)
      usb.util.claim_interface(self._dev, 0)

    The release() method undoes all of this:

    def release(self):
      usb.util.release_interface(self._dev, 0)
      if self._had_driver:
        self._dev.attach_kernel_driver(0)
    


    This method must be called before the program terminates. It releases the claimed interface and reattaches the kernel driver.
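
One way to make sure release() always runs, even if the program crashes halfway, is a context manager. The sketch below is an optional addition, not part of the original program: released_on_exit is a made-up helper, and FakeCar is a stand-in for USBCar so that the example runs without the hardware.

```python
from contextlib import contextmanager


@contextmanager
def released_on_exit(car):
    """Yield the car and call car.release() afterwards, even after an error."""
    try:
        yield car
    finally:
        car.release()


class FakeCar(object):
    """Stand-in for USBCar so the sketch runs without the hardware."""

    def __init__(self):
        self.released = False

    def release(self):
        self.released = True


fake = FakeCar()
with released_on_exit(fake) as car:
    pass  # drive the car here
print(fake.released)  # True
```

With the real class, the same pattern would read "with released_on_exit(USBCar()) as car:", followed by calls such as car.move(USBCar.FORWARD).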

    Moving the car:

    def move(self, direction):
      ret = self._dev.ctrl_transfer(0x21, 0x09, 0x0200, 0, [direction])
      return ret == 1
    


    direction is one of the values defined at the top of the class. ctrl_transfer() sends control requests. The data can be passed as a string or as a list. The method returns the number of bytes written. Since we send exactly one byte, we return True if one byte was written and False otherwise.
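
The numeric arguments of ctrl_transfer() are the fields of the captured SET_REPORT request. As a rough guide to what they mean, the sketch below decodes the first and third of them; the function names are invented for this example, and the bit layouts come from the USB 2.0 and HID specifications.

```python
def describe_request_type(bm_request_type):
    """Split bmRequestType into its three bit fields (USB 2.0 specification)."""
    direction = "device-to-host" if bm_request_type & 0x80 else "host-to-device"
    kind = ("standard", "class", "vendor", "reserved")[(bm_request_type >> 5) & 0x03]
    recipients = {0: "device", 1: "interface", 2: "endpoint", 3: "other"}
    recipient = recipients.get(bm_request_type & 0x1F, "reserved")
    return direction, kind, recipient


def describe_set_report_value(w_value):
    """For HID SET_REPORT, wValue holds the report type (high byte) and ID (low byte)."""
    report_types = {1: "input", 2: "output", 3: "feature"}
    return report_types.get(w_value >> 8, "reserved"), w_value & 0xFF


print(describe_request_type(0x21))        # ('host-to-device', 'class', 'interface')
print(describe_set_report_value(0x0200))  # ('output', 0)
```

So 0x21 is a class-specific request sent from the host to an interface, 0x09 is SET_REPORT, 0x0200 selects an output report with ID 0, and the fourth argument (0) is the number of the interface being addressed.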

    The method that reads the battery status:

    def battery_status(self):
      try:
        ret = self._dev.read(0x81, 1, timeout=self.READ_TIMEOUT)
        if ret:
          res = ret.tolist()
          if res[0] == 0x05:
            return 'charging'
          elif res[0] == 0x85:
            return 'charged'
        return 'unknown'
      except usb.core.USBError:
        return 'out of the garage'
    


    The read() method takes the endpoint address and the number of bytes to read. The transfer type is determined by the endpoint and is stored in its descriptor. We also set a non-default timeout so that the program stays responsive. Device.read() returns an array, which we convert to a list. We check its first byte to determine the charging state. If the car is not in the garage, the read() call fails and raises usb.core.USBError. We assume that this is the only reason for the error. In all other cases, we return 'unknown'.

    The UI class encapsulates the user interface. Let's go over the highlights. The main loop lives in UI.main_loop(). We draw the background picture, show the charge indicator if the car is in the garage, and draw the control buttons. Then we wait for an event; if it is a click, we move the car in the chosen direction with USBCar.move().

    The entire program, including the GUI, takes up a little over 200 lines. Not bad for a device without documentation.

    Of course, we deliberately picked a fairly simple device. But there are plenty of similar devices out there, and many use protocols not very different from the one we have just worked out. Reverse engineering a complex device is no easy task, but you can now add Linux support for some small gadget, such as an e-mail notifier. Even if that is not very useful, it is at least fun.
