Video card driver: so whose bug is it?

The recent article "A video card driver bug may reveal incognito browsing" prompted this write-up. That article appeared after someone published a trivial way to display an image belonging to any process (including a terminated one), possibly even a process that claims to protect information.

Since I also happen to be involved in developing graphics drivers, I'll try to briefly explain where the author of the original bug report is wrong, whose responsibility the problem is, and how it can be solved.



Regardless of the operating system, its system APIs, and the application interfaces for developing graphics applications, any video card driver solves the following system-wide tasks:

  • Initialization of the display controller (setting the video mode, managing GPU ports, forming one or several independent images, ...);
  • Addressable memory management (command queues, linear / tile addressing, surface allocation, address translation tables, PCI aperture extension, ...);
  • 2D acceleration (cursor, hardware layers, alpha / chroma keys, ROP, primitives, ...);
  • 3D acceleration (OpenGL, OpenGL ES / EGL, OpenVG / EGL, OpenCL, Open*);
  • Video decoding / audio playback / EDID readout / frame buffer compression, ...

The approaches used to solve these tasks at each stage have long since settled into established practice. This is exactly why the problem reproduces on devices from different manufacturers. Looking ahead, I can say that a similar effect can be obtained on Intel controllers. The author of the bug report identified precisely which task the effect arises in: addressable memory management.



Memory management


The main entity the driver operates on at this stage is the surface. A surface is generally a contiguous region of video memory or RAM that some application uses to form an image. For controllers that do not have their own memory, a resource allocated from RAM can become addressable through an address translation table (Graphics Translation Table, GTT). Otherwise, the image can be displayed only by copying the surface into video memory, either with a DMA controller, if there is one, or using the CPU.

In fact, even controllers with their own discrete memory in most cases also address it via a GTT, since this makes it possible to create a virtual address space, by analogy with the CPU's TLB, and to provide linear or tile addressing. The driver chooses the addressing method in each case, and the difference between the two does not matter for this article.
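To make the translation step concrete, here is a minimal sketch of a GTT as a flat page table. The 4096-byte page size, the class name `GTT`, and the physical addresses are assumptions chosen for illustration; real controllers define their own entry formats and page sizes.

```python
PAGE_SIZE = 4096  # assumed page size; real hardware may differ


class GTT:
    """Toy translation table: GPU virtual page -> physical page address."""

    def __init__(self, num_entries):
        self.entries = [None] * num_entries  # None means "not mapped"

    def map_page(self, gpu_page, phys_addr):
        if phys_addr % PAGE_SIZE:
            raise ValueError("physical address must be page-aligned")
        self.entries[gpu_page] = phys_addr

    def translate(self, gpu_addr):
        page, offset = divmod(gpu_addr, PAGE_SIZE)
        phys = self.entries[page]
        if phys is None:
            raise LookupError(f"GPU page {page} is not mapped")
        return phys + offset


gtt = GTT(num_entries=4)
# Two scattered physical pages look contiguous from the GPU side.
gtt.map_page(0, 0x7F000)
gtt.map_page(1, 0x12000)
```

With this mapping, GPU addresses in the first two pages resolve to two unrelated physical pages, which is what lets a surface in ordinary RAM look contiguous to the controller.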

[Image: example of tile addressing in video memory, from the source article]
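As a rough illustration of the difference between the two addressing modes, the sketch below computes the byte offset of a pixel in a linear surface and in a tiled one. The tile layout (tiles stored row by row, pixels stored row by row inside each tile) is an assumption; actual tile formats are hardware-specific.

```python
def linear_offset(x, y, pitch, bpp):
    """Byte offset of pixel (x, y) in a linear surface; pitch is bytes per row."""
    return y * pitch + x * bpp


def tiled_offset(x, y, width, bpp, tile_w, tile_h):
    """Byte offset of pixel (x, y) in a hypothetical tiled surface.

    Assumes width is a multiple of tile_w, tiles are stored row-major,
    and pixels inside a tile are stored row-major.
    """
    tiles_per_row = width // tile_w
    tile_index = (y // tile_h) * tiles_per_row + (x // tile_w)
    in_tile = (y % tile_h) * tile_w + (x % tile_w)
    return (tile_index * tile_w * tile_h + in_tile) * bpp
```

If a tiled surface is read back as if it were linear, the picture comes out shuffled in tile-sized blocks, which is the kind of fragmentation the original report describes.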

The graphics controller driver is the OS's interface to the functionality of the GPU, nothing more. All information-protection tasks belong to whichever higher level is responsible for them. The drivers provide all the functionality needed for this; someone just has to want to use it.

So, at the request of some client of the driver, the OS reserves (allocates) a set of surfaces for it. Since, according to the author, fragmentation of the image is relatively rare, we can argue that tile addressing is not often used in these cases. With linear addressing, each surface is characterized primarily by its offset from the start of the virtual address space of the controller memory. When allocating memory, the driver returns exactly this offset to the OS; it corresponds to a free memory block that can hold a surface with the characteristics requested by the application software. In this case, the driver performs only the following actions: it modifies the GTT for further use of the virtual memory pages, checks that the alignment requirements for physical addresses are met, ...

Based on the foregoing, we can conclude that, given the total amount of available controller memory and the required surface characteristics, the offset can be determined in the same way for all controllers. In practice this is indeed the case (judging by experience with various Unix-like operating systems): the OS provides a system service or library that keeps a list of the memory blocks already in use and can quickly compute the first available offset for a logical reservation within that library. At the same time, knowing from the driver how the memory block is accessed, the OS generally allows application software to create shared or overlapping surfaces at the same physical addresses.
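The bookkeeping described above can be sketched as a first-fit allocator over a list of used blocks. The class name `SurfaceAllocator` and the 4096-byte alignment are assumptions for illustration, not the interface of any particular OS service.

```python
class SurfaceAllocator:
    """Toy first-fit allocator that hands out offsets in controller memory."""

    def __init__(self, total_size, alignment=4096):
        self.total_size = total_size
        self.alignment = alignment  # assumed alignment requirement
        self.used = []  # (offset, size) pairs, kept sorted by offset

    def _align(self, value):
        a = self.alignment
        return (value + a - 1) // a * a

    def alloc(self, size):
        cursor = 0
        for index, (offset, used_size) in enumerate(self.used):
            if offset - cursor >= size:  # the gap before this block fits
                self.used.insert(index, (cursor, size))
                return cursor
            cursor = self._align(offset + used_size)
        if self.total_size - cursor >= size:  # tail of the memory fits
            self.used.append((cursor, size))
            return cursor
        raise MemoryError("no free block large enough")

    def free(self, offset):
        # Only the bookkeeping changes; the memory contents are untouched.
        self.used = [(o, s) for (o, s) in self.used if o != offset]
```

Note that `free` only edits the list. Nothing in this scheme touches the bytes of the released block, so the next `alloc` that lands on the same offset sees whatever was left there.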

If the driver does not explicitly control surfaces, then how is acceleration implemented?
It is all quite simple. When an application needs, for example, to move a fragment of an image from one surface to another in hardware (hardware blitting), it passes the surface descriptors and the fragment coordinates to the OS.

The driver receives only the offsets (in bytes) in video memory, the coordinates of the fragments relative to these offsets (in pixels), the color depth, and some information needed to perform pixel operations.
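The following sketch shows how little the driver needs to know to perform such a copy. It is a software stand-in for a hardware blit over a `bytearray` that plays the role of video memory; the function name `blit` and the explicit pitch parameters are assumptions added for the example.

```python
def blit(vram, src_off, src_pitch, dst_off, dst_pitch, src_xy, dst_xy, size, bpp):
    """Copy a w x h pixel rectangle between two surfaces in the same memory.

    Offsets and pitches are in bytes, coordinates and size are in pixels,
    and bpp is bytes per pixel. No surface ownership is checked here.
    """
    sx, sy = src_xy
    dx, dy = dst_xy
    w, h = size
    row_bytes = w * bpp
    for row in range(h):
        s = src_off + (sy + row) * src_pitch + sx * bpp
        d = dst_off + (dy + row) * dst_pitch + dx * bpp
        vram[d:d + row_bytes] = vram[s:s + row_bytes]
```

Nothing in the arguments says which process owns the source or destination; the operation is defined entirely by offsets and geometry.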




Returning to the original problem


Surely by this point many have already guessed what the criticism comes down to. When running various applications, the OS requests surfaces from the video driver for their needs and reuses them as memory is freed (on process termination). The driver cannot know that some memory block needs to be zeroed immediately because it carries certain security requirements and is no longer referenced. Zeroing memory is itself a trivial task: a hardware rectangle fill.
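The effect can be reproduced in a few lines with a `bytearray` standing in for video memory. The surface size, the offsets, and the helper `fill_rect` (a software stand-in for the hardware fill) are assumptions for the example.

```python
vram = bytearray(1024)  # stand-in for video memory


def fill_rect(vram, offset, pitch, x, y, w, h, bpp, value=0):
    """Software stand-in for a hardware rectangle fill."""
    for row in range(h):
        start = offset + (y + row) * pitch + x * bpp
        vram[start:start + w * bpp] = bytes([value]) * (w * bpp)


# Process A draws into a 16x16 surface (1 byte per pixel) at offset 0, then exits.
vram[0:256] = b"S" * 256  # "secret" pixels

# Process B is given the same offset and reads it before drawing anything.
leaked = bytes(vram[0:256])

# Had the surface been cleared when it was released, nothing would remain.
fill_rect(vram, 0, 16, 0, 0, 16, 16, 1)
after_clear = bytes(vram[0:256])
```

Here `leaked` still holds process A's pixels, while `after_clear` is all zeros. The fill is cheap; the open question is which layer knows that it has to be issued.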

In fact, the test written for the bug report is redundant. In the general case (when video memory fits into the PCI / GPU aperture), no application APIs are needed on Unix-like systems. It is enough to open /dev/mem at the offset known from the output of the "lspci" utility.
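A minimal sketch of that approach, assuming a Unix-like system where /dev/mem is readable by the caller: the function maps a window of a memory-like file and returns the bytes at a given physical address. The name `read_physical` and the address in the usage comment are hypothetical; the real base address comes from the memory region reported by lspci for the device, and some kernels restrict /dev/mem access.

```python
import mmap
import os


def read_physical(path, phys_addr, length):
    """Read `length` bytes at `phys_addr` from a memory-like file such as /dev/mem."""
    granularity = mmap.ALLOCATIONGRANULARITY
    base = phys_addr - phys_addr % granularity  # mmap offsets must be aligned
    delta = phys_addr - base
    fd = os.open(path, os.O_RDONLY)
    try:
        with mmap.mmap(fd, delta + length, access=mmap.ACCESS_READ, offset=base) as m:
            return m[delta:delta + length]
    finally:
        os.close(fd)


# Hypothetical usage (requires root; 0xE0000000 is a made-up aperture address):
# first_row = read_physical("/dev/mem", 0xE0000000, 1920 * 4)
```

The same function works on an ordinary file, which makes it easy to try out without touching real hardware.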

For Intel controllers the situation is different, but not by much. Since the controller has no memory of its own, the GTT is built on the fly from memory allocated in RAM. When a surface is allocated again, one may simply be unlucky with the actual location of the RAM block, given that in this case the OS virtual addressing mechanism plays the decisive role.

There are several possible solutions, and I believe the conclusions are obvious:
  • All driver vendors must implement redundant functionality for tracking existing surfaces (the question of controlling shared surfaces remains open);
  • The OS must keep track of whether a given application needs cleaning up after (this means either some kind of marking of protected surfaces, or excessive zeroing of all surfaces both in RAM and in video memory);
  • Application software should properly clean up after itself, since it claims to be involved in information security.
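The second option, marking protected surfaces, could look roughly like the sketch below. The class `SurfacePool` and its `protected` flag are hypothetical; the slice assignment stands in for a hardware fill issued when a marked surface is released.

```python
class SurfacePool:
    """Toy OS-side bookkeeping: offset -> (size, protected flag)."""

    def __init__(self, vram):
        self.vram = vram
        self.surfaces = {}

    def register(self, offset, size, protected=False):
        self.surfaces[offset] = (size, protected)

    def release(self, offset):
        size, protected = self.surfaces.pop(offset)
        if protected:
            # Stand-in for a hardware fill: zero the block before it is reused.
            self.vram[offset:offset + size] = bytes(size)
```

Only surfaces registered as protected pay the cost of zeroing; everything else is released as before. Zeroing every surface unconditionally would amount to dropping the flag and always taking the `if` branch.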

I hope this note was interesting.
