Apple Metal in MAPS.ME

    Hello!

    There is a huge number of OpenGL applications in the world, and it seems Apple is not entirely happy about that. Starting with iOS 12 and macOS Mojave, OpenGL is deprecated. We integrated Apple Metal into MAPS.ME and are ready to share our experience and results. We will describe how we refactored our graphics engine, what difficulties we faced and, most importantly, how many FPS we get now.

    If you are interested, or are thinking about adding Apple Metal support to your own graphics engine, read on.

    The problem


    Our graphics engine was designed to be cross-platform, and since OpenGL is in fact the only cross-platform graphics API for the set of platforms we care about (iOS, Android, macOS and Linux), we chose it as the basis. We did not add an extra abstraction layer to hide OpenGL-specific features but, fortunately, we left room for one.

    When the new generation of graphics APIs, Apple Metal and Vulkan, appeared, we of course considered adopting them in our application. However, the following stopped us:

    1. Vulkan could only work on Android and Linux, and Apple Metal only on iOS and macOS. We did not want to lose cross-platform support at the graphics API level: it would complicate development and debugging and increase the workload.
    2. An Apple Metal application cannot be built and run on the iOS simulator (this is still true, by the way), which would also complicate development and keep us from getting rid of OpenGL completely.
    3. The Qt framework, which we use for internal tools, supported only OpenGL (it now supports Vulkan as well).
    4. Apple Metal did not have, and still does not have, a C++ API. That would force us to invent abstractions not only at run time but also at build time, since one part of the engine would be compiled as Objective-C++ and the other, substantially larger, part as C++.
    5. We were not ready to build a separate engine or maintain a separate code branch specifically for iOS.
    6. The implementation was estimated at no less than half a year of work for one graphics developer.

    When Apple announced in the spring of 2018 that OpenGL was being deprecated, it became clear that we could not postpone any longer and that the problems described above had to be solved one way or another. Besides, we had long been working on optimizing both the speed and the power consumption of the application, and Apple Metal seemed likely to help with that.

    Decision making


    Almost immediately we turned our attention to MoltenVK. This framework emulates the Vulkan API on top of Apple Metal, and its source code had recently been opened. With MoltenVK, it seemed, we could replace OpenGL with Vulkan and not deal with a separate Apple Metal integration at all. In addition, the Qt developers had decided against separate Apple Metal rendering support in favor of MoltenVK. However, we were stopped by:

    • the need to support Android devices on which Vulkan is unavailable;
    • the inability to run on the iOS simulator without an OpenGL fallback;
    • the inability to use Apple's tools for debugging, profiling and precompiling shaders, since MoltenVK generates Apple Metal shaders at run time from SPIR-V or GLSL sources;
    • the need to wait for MoltenVK updates and bug fixes whenever a new version of Metal is released;
    • the impossibility of fine-grained optimizations that are specific to Metal but have no counterpart in Vulkan.

    It turned out that we had to keep OpenGL, and so we could not avoid abstracting the engine from the graphics API. Apple Metal, OpenGL ES and, in the future, Vulkan would be used to build independent, fully interchangeable internal components of the graphics engine. OpenGL would serve as the fallback where Metal or Vulkan is unavailable for one reason or another.
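    The idea of interchangeable back ends with an OpenGL fallback can be sketched roughly as follows. This is only an illustration, not code from the MAPS.ME sources: the names GraphicsContext, DeviceCaps, CreateContext and DescribeBackend are all hypothetical, and real contexts would of course wrap actual API objects.

```cpp
#include <cassert>
#include <memory>
#include <string>

// All names in this sketch are hypothetical and only illustrate the idea.
enum class Api { OpenGLES, Metal, Vulkan };

// The interface the rest of the engine talks to instead of a concrete API.
class GraphicsContext {
public:
  virtual ~GraphicsContext() = default;
  virtual Api GetApi() const = 0;
};

class OpenGLContext : public GraphicsContext {
public:
  Api GetApi() const override { return Api::OpenGLES; }
};

class MetalContext : public GraphicsContext {
public:
  Api GetApi() const override { return Api::Metal; }
};

class VulkanContext : public GraphicsContext {
public:
  Api GetApi() const override { return Api::Vulkan; }
};

// What the platform layer reports about the current device (assumed shape).
struct DeviceCaps {
  bool hasMetal = false;
  bool hasVulkan = false;
};

// Picks a modern API when the device offers one, otherwise falls back to OpenGL ES.
std::unique_ptr<GraphicsContext> CreateContext(DeviceCaps const & caps) {
  if (caps.hasMetal)
    return std::make_unique<MetalContext>();
  if (caps.hasVulkan)
    return std::make_unique<VulkanContext>();
  return std::make_unique<OpenGLContext>();
}

// Engine code receives the context explicitly and never names a concrete class.
std::string DescribeBackend(GraphicsContext const * context) {
  assert(context != nullptr);
  switch (context->GetApi()) {
    case Api::OpenGLES: return "OpenGL ES";
    case Api::Metal: return "Metal";
    case Api::Vulkan: return "Vulkan";
  }
  return "unknown";
}
```

    With this shape, a device without Metal or Vulkan (the iOS simulator, for example) gets an OpenGLContext from CreateContext, and the calling code does not change.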

    The implementation plan was:

    1. Refactor the graphics engine to abstract away the graphics API in use.
    2. Implement rendering on Apple Metal for the iOS version of the application.
    3. Benchmark rendering speed and power consumption to see whether modern, lower-level graphics APIs can benefit the product.

    Key differences between OpenGL and Metal


    To understand how to abstract the graphics API, let's first identify the key conceptual differences between OpenGL and Metal.

    1. Metal is considered, and rightly so, a lower-level API. This does not mean that you have to write assembly or implement rasterization yourself. Metal is low-level in the sense that it does very little implicitly: almost everything has to be spelled out by the programmer. OpenGL does a lot implicitly, starting with maintaining an implicit reference to the OpenGL context and binding that context to the thread in which it was created.
    2. Metal does not validate commands at run time. In debug mode validation does exist, of course, and it is done considerably better than in many other APIs, largely thanks to tight integration with Xcode. But once the program ships to users there is no validation any more: the program simply crashes on the first error. Needless to say, OpenGL crashes only in the most extreme cases; the most common behavior is to ignore the error and carry on.
    3. Metal can precompile shaders and build libraries from them. In OpenGL, shaders are compiled from source while the program is running, and the low-level OpenGL implementation on each particular device is responsible for that. Differences and/or bugs in shader compiler implementations sometimes lead to fantastic bugs, especially on Android devices from Chinese brands.
    4. OpenGL makes heavy use of a state machine, which adds side effects to almost every function. OpenGL functions are therefore not pure, and the order and history of calls often matter. Metal does not use state implicitly and does not keep it longer than rendering requires. States exist as objects that are created and validated in advance.
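    The last difference is easy to see in a toy model. The sketch below uses no real OpenGL or Metal calls; the namespaces, PipelineState and both Draw functions are made up for illustration. It contrasts a hidden global state, where a draw call depends on what was called before it, with an explicit state object that is validated once at creation.

```cpp
#include <cassert>
#include <string>

// Toy model only: none of these names are real OpenGL or Metal API.

// OpenGL style: one hidden, mutable state shared by all calls.
namespace implicit_style {
struct GlobalState { bool blending = false; };

inline GlobalState & State() {
  static GlobalState state;
  return state;
}

inline void EnableBlending(bool enable) { State().blending = enable; }

// The result depends on whatever calls happened before this one.
inline std::string Draw(std::string const & mesh) {
  return mesh + (State().blending ? ":blend" : ":opaque");
}
}  // namespace implicit_style

// Metal style: state is an object created and validated up front
// and passed explicitly to every draw call.
namespace explicit_style {
struct PipelineState {
  bool blending;
  int sampleCount;
};

inline PipelineState MakePipelineState(bool blending, int sampleCount) {
  // Validation happens once, at creation time, not on every draw.
  assert(sampleCount == 1 || sampleCount == 2 || sampleCount == 4);
  return PipelineState{blending, sampleCount};
}

inline std::string Draw(PipelineState const & state, std::string const & mesh) {
  return mesh + (state.blending ? ":blend" : ":opaque");
}
}  // namespace explicit_style
```

    In the implicit version, enabling blending for one mesh silently affects every later draw until someone resets it; in the explicit version each draw names the state it uses.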

    Graphics engine refactoring and Metal integration


    Refactoring the graphics engine mostly came down to finding the best way to get rid of the OpenGL-specific features that our engine relied on heavily. From a certain stage onward, the Metal integration went on in parallel.

    • As already noted, the OpenGL API has an implicit entity called the context. The context is bound to a specific thread, and an OpenGL function called on that thread finds and uses the context by itself. Metal, Vulkan and other APIs such as Direct3D do not work this way; they have similar but explicit objects called a device or an instance. The user creates these objects and is responsible for passing them to the various subsystems. All graphics commands are issued through these objects.

      We called our abstract object the graphics context. For OpenGL it simply wraps calls to OpenGL commands, and for Metal it holds the root MTLDevice interface through which Metal commands are issued.

      Of course, we had to pass this object (and, since rendering is multi-threaded, even several such objects) through all subsystems.

      We hid the creation and management of command queues and command encoders inside the graphics context, so as not to spread entities that simply do not exist in OpenGL across the engine.
    • The prospect of losing graphics command validation on users' devices frankly did not please us. Our QA department cannot fully cover the wide range of devices and OS versions. So we had to add extended logging in the places where we previously received a meaningful error from the graphics API. Of course, this validation was added only to the potentially dangerous and critical parts of the graphics engine, since covering the entire engine with diagnostic code is practically impossible and generally hurts performance. The new reality is that testing on users and debugging from logs is a thing of the past, at least as far as rendering is concerned.
    • Our previous shader system did not lend itself to refactoring, so we had to rewrite it completely. This is not only about precompiling shaders and validating them at build time. In OpenGL, so-called uniform variables are used to pass parameters to shaders. Passing structured data is available only starting with OpenGL ES 3.0, and since we still support OpenGL ES 2.0, we simply did not use it. Metal made us use data structures to pass parameters, and for OpenGL we had to come up with a mapping of structure fields onto uniform variables. In addition, we had to rewrite every shader in the Metal Shading Language.
    • With state objects we had to resort to a trick. In OpenGL all states are, as a rule, set right before rendering, whereas in Metal a state must be an object created and validated in advance. Our engine obviously followed the OpenGL approach, and refactoring it to create state objects in advance would have been comparable to rewriting the engine from scratch. To cut this knot, we added a state cache to the graphics context. The first time a unique combination of state parameters occurs, a state object is created in Metal and placed in the cache. On the second and subsequent occasions the object is simply fetched from the cache. This works for our maps because the number of distinct state parameter combinations is not large (about 20-30).
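    The state cache from the last point might look roughly like the sketch below. It is a minimal model under stated assumptions: StateKey, StateObject and StateCache are hypothetical names, the key fields are invented for the example, and StateObject merely stands in for an expensive, pre-validated API object such as a Metal pipeline state.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <tuple>

// Hypothetical set of parameters that together define a render state.
struct StateKey {
  int programId;
  bool depthTest;
  bool blending;

  bool operator<(StateKey const & rhs) const {
    return std::tie(programId, depthTest, blending) <
           std::tie(rhs.programId, rhs.depthTest, rhs.blending);
  }
};

// Stand-in for an expensive, pre-validated API object.
struct StateObject {
  StateKey key;
};

class StateCache {
public:
  // Returns the cached object for the key, creating it on first use.
  std::shared_ptr<StateObject> Get(StateKey const & key) {
    assert(key.programId >= 0);  // cheap sanity check in debug builds
    auto const it = m_cache.find(key);
    if (it != m_cache.end())
      return it->second;

    // First occurrence of this combination: create the object and remember it.
    auto state = std::make_shared<StateObject>(StateObject{key});
    ++m_created;
    m_cache.emplace(key, state);
    return state;
  }

  std::size_t GetCreatedCount() const { return m_created; }

private:
  std::map<StateKey, std::shared_ptr<StateObject>> m_cache;
  std::size_t m_created = 0;
};
```

    With only a few dozen distinct combinations, an ordered map is more than enough here; the calling code keeps setting parameters right before drawing, and the expensive creation happens once per combination.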

    As a result, after about 5 months of work, we launched MAPS.ME for the first time with rendering done entirely on Apple Metal. It was time to find out what we had achieved.

    Rendering speed testing


    Test methodology


    We used Apple devices of different generations in the experiment, all updated to iOS 12. The same user scenario was run on all of them: map navigation (panning and zooming). The scenario was scripted to guarantee that the processes inside the application were almost identical on every run on every device. The Los Angeles area, one of the most heavily loaded areas in MAPS.ME, was chosen as the test location.

    First the scenario was run with rendering on OpenGL ES 3.0, then on the same device with rendering on Apple Metal. Between runs the application was completely unloaded from memory.
    The following indicators were measured:

    • FPS (frames per second) for the entire frame;
    • FPS for the part of the frame that deals only with rendering, excluding data preparation and other per-frame operations;
    • The percentage of slow frames (longer than ~30 ms), i.e. those the human eye may perceive as stutter.

    When measuring FPS we excluded drawing directly to the device screen, since vertical synchronization with the screen refresh rate makes it impossible to get reliable results. The frame was therefore rendered to a texture in memory. To synchronize the CPU and the GPU, OpenGL used an additional glFinish call, and Apple Metal used waitUntilCompleted on the frame's MTLCommandBuffer.
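    For reference, the first and third indicators can be derived from a list of per-frame durations as in the sketch below. The article does not say exactly how the values were aggregated, so computing average FPS as the number of frames divided by total time is an assumption, and FrameStats and ComputeFrameStats are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct FrameStats {
  double averageFps;        // frames divided by total time (assumed definition)
  double slowFramePercent;  // share of frames longer than the threshold
};

// Computes both indicators from per-frame durations in milliseconds.
// The default 30 ms threshold mirrors the "slow frame" definition in the text.
FrameStats ComputeFrameStats(std::vector<double> const & frameTimesMs,
                             double slowThresholdMs = 30.0) {
  assert(!frameTimesMs.empty());

  double totalMs = 0.0;
  std::size_t slowFrames = 0;
  for (double const t : frameTimesMs) {
    totalMs += t;
    if (t > slowThresholdMs)
      ++slowFrames;
  }

  FrameStats stats;
  stats.averageFps = 1000.0 * frameTimesMs.size() / totalMs;
  stats.slowFramePercent = 100.0 * slowFrames / frameTimesMs.size();
  return stats;
}
```

    For example, four frames of 10, 10, 10 and 50 ms take 80 ms in total, which gives an average of 50 FPS with 25% slow frames.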

                             iPhone 6s         iPhone 7+         iPhone 8
                             OpenGL   Metal    OpenGL   Metal    OpenGL   Metal
    FPS                      106      160      159      221      196      298
    FPS (rendering only)     157      596      247      597      271      833
    Slow frames (<30 fps)    4.13%    1.25%    5.45%    0.76%    1.5%     0.29%

                             iPhone X          iPad Pro 12.9"
                             OpenGL   Metal    OpenGL   Metal
    FPS                      145      210      104      137
    FPS (rendering only)     248      705      147      463
    Slow frames (<30 fps)    0.15%    0.15%    17.52%   4.46%

                                          iPhone 6s   iPhone 7+   iPhone 8    iPhone X    iPad Pro 12.9"
    Frame speedup on Metal (times)        1.5         1.39        1.52        1.45        1.32
    Rendering speedup on Metal (times)    3.78        2.41        3.07        2.84        3.15
    Slow-frame improvement (times)        3.3         7.17        5.17        1           3.93

    Results analysis


    On average, frame performance with Apple Metal increased by 43%. The minimum gain, 32%, was recorded on the iPad Pro 12.9", and the maximum, 52%, on the iPhone 8. A trend is visible: the lower the screen resolution, the more Apple Metal outperforms OpenGL ES 3.0.

    If we look only at the part of the frame directly responsible for rendering, rendering on Apple Metal is on average 3 times faster. This points to a significantly better design and, as a result, higher efficiency of the Apple Metal API compared with OpenGL ES 3.0.

    The number of slow frames (longer than ~30 ms) on Apple Metal decreased by about 4 times. This means that animations and map movement feel smoother. The worst result was recorded on the iPad Pro 12.9" with a resolution of 2732 x 2048 pixels: OpenGL ES 3.0 produces about 17.5% slow frames, while Apple Metal produces only 4.5%.
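    The averages quoted in this analysis can be rechecked from the per-device FPS figures in the tables. The sketch below assumes a plain arithmetic mean of per-device ratios, which is our reading and is not stated explicitly in the text; Speedups and Mean are hypothetical helper names, and the data order is iPhone 6s, iPhone 7+, iPhone 8, iPhone X, iPad Pro 12.9".

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Ratio of the Metal value to the OpenGL value for each device.
std::vector<double> Speedups(std::vector<double> const & openGl,
                             std::vector<double> const & metal) {
  assert(openGl.size() == metal.size());
  std::vector<double> result;
  for (std::size_t i = 0; i < openGl.size(); ++i)
    result.push_back(metal[i] / openGl[i]);
  return result;
}

// Plain arithmetic mean (assumed averaging method).
double Mean(std::vector<double> const & values) {
  assert(!values.empty());
  return std::accumulate(values.begin(), values.end(), 0.0) / values.size();
}

// Whole-frame FPS from the tables.
std::vector<double> const kOpenGlFps = {106, 159, 196, 145, 104};
std::vector<double> const kMetalFps = {160, 221, 298, 210, 137};

// Rendering-only FPS from the tables.
std::vector<double> const kOpenGlRenderFps = {157, 247, 271, 248, 147};
std::vector<double> const kMetalRenderFps = {596, 597, 833, 705, 463};
```

    Mean(Speedups(kOpenGlFps, kMetalFps)) comes out at roughly 1.44 and the rendering-only mean at slightly above 3, which is consistent with the figures of about 43% and 3 times given above.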

    Power consumption testing


    Test methodology


    Power consumption was tested on an iPhone 8 running iOS 12. The same user scenario was run for 1 hour: map navigation (panning and zooming). The scenario was scripted to guarantee that the processes inside the application were almost identical on every run. The Los Angeles area was again chosen as the test location.

    We measured power consumption as follows. The device is not connected to a charger. Power consumption logging is enabled in the developer settings. The device is fully charged before the experiment starts. The experiment ends when the scenario finishes. At the end of the experiment we recorded the battery charge level and imported the energy logs into the battery profiling utility in Xcode. We recorded what share of the charge was spent on GPU work. In addition, for this test we made rendering heavier by turning on the metro map display and full-screen anti-aliasing.

    Screen brightness was the same in all cases. No processes other than system ones and MAPS.ME were running. Airplane mode was on, Wi-Fi and GPS were off. Several control measurements were also performed.

    Finally, Metal was compared with OpenGL on each indicator, and the resulting ratios were averaged to obtain a single aggregate estimate.

                                        OpenGL   Metal    Gain
    Battery charge spent                32%      28%      12.5%
    Battery usage profiling in Xcode    1.95%    1.83%    6.16%

    Results analysis


    On average, power consumption of the version rendering on Apple Metal improved slightly. The GPU does not contribute much to the power consumption of our application, about 2%, because MAPS.ME can hardly be called GPU-heavy. The small gain probably comes from cheaper preparation of GPU commands on the CPU, which, unfortunately, cannot be isolated with the profiling tools.

    Results


    Integrating Metal cost us 5 months of development. Two developers were involved, although they almost always worked in turns. We clearly gained a lot in rendering performance and gained a little in power consumption. We also got the ability to integrate new graphics APIs, Vulkan in particular, with much less effort. We went through almost the entire graphics engine with a fine-tooth comb and, as a result, found and fixed several old bugs and performance problems.

    To the question of whether our project really needs rendering on Apple Metal, we are ready to answer yes. It is not so much that we love innovation, or that Apple may eventually drop OpenGL for good. It is simply that it is 2018, and OpenGL dates back to the distant 1992; it is time to take the next step.

    P.S. So far we have not enabled the feature on all iOS devices. To turn it on manually, type the command ?metal in the search bar and restart the application. To switch rendering back to OpenGL, type the command ?gl and restart the application.

    P.P.S. MAPS.ME is an open-source project. You can find the source code on GitHub.
