Render Hell 1.1 - Solutions and Conclusion

Original author: Simon Trümpler
  • Translation
Render Hell 1.1:
Now the fun part! Here I will introduce some solutions that I found during my research. I hope they give you a general idea of how to optimize game assets for rendering.

1. Sorting


For starters, you can sort all your commands (for example, by Render State) before filling the command buffer. This keeps the number of Render State changes to a minimum, because all meshes of the same type are processed one after another.
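To make the effect of sorting concrete, here is a minimal sketch in Python. The `DrawCommand` class, the mesh names and the string keys standing in for a Render State are all assumptions made up for illustration; a real engine would sort on shader, texture and material settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrawCommand:
    mesh: str          # hypothetical mesh name
    render_state: str  # hypothetical key standing in for the material/shader settings


def count_state_changes(commands):
    """Count how often the Render State differs from the previous command."""
    changes = 0
    current = None
    for cmd in commands:
        if cmd.render_state != current:
            changes += 1
            current = cmd.render_state
    return changes


def sort_by_render_state(commands):
    """Stable sort so that commands sharing a Render State end up adjacent."""
    return sorted(commands, key=lambda cmd: cmd.render_state)


commands = [
    DrawCommand("stone_a", "rock"),
    DrawCommand("chair", "wood"),
    DrawCommand("stone_b", "rock"),
    DrawCommand("table", "wood"),
    DrawCommand("sword", "metal"),
    DrawCommand("stone_c", "rock"),
]

unsorted_changes = count_state_changes(commands)
sorted_changes = count_state_changes(sort_by_render_state(commands))
print(unsorted_changes, sorted_changes)  # prints "6 3"
```

In this toy list the unsorted order switches state for every command, while the sorted order switches once per distinct state.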



But there is still significant overhead from rendering the meshes one by one. A useful technique for cutting it down is called Batching.

2. Batching


Sorting the meshes already gathers them into piles by type. The next step is to ask the GPU to render each pile in one go. That is the whole point of Batching:
Batching means grouping several meshes together before calling the API to render them. It pays off because rendering one large mesh takes less time than rendering many small ones. [a36]
So, instead of issuing one Draw Call per mesh (all with the same Render State) ...



... you combine the meshes (with the same Render State) and render them in a single Draw Call. This is a really interesting topic, because you can render different meshes (a stone, a chair or a sword) at the same time, as long as they use the same Render State (which essentially means that they use the same material settings).



It is important to remember that the meshes are combined in main memory (RAM), and only then is the new large mesh transferred to the graphics card's memory (VRAM). That takes time! Batching is therefore well suited to static objects (stones, houses ...), which are combined once and then stay in memory for a long time. Of course, you can also batch dynamic objects, such as laser shots in a space game. But since they are constantly moving, you would have to build a “cloud of shots” mesh and transfer it to GPU memory every frame!
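The combining step on the CPU side might look roughly like the sketch below. It assumes each mesh is a pair of a vertex list and an index list, and that the vertices are already in world space; `merge_meshes` and the sample meshes are invented for this example.

```python
def merge_meshes(meshes):
    """Combine (vertices, indices) pairs into one vertex list and one index list."""
    merged_vertices = []
    merged_indices = []
    for vertices, indices in meshes:
        # Indices of this mesh must be shifted by the number of vertices already added.
        offset = len(merged_vertices)
        merged_vertices.extend(vertices)
        merged_indices.extend(i + offset for i in indices)
    return merged_vertices, merged_indices


# Two hypothetical meshes that share a Render State: a triangle and a quad.
triangle = ([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], [0, 1, 2])
quad = (
    [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 1.0, 0.0), (2.0, 1.0, 0.0)],
    [0, 1, 2, 0, 2, 3],
)

batch_vertices, batch_indices = merge_meshes([triangle, quad])
print(len(batch_vertices), batch_indices)  # prints "7 [0, 1, 2, 3, 4, 5, 3, 5, 6]"
```

The merged lists are what would be uploaded to VRAM once and then rendered with a single Draw Call.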

Another reason to be careful (thanks to koyima for the reminder): if an object is outside the camera's field of view, it can simply be culled (skipped during rendering). But if you batch several objects together, the whole combined mesh has to be processed during rendering, even if only a small part of it is actually visible. In some cases this can reduce performance.
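A rough way to see this culling trade-off is sketched below. It is a deliberate simplification: the camera view is treated as a 2D axis-aligned box rather than a real view frustum, and all boxes and names are made up for the example.

```python
def overlaps(a, b):
    """Overlap test for axis-aligned boxes given as ((min_x, min_y), (max_x, max_y))."""
    (a_min, a_max), (b_min, b_max) = a, b
    return all(a_min[i] <= b_max[i] and b_min[i] <= a_max[i] for i in range(2))


def union(boxes):
    """Smallest axis-aligned box that contains all given boxes."""
    mins = tuple(min(box[0][i] for box in boxes) for i in range(2))
    maxs = tuple(max(box[1][i] for box in boxes) for i in range(2))
    return mins, maxs


view = ((0, 0), (10, 10))  # simplified stand-in for the camera's field of view
stones = [
    ((1, 1), (2, 2)),      # inside the view
    ((50, 50), (51, 51)),  # far outside
    ((90, 5), (91, 6)),    # far outside
]

# Culled one by one, only the first stone survives.
visible_individually = [box for box in stones if overlaps(box, view)]

# Batched, there is only one box to test, and it touches the view.
batch_box = union(stones)
batch_visible = overlaps(batch_box, view)
print(len(visible_individually), batch_visible)  # prints "1 True"
```

Individually, two of the three stones are skipped. As a batch, all three are submitted because the combined bounds reach into the view.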

A better fit for dynamic objects is instancing.

3. Instancing


Instancing means sending only one mesh (for example, a laser shot) instead of several, and letting the GPU duplicate it multiple times. Drawing the same object at the same position with the same rotation or animation would be pretty boring. Therefore, you can pass along a stream of additional data, such as a transformation matrix, to render the duplicates at different positions (and in different poses).
Typical attributes per instance are the model-to-world transformation matrix, the instance color, and an animation player along with the bones. [a37]
Don't hold me to this, but as far as I understand it, this data stream is just a list in RAM that the GPU has access to.

In the end, only one Draw Call per mesh type is required! The difference from Batching is that all instances look the same (since the same mesh is copied), while a batched mesh can consist of several different meshes, provided that they use the same Render State settings.
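The following sketch imitates instancing on the CPU, purely to show which data is shared and which is per instance. It assumes a translation instead of a full model-to-world matrix, and `Instance` and `draw_instanced` are invented names; on real hardware the expansion happens on the GPU inside one instanced Draw Call.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    position: tuple  # per-instance translation (a real engine would send a 4x4 matrix)
    color: str       # per-instance color


# One shared mesh, stored once: a hypothetical laser shot made of two vertices.
laser_mesh = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)]

# The per-instance data stream: one small record per duplicate.
instances = [
    Instance((0.0, 0.0, 0.0), "red"),
    Instance((5.0, 0.0, 2.0), "green"),
    Instance((-3.0, 1.0, 7.0), "red"),
]


def draw_instanced(mesh, instances):
    """Stand-in for one instanced Draw Call: expand the shared mesh once per instance."""
    drawn = []
    for inst in instances:
        moved = [tuple(v + p for v, p in zip(vertex, inst.position)) for vertex in mesh]
        drawn.append((moved, inst.color))
    return drawn


result = draw_instanced(laser_mesh, instances)
print(len(result), result[1])
```

Adding a shot only appends one record to `instances`; the mesh itself is never rebuilt or uploaded again.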



From here on, things get a bit more unusual. I think the following tricks are cool, even if they only suit special cases:

4. Multi-Material Shader


A shader can access several textures, so instead of a single diffuse / normal / specular texture you can use two of each, for example. Naturally, this means that you can combine two materials in one shader. The materials are blended together, and a blend texture controls how much of each is visible. Of course, this costs extra GPU time, because blending is an expensive operation, but it reduces the number of Draw Calls, since you no longer have to split the mesh into pieces (see “4. Meshes and Multi-Materials”).

You can read more about this here.
The documentation says that more Draw Calls are still better than this expensive technique. Nevertheless, I found it very interesting, and if you need good numbers for your statistics, you can claim that “layered” materials reduce the number of Draw Calls (even if that says nothing about performance ... but shhhhh!).
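The blending of two materials can be sketched as a per-pixel linear interpolation. This is only an illustration of the idea, written in Python rather than shader code; the two flat colors stand in for two full materials, and `blend` and `mask_texture` are assumed names.

```python
def blend(color_a, color_b, mask):
    """Per-channel linear interpolation: mask 0.0 gives color_a, mask 1.0 gives color_b."""
    return tuple(a * (1.0 - mask) + b * mask for a, b in zip(color_a, color_b))


grass = (0.0, 0.5, 0.0)  # hypothetical first material, reduced to a single RGB color
rock = (0.5, 0.5, 0.5)   # hypothetical second material

# Three sample pixels of the blend texture: pure grass, half and half, pure rock.
mask_texture = [0.0, 0.5, 1.0]

shaded = [blend(grass, rock, m) for m in mask_texture]
print(shaded)  # prints "[(0.0, 0.5, 0.0), (0.25, 0.5, 0.25), (0.5, 0.5, 0.5)]"
```

Every pixel pays for reading both materials plus the mask, which is the extra GPU cost mentioned above.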

5. Skinned Meshes


Remember the laser-shot mesh mentioned above? I said that this mesh has to be updated every frame, because the shots are constantly moving. Combining them and sending the result every frame is quite expensive. An interesting way around this is to automatically add a bone to each shot and pass the movement along as skinning information. That way you can use one large mesh that stays in memory, and each frame you update only the bone data. Of course, whenever a new shot is fired or an old one is destroyed, you have to rebuild the mesh. Still, it sounds like a really interesting idea to me.

You can read more about this here.
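Here is a small sketch of the bone-per-shot idea. It assumes that each bone is just a translation (real skinning uses full bone matrices and weights), and `build_skinned_batch` and `skin` are invented names; `skin` plays the role of the vertex shader.

```python
# Shape of a single hypothetical laser shot in its own local space.
shot_shape = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)]


def build_skinned_batch(shot_count):
    """Built once and kept in memory: each vertex stores a local position and a bone index."""
    return [(local, bone) for bone in range(shot_count) for local in shot_shape]


def skin(batch, bones):
    """What the vertex shader would do here: move each vertex by the bone it is bound to."""
    return [tuple(l + b for l, b in zip(local, bones[bone])) for local, bone in batch]


batch = build_skinned_batch(2)  # rebuilt only when a shot is created or destroyed

# Per frame, only this small list of bone positions changes.
bones_frame_1 = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
bones_frame_2 = [(0.0, 0.0, 2.0), (4.0, 0.0, 2.5)]

frame_1 = skin(batch, bones_frame_1)
frame_2 = skin(batch, bones_frame_2)
print(frame_2)
```

The large `batch` stays untouched between the two frames; only one small entry per shot is sent again.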
Feel free to send me more links about unusual solutions to reduce the number of Draw Calls!
Almost done! You now have some idea of what can be done to render game assets a little faster. Don't worry, the next book will be short.

The end

Render Hell 1.1:
Here is a brief summary of what we have covered:

Avoid small meshes


Check whether small meshes are really necessary or whether several small ones can be combined into one big one. Ask your graphics programmer about the “sweet spot” for the polygon count (the maximum number of triangles that can be rendered without a performance loss). You might then want to add a few triangles to smooth out some corners. You should also keep an eye on multi-materials. If you have built one large mesh but assigned 5 materials to it, the mesh will be split for rendering, which means you still end up with 5 small meshes. Maybe a texture atlas can help?
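As a back-of-the-envelope check, the multi-material split can be counted like this. The sketch assumes exactly one Draw Call per distinct material on a mesh, as described above; the face lists and material names are made up.

```python
def draw_calls_for_mesh(face_materials):
    """One Draw Call per distinct material assigned to the faces of a mesh."""
    return len(set(face_materials))


# A hypothetical house mesh: one material name per face.
house_faces = ["brick", "brick", "wood", "glass", "roof_tile", "metal", "wood"]

# The same faces after moving all textures into one atlas and one shared material.
atlas_faces = ["house_atlas"] * len(house_faces)

print(draw_calls_for_mesh(house_faces), draw_calls_for_mesh(atlas_faces))  # prints "5 1"
```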

Avoid too many materials


Speaking of materials, think about how you manage them. Sharing materials between assets may be possible if you plan for it before creating the assets. Large texture atlases can help here.

Debugging Tools


Ask your programmers whether in-game statistics are available, so you can see how problematic a given asset is. Sometimes it is hard to get an overview of complex assets. But if a tool can warn you that an asset has potential performance problems, you can fix the problem before the asset is marked as final.
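Such a warning tool could start out as simple as the sketch below. Everything in it is hypothetical: the function name, the asset data, and especially the two thresholds, which are placeholders that your programmers would have to replace with values measured for your engine and hardware.

```python
def asset_warnings(name, triangle_count, material_count,
                   min_triangles=200, max_materials=2):
    """Return readable warnings for one asset; both thresholds are made-up placeholders."""
    warnings = []
    if triangle_count < min_triangles:
        warnings.append(
            f"{name}: only {triangle_count} triangles, consider merging it with other meshes"
        )
    if material_count > max_materials:
        warnings.append(
            f"{name}: {material_count} materials, each one costs a separate Draw Call"
        )
    return warnings


# Hypothetical assets as (name, triangle count, material count).
assets = [("pebble", 12, 1), ("house", 5000, 5), ("rock", 800, 1)]

report = [w for asset in assets for w in asset_warnings(*asset)]
for line in report:
    print(line)
```

With these placeholder limits, the pebble is flagged as too small and the house as having too many materials, while the rock passes.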

Ask your coder


As you can see, this topic is very technical and highly context-dependent (hardware, engine, driver, game perspective ...). So it is a good idea to talk to a programmer about how to set up your assets. Or just wait: if your assets cause a performance loss, the programmers will find you and poke you until you have optimized everything you made. :)
Do you know any tips I should add here? Let me know!
Wow, you read all the way to here? You are crazy! Many thanks! Tell me what you think. I hope you learned something new. :)

The end


[a36] Technical Breakdown - Assassin's Creed II
[a37] NVIDIA GPU Gems 2
