BloodyPoSTaL April 16, 2019 at 11:58

GLTF and GLB Basics, Part 2

This article continues the introduction to the GLTF and GLB formats. You can find the first part of the article here. In the first part, we looked at what the format was originally designed for, as well as the following GLTF objects and their attributes: Scene, Node, Buffer, BufferView, Accessor and Mesh. In this article, we will look at Material, Texture, Animations, Skin and Camera, and also finish creating a minimal valid GLTF file.

Material and Texture

Materials and textures are inextricably linked to the mesh. If necessary, the mesh can be animated. The material stores information about how the model will be rendered by the engine. GLTF defines materials using a common set of parameters based on Physically Based Rendering (PBR). The PBR model produces a “physically correct” appearance of the object under different lighting conditions, because the shading model works with the “physical” properties of the surface. There are several ways to describe PBR. The most common is the metallic-roughness model, which GLTF uses by default. You can also use the specular-glossiness model, but only through a separate extension. The main attributes of the material are as follows:

name - the name of the material.
baseColorFactor / baseColorTexture - stores color information. The Factor attribute holds numerical RGBA values; the Texture attribute holds a reference (index) to a texture in the textures array.
metallicFactor - stores the Metallic value
roughnessFactor - stores the Roughness value
doubleSided - true or false (the default value); indicates whether the mesh is rendered on both sides or only on the "front" side.

"materials": [
    {
        "pbrMetallicRoughness": {
            "baseColorTexture": {
                "index": 0
            },
            "metallicFactor": 0.0,
            "roughnessFactor": 0.800000011920929
        },
        "name": "Nightshade_MAT",
        "doubleSided": true
    }
],

Metallic, or “metallicity”. This parameter describes how closely the surface resembles a real metal in the way it reflects light. The value ranges from 0 to 1, where 0 is a dielectric and 1 is a pure metal.

Roughness. This attribute describes how “rough” the surface is, which affects how light scatters off it. The value ranges from 0 to 1, where 0 is a perfectly smooth surface and 1 is a completely rough surface that scatters the reflected light in all directions.
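
As an illustration of how a loader might read these attributes, here is a minimal Python sketch. The function name resolve_pbr is hypothetical, and the fallback values are the defaults given in the GLTF 2.0 specification (white base color, metallic and roughness of 1, single-sided). It is not taken from any particular engine.

```python
def resolve_pbr(material):
    """Return the effective metallic-roughness parameters of a material dict."""
    pbr = material.get("pbrMetallicRoughness", {})
    return {
        # Defaults below follow the GLTF 2.0 specification.
        "baseColorFactor": pbr.get("baseColorFactor", [1.0, 1.0, 1.0, 1.0]),
        # Index into the top-level "textures" array, or None if absent.
        "baseColorTexture": pbr.get("baseColorTexture", {}).get("index"),
        "metallicFactor": pbr.get("metallicFactor", 1.0),
        "roughnessFactor": pbr.get("roughnessFactor", 1.0),
        "doubleSided": material.get("doubleSided", False),
    }


# The material from the JSON example above.
material = {
    "pbrMetallicRoughness": {
        "baseColorTexture": {"index": 0},
        "metallicFactor": 0.0,
        "roughnessFactor": 0.800000011920929,
    },
    "name": "Nightshade_MAT",
    "doubleSided": True,
}

params = resolve_pbr(material)
print(params)
```

Since the example material has no baseColorFactor, the sketch falls back to opaque white, so the texture color is used unchanged.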

Texture - an object that stores texture maps. These maps make the model look realistic. With them you can define the appearance of the model and give it properties such as metallicity, roughness, natural darkening from the environment and even glow. Textures are described by three top-level arrays: textures, samplers and images. An entry in textures uses indices to reference a sampler and an image. The most important object is the image, because it stores the location of the map; in textures it is referenced by the source attribute. The picture may be located somewhere on the hard drive (for example, "uri": "duckCM.png") or be embedded in the GLTF ("bufferView": 14, "mimeType": "image/jpeg"). Samplers is an object that defines the filtering and wrapping modes used when the texture is sampled.

In our triangle example there are no textures, so here is JSON from other models I have worked with. In this example the textures were written to the buffer, so they are read from the buffer through a BufferView:

"textures": [
    {
        "sampler": 0,
        "source": 0
    }
],
"images": [
    {
        "bufferView": 1,
        "mimeType": "image/jpeg"
    }
],
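
The chain texture -> image -> bufferView -> buffer can be followed with a few lines of Python. This is only a sketch under stated assumptions: the function name image_bytes is hypothetical, the buffers are assumed to be already loaded into memory as bytes objects, and the toy data below is made up for the illustration (three bytes standing in for a JPEG). Only the embedded (bufferView) case is handled.

```python
def image_bytes(gltf, buffers, texture_index):
    """Return (mime_type, raw_bytes) for a texture whose image is in a bufferView."""
    texture = gltf["textures"][texture_index]
    image = gltf["images"][texture["source"]]
    if "uri" in image:
        # External or data-URI images are outside the scope of this sketch.
        raise ValueError("image is referenced by uri: " + image["uri"])
    view = gltf["bufferViews"][image["bufferView"]]
    start = view.get("byteOffset", 0)
    data = buffers[view["buffer"]][start:start + view["byteLength"]]
    return image["mimeType"], data


# Toy document: bufferView 1 covers the last three bytes of buffer 0.
gltf = {
    "textures": [{"sampler": 0, "source": 0}],
    "images": [{"bufferView": 1, "mimeType": "image/jpeg"}],
    "bufferViews": [
        {"buffer": 0, "byteOffset": 0, "byteLength": 4},
        {"buffer": 0, "byteOffset": 4, "byteLength": 3},
    ],
}
buffers = [b"\x00\x01\x02\x03\xff\xd8\xff"]

mime, data = image_bytes(gltf, buffers, 0)
print(mime, len(data))
```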

Animations

GLTF supports articulated, skinned and morph target animations using key frames. The key frame data is stored in buffers and is referenced from animations through accessors. GLTF 2.0 only defines how animations are stored; it does not define any specific runtime behavior such as playback order, autoplay, loops, timeline mapping, etc. All animations are stored in the animations array. Each animation is defined as a set of channels (the channels attribute) and a set of samplers (the samplers attribute), which use accessors to reference the key frame data and specify the interpolation method.

The main attributes of the Animations object are as follows:

name - name of the animation (if any)
channels - an array whose elements connect the output values of the animation key frames to a specific node in the hierarchy.
sampler - an attribute of a channel: the index of an entry in the samplers array of the same animation, which in turn references the key frame data through accessors.
target - an object that determines which node (Node object) is animated, using the node attribute, and which property of that node is animated, using the path attribute: translation, rotation, scale or weights. Non-animated attributes retain their values during the animation. If node is not defined, the channel should be ignored.
samplers - defines input and output pairs: input references an accessor with a set of scalar floating-point values representing linear time in seconds, and output references an accessor with the values of the animated property at those times. All values (input / output) are stored in the buffer and are accessed through accessors. The interpolation attribute specifies the interpolation method between key frames.

There are no animations in the simplest GLTF. An example is taken from another file:

"animations": [
    {
        "name": "Animate all properties of one node with different samplers",
        "channels": [
            {
                "sampler": 0,
                "target": {
                    "node": 1,
                    "path": "rotation"
                }
            },
            {
                "sampler": 1,
                "target": {
                    "node": 1,
                    "path": "scale"
                }
            },
            {
                "sampler": 2,
                "target": {
                    "node": 1,
                    "path": "translation"
                }
            }
        ],
        "samplers": [
            {
                "input": 4,
                "interpolation": "LINEAR",
                "output": 5
            },
            {
                "input": 4,
                "interpolation": "LINEAR",
                "output": 6
            },
            {
                "input": 4,
                "interpolation": "LINEAR",
                "output": 7
            }
        ]
    },
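
Because GLTF leaves playback to the application, evaluating a sampler is up to the runtime. Below is a rough Python sketch of how a LINEAR sampler could be evaluated for a translation or scale channel. It assumes the input (times) and output (values) accessors have already been decoded into Python lists; the function name sample_linear and the key frame data are made up for the illustration. Rotations are quaternions and, per the specification, are interpolated spherically, which this sketch does not cover.

```python
from bisect import bisect_right


def sample_linear(times, values, t):
    """Linearly interpolate key frame values at time t, clamping at both ends."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    # Index of the key frame at or just before t.
    i = bisect_right(times, t) - 1
    t0, t1 = times[i], times[i + 1]
    k = (t - t0) / (t1 - t0)
    return tuple(a + k * (b - a) for a, b in zip(values[i], values[i + 1]))


# Hypothetical decoded accessors: key frame times in seconds and translations.
times = [0.0, 1.0, 2.0]
translations = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 4.0, 0.0)]

print(sample_linear(times, translations, 0.5))  # (1.0, 0.0, 0.0)
print(sample_linear(times, translations, 1.5))  # (2.0, 2.0, 0.0)
```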

Skin

Skinning information, also known as bone animation, is stored in the skins array. Each skin is defined by the inverseBindMatrices attribute, which refers to an accessor with IBM (inverse bind matrix) data used to bring the coordinates into the same space as each joint, and by the joints array, which lists the indices of the nodes used as joints for skin animation. The order of the joints is defined by the skin.joints array and must match the order of the inverseBindMatrices data. The skeleton attribute points to a Node object that is the common root of the joint hierarchy, or a direct or indirect parent of the common root.

An example of using the skin object (not in the triangle example):

    "skins": [
        {
            "name": "skin_0",
            "inverseBindMatrices": 0,
            "joints": [ 1, 2 ],
            "skeleton": 1
        }
    ]

The main attributes:

name - name of the skin
inverseBindMatrices - the index of the accessor that stores the inverse bind matrices
joints - an array with the indices of the nodes used as joints
skeleton - the index of the node that serves as the "root" joint, from which the model skeleton begins
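
The relationship between joints and inverseBindMatrices can be checked mechanically. The Python sketch below uses a hypothetical helper named check_skin and a made-up toy document. It verifies that every joint is a valid node index and that the referenced accessor holds 4x4 matrices, at least one per joint, which is how I read the ordering requirement described above.

```python
def check_skin(gltf, skin_index):
    """Return a list of problems found in one skin (empty if the checks pass)."""
    skin = gltf["skins"][skin_index]
    problems = []
    node_count = len(gltf["nodes"])
    for joint in skin["joints"]:
        if not 0 <= joint < node_count:
            problems.append(f"joint {joint} is not a valid node index")
    if "inverseBindMatrices" in skin:
        accessor = gltf["accessors"][skin["inverseBindMatrices"]]
        if accessor["type"] != "MAT4":
            problems.append("inverseBindMatrices accessor must be of type MAT4")
        if accessor["count"] < len(skin["joints"]):
            problems.append("fewer inverse bind matrices than joints")
    return problems


# Toy document built around the skin from the example above.
gltf = {
    "nodes": [{}, {}, {}],
    "accessors": [{"componentType": 5126, "count": 2, "type": "MAT4"}],
    "skins": [
        {"name": "skin_0", "inverseBindMatrices": 0, "joints": [1, 2], "skeleton": 1}
    ],
}

print(check_skin(gltf, 0))
```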

Camera

The camera defines the projection matrix, which transforms coordinates from view space to clip space. Put more simply, cameras determine what the user sees when the model is loaded (viewing angle, direction of “look”, etc.).

The projection can be perspective or orthographic. Cameras are attached to Node objects and can therefore have transformations. The camera is defined so that the local +X axis points to the right, the lens looks in the direction of the local -Z axis, and the top of the camera is aligned with the local +Y axis. If no transformation is specified, the camera is at the origin. Cameras are stored in the cameras array. Each of them has a type attribute that sets the type of projection (perspective or orthographic), and a perspective or orthographic attribute that stores the detailed parameters. Depending on whether the zfar attribute is present, cameras of the perspective type use a finite or an infinite projection.

An example of a camera of the perspective type in JSON. It is not part of the minimal valid GLTF file (triangle):

"cameras": [
    {
        "name": "Infinite perspective camera",
        "type": "perspective",
        "perspective": {
            "aspectRatio": 1.5,
            "yfov": 0.660593,
            "znear": 0.01
        }
    }
]

The main attributes of the Camera object:

name - camera name
type - type of camera, perspective or orthographic.
perspective / orthographic - attribute containing the details for the corresponding type value
aspectRatio - aspect ratio (width to height) of the field of view
yfov - vertical field of view (fov) angle in radians
zfar - distance to the far clipping plane
znear - distance to the near clipping plane
extras - application specific data
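
To show how zfar switches between the finite and the infinite projection, here is a small Python sketch that builds the projection matrix from a perspective camera using the formulas given in the GLTF 2.0 specification. The function name perspective_matrix is hypothetical, the matrix is returned row by row, and the fallback aspect ratio of 1.0 is my assumption (a real viewer would normally use the aspect ratio of its viewport when aspectRatio is absent).

```python
import math


def perspective_matrix(camera):
    """Build a 4x4 projection matrix (list of rows) from a perspective camera."""
    p = camera["perspective"]
    a = p.get("aspectRatio", 1.0)  # assumption: 1.0 when aspectRatio is absent
    f = 1.0 / math.tan(0.5 * p["yfov"])
    n = p["znear"]
    if "zfar" in p:
        # Finite projection: the far clipping plane is taken into account.
        far = p["zfar"]
        m22 = (far + n) / (n - far)
        m23 = 2.0 * far * n / (n - far)
    else:
        # Infinite projection: no far clipping plane.
        m22 = -1.0
        m23 = -2.0 * n
    return [
        [f / a, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, m22, m23],
        [0.0, 0.0, -1.0, 0.0],
    ]


# The camera from the example above has no zfar, so the projection is infinite.
camera = {
    "name": "Infinite perspective camera",
    "type": "perspective",
    "perspective": {"aspectRatio": 1.5, "yfov": 0.660593, "znear": 0.01},
}

for row in perspective_matrix(camera):
    print(row)
```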

Minimum Valid GLTF File

At the beginning of the article, I wrote that we would assemble a minimal GLTF file containing a single triangle. The JSON, with the buffer embedded in it, is below. Just copy it into a text file and change the file extension to .gltf. To view the 3D asset in the file, you can use any viewer that supports GLTF, but I personally use this

{
  "scenes" : [
    {
      "nodes" : [ 0 ]
    }
  ],
  "nodes" : [
    {
      "mesh" : 0
    }
  ],
  "meshes" : [
    {
      "primitives" : [ {
        "attributes" : {
          "POSITION" : 1
        },
        "indices" : 0
      } ]
    }
  ],
  "buffers" : [
    {
      "uri" : "data:application/octet-stream;base64,AAABAAIAAAAAAAAAAAAAAAAAAAAAAIA/AAAAAAAAAAAAAAAAAACAPwAAAAA=",
      "byteLength" : 44
    }
  ],
  "bufferViews" : [
    {
      "buffer" : 0,
      "byteOffset" : 0,
      "byteLength" : 6,
      "target" : 34963
    },
    {
      "buffer" : 0,
      "byteOffset" : 8,
      "byteLength" : 36,
      "target" : 34962
    }
  ],
  "accessors" : [
    {
      "bufferView" : 0,
      "byteOffset" : 0,
      "componentType" : 5123,
      "count" : 3,
      "type" : "SCALAR",
      "max" : [ 2 ],
      "min" : [ 0 ]
    },
    {
      "bufferView" : 1,
      "byteOffset" : 0,
      "componentType" : 5126,
      "count" : 3,
      "type" : "VEC3",
      "max" : [ 1.0, 1.0, 0.0 ],
      "min" : [ 0.0, 0.0, 0.0 ]
    }
  ],
  "asset" : {
    "version" : "2.0"
  }
}
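
To see what is actually packed into the embedded buffer, it can be decoded by hand. The Python sketch below uses only the standard library and hard-codes the layout from the bufferViews and accessors above: three unsigned 16-bit indices at offset 0 (componentType 5123), then three VEC3 positions of 32-bit floats starting at offset 8 (componentType 5126). A general loader would read these values from the JSON instead of hard-coding them.

```python
import base64
import struct

# The data URI from the "buffers" entry of the file above.
URI = (
    "data:application/octet-stream;base64,"
    "AAABAAIAAAAAAAAAAAAAAAAAAAAAAIA/AAAAAAAAAAAAAAAAAACAPwAAAAA="
)

# Everything after the first comma is the base64 payload.
raw = base64.b64decode(URI.split(",", 1)[1])

# bufferView 0 / accessor 0: 3 little-endian unsigned shorts at offset 0.
indices = struct.unpack_from("<3H", raw, 0)

# bufferView 1 / accessor 1: 9 little-endian floats (3 x VEC3) at offset 8.
floats = struct.unpack_from("<9f", raw, 8)
positions = [floats[i:i + 3] for i in range(0, 9, 3)]

print(len(raw))    # total size, compare with "byteLength" of the buffer
print(indices)     # vertex indices of the single triangle
print(positions)   # vertex coordinates, compare with "min" / "max"
```

The two bytes between the indices (6 bytes) and the positions (offset 8) are padding, so that the float data starts at an offset that is a multiple of 4.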

What is the result?

In conclusion, I want to note the growing popularity of the GLTF and GLB formats: many companies already actively use them, and others are working towards it. The ease of using the format on Facebook (3D posts and, more recently, 3D Photos), the active use of GLB in Oculus Home, and a number of announcements made at GDC 2019 all contribute to its popularization. Light weight, fast rendering, ease of use, promotion by the Khronos Group and standardization are the main advantages of the format, and I am sure they will eventually do their job in promoting it further!
