How we wrote the network code for a mobile PvP shooter: player synchronization on the client

In a previous article, we reviewed the technologies used in our new project, a fast-paced shooter for mobile devices. Now I want to share how the client part of the network code of the future game is organized, what difficulties we encountered and how we solved them.

In general, the approaches to building fast multiplayer games have not changed much over the past 20 years. There are several methods in network code architecture:

Calculating the state of the world on the server and displaying the results on the client, without prediction for the local player and with the possibility of losing player input. This approach, by the way, is used in another project we are developing - you can read about it here.
Lockstep.
Synchronizing the state of the world without deterministic logic, with prediction for the local player.
Synchronizing input with fully deterministic logic and prediction for the local player.

What sets shooters apart is that responsive controls matter more than anything else: the player presses a button (or moves the joystick) and wants to see the result of the action immediately. This is mainly because the state of the world in such games changes very quickly and the player has to react to the situation right away.

As a result, approaches without a mechanism for predicting the actions of the local player (prediction) did not suit the project, and we settled on synchronizing the state of the world without deterministic logic.

Advantage of the approach: it is less complex to implement than synchronization by exchanging input.
Disadvantage: more traffic, because the entire state of the world is sent to the client. We had to apply several different traffic optimization techniques to keep the game stable on mobile networks.

At the heart of the gameplay architecture is ECS, which we have already written about. This architecture makes it convenient to store data about the game world, to serialize and copy it, and to transmit it over the network. It also lets us execute the same code on both the client and the server.

The game world is simulated at a fixed frequency of 30 ticks per second. This lets us reduce the lag on player input and use almost no interpolation when visualizing the state of the world. But there is one significant drawback to keep in mind when developing such a system: for the local player's prediction to work correctly, the client must simulate the world at the same frequency as the server. We spent a lot of time optimizing the simulation enough for the target devices.

Predicting the actions of the local player (prediction)

Prediction on the client is built on ECS: the same systems are executed on both the client and the server. However, the client does not run all the systems, only those that are responsible for the local player and do not require up-to-date data about other players.

Example lists of systems running on the client and server:

At the moment, about 30 systems run on the client to provide player prediction, and about 80 systems run on the server. But we do not predict things such as dealing damage, using abilities, or healing allies. There are two problems with these mechanics:

The client knows nothing about the input of other players, so predictions of things like damage or healing will almost always diverge from the data on the server.
Creating new entities locally (shots, projectiles, unique abilities) spawned by one player raises the problem of matching them with the entities created on the server.

For such mechanics, the lag is hidden from the player in other ways.

Example: we draw the hit effect of a shot right away, and update the enemy's health only after we receive confirmation of the hit from the server.

The general scheme of the network code in the project

The client and the server synchronize time by tick numbers. Because transmitting data over the network takes time, the client always runs ahead of the server by half the RTT plus the size of the input buffer on the server. The diagram above shows the client sending the input for tick 20 (a). At that moment, the server is processing tick 15 (b). By the time the client's input reaches the server, the server will be processing tick 20.

The whole process consists of the following steps: the client sends the player's input to the server (a) → this input is processed on the server after half the RTT (HRTT) + input buffer size (b) → the server sends the resulting state of the world to the client (c) → the client applies the confirmed state of the world from the server after RTT + input buffer size + game state interpolation buffer size (d).
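The timing relationships above can be turned into a small calculation. The following is only a sketch: the class and method names are hypothetical, and rounding the network delay up to whole ticks is an assumption that the article does not state.

```csharp
using System;

public static class TickTiming
{
    public const int TickRate = 30; // simulation ticks per second, as in the article

    // Converts a duration in milliseconds to whole ticks, rounding up (assumption).
    public static int MsToTicks(double ms)
    {
        return (int)Math.Ceiling(ms * TickRate / 1000.0);
    }

    // How far ahead of the server the client runs:
    // half the RTT plus the server-side input buffer.
    public static int ClientLeadTicks(double rttMs, int inputBufferTicks)
    {
        return MsToTicks(rttMs / 2.0) + inputBufferTicks;
    }

    // Ticks between sending an input (a) and applying the confirmed state (d):
    // full RTT plus the input buffer plus the game state buffer.
    public static int ConfirmationDelayTicks(double rttMs, int inputBufferTicks, int stateBufferTicks)
    {
        return MsToTicks(rttMs) + inputBufferTicks + stateBufferTicks;
    }
}
```

With an RTT of 200 ms and both buffers at 3 ticks, ClientLeadTicks gives 6 ticks and ConfirmationDelayTicks gives 12 ticks, which is 400 ms at 30 ticks per second.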

After the client receives a new confirmed state of the world from the server (d), it needs to perform reconciliation. The client predicts the world based only on the input of the local player; the inputs of the other players are not known to it. So when the server calculates the state of the world, the player may end up in a state different from what the client predicted. This can happen, for example, when the player is stunned or killed.

The reconciliation process consists of two parts:

Comparing the predicted state of the world for tick N with the state obtained from the server. Only data related to the local player is compared. The rest of the world data is always taken from the server state and does not take part in reconciliation.
During the comparison, two cases may occur:

- if the predicted state matches the confirmation from the server, the client continues to simulate the world as usual, using the predicted data for the local player and the new data for the rest of the world;
- if the predicted state does not match, the client takes the entire server state of the world and the history of its own inputs and recomputes the new predicted state of the player's world.

In code, it looks like this:

GameState Reconcile(int currentTick, ServerGameStateData serverStateData, GameState currentState, uint playerID)
{
  var serverState = serverStateData.GameState;
  var serverTick = serverState.Time;
  var predictedState = _localStateHistory.Get(serverTick);

  // if the predicted state matches the last server state, use the server state with the predicted player
  if (_gameStateComparer.IsSame(predictedState, serverState, playerID))
  {
     _tempState.Copy(serverState);
     _gameStateCopier.CopyPlayerEntities(currentState, _tempState, playerID);
     return _localStateHistory.Put(_tempState); // replace predicted state with correct server state
  }

  // if the predicted state doesn't match the server state, reapply local inputs to the server state
  var last = _localStateHistory.Put(serverState); // replace wrong predicted state with correct server state
  for (var i = serverTick; i < currentTick; i++)
  {
     last = _prediction.Predict(last); // resimulate all wrong states
  }
  return last;
}

Two states of the world are compared only on the data that relates to the local player and takes part in prediction. The data is looked up by player ID.

Comparison method:

public bool IsSame(GameState s1, GameState s2, uint avatarId)
    {
        if (s1 == null && s2 != null || s1 != null && s2 == null)
            return false;
        if (s1 == null && s2 == null)
            return false;
        var entity1 = s1.WorldState[avatarId];
        var entity2 = s2.WorldState[avatarId];
        if (entity1 == null && entity2 == null)
            return false;
        if (entity1 == null || entity2 == null)
            return false;
        if (s1.Time != s2.Time)
            return false;
        if (s1.WorldState.Transform[avatarId] != s2.WorldState.Transform[avatarId])
            return false;
        foreach (var s1Weapon in s1.WorldState.Weapon)
        {
            // only the local player's weapons take part in the comparison
            if (s1Weapon.Value.Owner.Id != avatarId)
                continue;
            var s2Weapon = s2.WorldState.Weapon[s1Weapon.Key];
            if (s1Weapon.Value != s2Weapon)
                return false;
            var s1Ammo = s1.WorldState.WeaponAmmo[s1Weapon.Key];
            var s2Ammo = s2.WorldState.WeaponAmmo[s1Weapon.Key];
            if (s1Ammo != s2Ammo)
                return false;
            var s1Reload = s1.WorldState.WeaponReloading[s1Weapon.Key];
            var s2Reload = s2.WorldState.WeaponReloading[s1Weapon.Key];
            if (s1Reload != s2Reload)
                return false;
        }
        if (entity1.Aiming != entity2.Aiming)
            return false;
        if (entity1.ChangeWeapon != entity2.ChangeWeapon)
            return false;
        return true;
    }

Comparison operators for specific components are generated along with the entire ECS structure by a purpose-built code generator. As an example, here is the generated comparison operator for the Transform component:


public static bool operator ==(Transform a, Transform b)
{
    if ((object)a == null && (object)b == null)
        return true;
    if ((object)a == null && (object)b != null)
        return false;
    if ((object)a != null && (object)b == null)
        return false;
    if (Math.Abs(a.Angle - b.Angle) > 0.01f)
        return false;
    if (Math.Abs(a.Position.x - b.Position.x) > 0.01f || Math.Abs(a.Position.y - b.Position.y) > 0.01f)
        return false;
    return true;
}

Note that float values are compared with a fairly large tolerance. This is done to reduce the number of desynchronizations between the client and the server. The player will not notice such an error, and it saves a significant amount of computing resources.

The difficulty of reconciliation is that when the client and server states diverge (misprediction), the client has to re-simulate, within a single frame, all the predicted states that the server has not yet acknowledged, up to the current tick. Depending on the player's ping, this can be from 5 to 20 simulation ticks. We had to optimize the simulation code significantly to fit within the frame time at 30 fps.

To perform reconciliation, the client needs to store two types of data:

The history of the predicted states of the player.
The history of input.

For this we use a cyclic buffer. The buffer size is 32 ticks, which at a frequency of 30 Hz gives about 1 second of real time. The client can keep running smoothly on prediction alone, without receiving new data from the server, until this buffer fills up. If the difference between client time and server time grows beyond one second, the client is forcibly disconnected and tries to reconnect. We chose this buffer size because of the cost of reconciliation when the states of the world diverge: once the difference between the client and the server exceeds one second, a full reconnection to the server is cheaper.
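A cyclic buffer addressed by tick number might look roughly like the sketch below. The class name TickHistory and its API are hypothetical and are not taken from the project; only the capacity of 32 ticks comes from the text above.

```csharp
using System;

// Fixed-size history addressed by tick number; the oldest entries are overwritten.
public sealed class TickHistory<T>
{
    private readonly T[] _items;
    private readonly int[] _ticks;

    public TickHistory(int capacity = 32)
    {
        _items = new T[capacity];
        _ticks = new int[capacity];
        for (int i = 0; i < capacity; i++)
            _ticks[i] = -1; // -1 marks an empty slot
    }

    public void Put(int tick, T item)
    {
        if (tick < 0) throw new ArgumentOutOfRangeException(nameof(tick));
        int slot = tick % _items.Length;
        _items[slot] = item;
        _ticks[slot] = tick;
    }

    // Returns false when the tick was never stored or has already been overwritten.
    public bool TryGet(int tick, out T item)
    {
        if (tick >= 0 && _ticks[tick % _items.Length] == tick)
        {
            item = _items[tick % _items.Length];
            return true;
        }
        item = default;
        return false;
    }
}
```

Storing the tick number next to each slot lets the caller detect that a requested state has already been overwritten, which corresponds to the case where the client has drifted more than a buffer's length away from the server.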

Reducing lag

The diagram above shows that there are two buffers in the game's data transfer scheme:

an input buffer on the server;
a world state buffer on the client.

These buffers serve the same purpose: compensating for network jitter. Packets travel over the network unevenly, and since the network engine runs at a fixed frequency of 30 Hz, data must be fed into the engine at that same frequency. We cannot afford to "wait" a few milliseconds for the next packet to reach the recipient. We buffer input data and world states to leave time to compensate for jitter. We also use the gamestate buffer for interpolation if one of the packets is lost.

At the start of the game, the client begins synchronizing with the server only after it has received several states of the world from the server and the gamestate buffer is full. The size of this buffer is usually 3 ticks (100 ms).

At the same time, once the client is synchronized with the server, it "runs" ahead of the server time by the size of the input buffer on the server. That is, the client itself controls how far ahead of the server it runs. The starting size of the input buffer is also 3 ticks (100 ms).

Initially, we implemented the sizes of these buffers as constants. That is, regardless of whether there was any real jitter on the network, there was a fixed delay of 200 ms (input buffer size + game state buffer size) before data was updated. If we add the average expected ping on mobile devices, somewhere around 200 ms, the real delay between applying the input on the client and receiving confirmation from the server came to 400 ms!

This did not suit us.

The point is that some systems run only on the server, for example the calculation of player HP. With such a delay, the player fires a shot and sees the opponent die only 400 ms later. If this happened on the move, the opponent had usually already run behind a wall or into cover and died there. Playtests inside the team showed that such a delay completely breaks the gameplay.

The solution was to make the sizes of the input and gamestate buffers dynamic:

for the gamestate buffer, the client always knows the current buffer content. When calculating the next tick, the client checks how many states are already in the buffer;
for the input buffer, the server, in addition to the gamestate, began sending each client the current fill level of that client's input buffer. The client in turn analyzes these two values.

The algorithm for changing the size of the gamestate buffer is as follows:

The client calculates the average buffer fill level and its variance over a period of time.
If the variance is within the normal range (i.e., there were no large jumps in filling and reading the buffer during that period), the client checks the average fill level for the period.
If the average fill level was above the upper bound (i.e., the buffer was filled more than required), the client "reduces" the buffer size by running an additional simulation tick.
If the average fill level was below the lower bound (i.e., the buffer did not have time to fill before the client started reading from it), the client "increases" the buffer size by skipping one simulation tick.
If the variance was above the norm, we cannot rely on this data, because network jitter over the period was too large. In that case the client discards all current data and starts collecting statistics again.
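The decision logic from the list above can be sketched as follows. This is an illustration only: the type names, the method signature and all threshold parameters are hypothetical, since the article does not give the concrete bounds or the length of the sampling window.

```csharp
using System.Collections.Generic;

public enum BufferAction { None, RunExtraTick, SkipTick, ResetStatistics }

public static class StateBufferTuner
{
    // fillSamples holds the buffer fill level (in ticks) recorded at each simulation tick.
    public static BufferAction Decide(IReadOnlyList<int> fillSamples,
        double lowerBound, double upperBound, double maxVariance)
    {
        if (fillSamples.Count == 0)
            return BufferAction.None;

        double mean = 0;
        foreach (int s in fillSamples) mean += s;
        mean /= fillSamples.Count;

        double variance = 0;
        foreach (int s in fillSamples) variance += (s - mean) * (s - mean);
        variance /= fillSamples.Count;

        // Too much jitter: the statistics are unreliable, start collecting again.
        if (variance > maxVariance) return BufferAction.ResetStatistics;
        // Buffer is fuller than needed: simulate one extra tick to shrink it.
        if (mean > upperBound) return BufferAction.RunExtraTick;
        // Buffer runs dry: skip one tick to let it grow.
        if (mean < lowerBound) return BufferAction.SkipTick;
        return BufferAction.None;
    }
}
```

For example, with bounds of 2 and 4 ticks and a maximum variance of 1, a window of samples that are all 5 yields RunExtraTick, while a window alternating between 0 and 8 yields ResetStatistics.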

Lag compensation on the server

Because the client receives world updates from the server with a delay (lag), the player sees the world a little differently from how it exists on the server. The player sees himself in the present and the rest of the world in the past. On the server, the whole world exists at a single point in time.

This leads to situations in which the player locally shoots at a target that is in a different place on the server.

To compensate for the lag, we use time rewind on the server. The algorithm works like this:

With each input, the client additionally sends the server the tick time at which it sees the rest of the world.
The server validates this time: it checks that the difference between the current time and the time of the world visible to the client is within an allowed interval.
If the time is valid, the server leaves the player in the current time, rolls the rest of the world back to the state that the player saw, and calculates the result of the shot.
If the shot hits a player, the damage is applied in the current server time.
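The validation step from the list above could be sketched like this. The names are hypothetical and maxRewindTicks is an assumed limit; in the project the limit appears to be tied to the length of the stored physics history.

```csharp
public static class LagCompensation
{
    // Returns the tick to rewind the rest of the world to,
    // or -1 when the tick reported by the client is rejected.
    public static int ResolveRewindTick(int serverTick, int clientViewTick, int maxRewindTicks)
    {
        int delta = serverTick - clientViewTick;

        // Reject a client that claims to see the future
        // or a point further back than the history we keep.
        if (delta < 0 || delta > maxRewindTicks)
            return -1;

        return clientViewTick;
    }
}
```

Rejecting out-of-range values bounds how far into the past a shot can be evaluated, although within the allowed window the server still relies on the value reported by the client.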

Time rewind on the server works as follows: the history of the world (in ECS) and the history of physics (supported by the Volatile Physics engine) are stored on the server. When a shot is calculated, the shooter's data is taken from the current state of the world, and the data of the other players is taken from the history.

The shot validation system code looks like this:

public void Execute(GameState gs)
{
    foreach (var shotPair in gs.WorldState.Shot)
    {
        var shot = shotPair.Value;
        var shooter = gs.WorldState[shotPair.Key];
        var shooterTransform = shooter.Transform;
        var weaponStats = gs.WorldState.WeaponStats[shot.WeaponId];

        // DeltaTime shouldn't exceed physics history size
        var shootDeltaTime = (int) (gs.Time - shot.ShotPlayerWorldTime);
        if (shootDeltaTime > PhysicsWorld.HistoryLength)
        {
            continue;
        }

        // Get the world at the time of shooting.
        var oldState = _immutableHistory.Get(shot.ShotPlayerWorldTime);
        var potentialTarget = oldState.WorldState[shot.Target.Id];
        var hitTargetId = _singleShotValidator.ValidateTargetAvailabilityInLine(oldState, potentialTarget, shooter,
            shootDeltaTime, weaponStats.ShotDistance, shooter.Transform.Angle.GetDirection());
        if (hitTargetId != 0)
        {
            gs.WorldState.CreateEntity().AddDamage(gs.WorldState[hitTargetId], shooter, weaponStats.ShotDamage);
        }
    }
}

One major flaw in this approach is that we trust the client's data about the tick time it sees. Potentially, a player can gain an advantage by artificially increasing ping, because the higher a player's ping, the further in the past the shot is evaluated.

Some problems we encountered

While implementing this network engine we ran into many problems. Some of them deserve a separate article, so here I will touch on only a few.

Simulating the whole world in the prediction system, and copying

Initially, all systems in our ECS had only one method: void Execute(GameState gs). This method usually processed the components of all players.

An example of a motion system in the initial implementation:

public sealed class MovementSystem : ISystem
{
  public void Execute(GameState gs)
  {
    foreach (var movementPair in gs.WorldState.Movement)
    {
      var transform = gs.WorldState.Transform[movementPair.Key];
      transform.Position += movementPair.Value.Velocity * GameState.TickDuration;
    }
  }
}

But in the local player prediction system, we only needed to process the components of one specific player. Initially, we implemented this by copying.

The prediction process was as follows:

A copy of the gamestate was created.
The copy was passed to ECS as input.
The whole world was simulated in ECS.
All data related to the local player was copied from the newly obtained gamestate.

The prediction method looked like this:

void PredictNewState(GameState state)
{
  var newState = _stateHistory.Get(state.Tick+1);
  var input = _inputHistory.Get(state.Tick);
  newState.Copy(state);
  _tempGameState.Copy(state);
  _ecsExecutor.Execute(_tempGameState, input);
  _playerEntitiesCopier.Copy(_tempGameState, newState);
}

There were two problems with this implementation:

Because we use classes rather than structs, copying is a fairly expensive operation for us (approximately 0.1-0.15 ms on an iPhone 5S).
Simulating the whole world also takes a lot of time (about 1.5-2 ms on an iPhone 5S).

Considering that reconciliation requires recomputing from 5 to 15 states of the world in one frame, this implementation was terribly slow.

The solution was quite simple: learn to simulate the world piece by piece, namely to simulate only a particular player. We rewrote all the systems so that a player ID can be passed in and only that player is simulated.

An example of a motion system after a change:

public sealed class MovementSystem : ISystem
{
  public void Execute(GameState gs)
  {
    foreach (var movementPair in gs.WorldState.Movement)
    {
        Move(gs.WorldState.Transform[movementPair.Key], movementPair.Value);
    }
  }

  public void ExecutePlayer(GameState gs, uint playerId)
  {
    var movement = gs.WorldState.Movement[playerId];
    if (movement != null)
    {
        Move(gs.WorldState.Transform[playerId], movement);
    }
  }

  private void Move(Transform transform, Movement movement)
  {
    transform.Position += movement.Velocity * GameState.TickDuration;
  }
}

After these changes, we were able to get rid of unnecessary copies in the prediction system and reduce the load on the reconciliation system.

The prediction method after the changes:

void PredictNewState(GameState state, uint playerId)
{
  var newState = _stateHistory.Get(state.Tick+1);
  var input = _inputHistory.Get(state.Tick);
  newState.Copy(state);
  _ecsExecutor.Execute(newState, input, playerId);
}

Creating and deleting entities in the prediction system

In our system, entities on the server and on the client are matched by an integer identifier (id). For all entities, we use sequential numbering of identifiers: each new entity gets id = oldID + 1.

This approach is very convenient to implement, but it has one major drawback: new entities may be created in a different order on the client and on the server, and as a result the identifiers of the entities will differ.

We ran into this problem when we implemented prediction of the local player's shots. Each shot is a separate entity with a shot component. On each client, the shot entity ids in the prediction system were sequential. But if another player was shooting at the same moment, then on the server the ids of all the shots differed from the client's.

Shots on the server were created in a different order:

For shots, we worked around this restriction by relying on the gameplay specifics. Shots are short-lived entities that are destroyed within a fraction of a second after creation. On the client, we set aside a separate range of IDs that does not overlap with the server IDs, and we stopped including shots in reconciliation. That is, the local player's shots are always drawn according to the prediction system and do not take data from the server into account.
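A separate id range for client-only entities could be organized roughly as in the sketch below. The class name, the boundary value and the convention that server ids start at 1 are all assumptions made for illustration; the article does not describe how the ranges are actually split.

```csharp
public sealed class EntityIdAllocator
{
    // Hypothetical split: ids below this value belong to the server,
    // ids at or above it belong to entities that exist only in client prediction.
    public const uint LocalRangeStart = 0x80000000;

    private uint _next;

    public EntityIdAllocator(bool clientPredicted)
    {
        // Server ids start at 1 here so that 0 can mean "no entity" (assumption).
        _next = clientPredicted ? LocalRangeStart : 1u;
    }

    // Sequential numbering inside the chosen range: id = oldID + 1.
    public uint Next()
    {
        return _next++;
    }

    // Entities from the local range can be skipped during reconciliation.
    public static bool IsLocalOnly(uint id)
    {
        return id >= LocalRangeStart;
    }
}
```

The sketch does not guard against the server range growing into the local range; a real implementation would need to handle or rule out that overflow.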

With this approach, the player does not see artifacts on the screen (shots being deleted, re-created or rolled back), and the discrepancies with the server are minor and do not affect the gameplay as a whole.

This method solved the problem with shots, but not the general problem of creating entities on the client. We are still working on possible solutions for matching objects created on the client and on the server.

Note also that this problem concerns only the creation of new entities (with new IDs). Adding and removing components on existing entities works without problems: components have no identifiers, and each entity can have only one component of a particular type. Therefore, we usually create entities on the server, and in the prediction systems we only add and remove components.

In conclusion, I want to say that implementing multiplayer is neither the easiest nor the fastest task, but there is quite a lot of information available on how to do it.

What to read

Multiplayer in fast games - a translation of the article Fast-Paced Multiplayer (Part I): Introduction (in my opinion, this is the best article on Habr about network interaction in games).
GDC Vault: Overwatch Gameplay Architecture and Netcode - a talk from GDC 2017 on ECS and network code in Overwatch (unfortunately, access is paid).
GDC Vault: 8 Frames in 16ms: Rollback Networking in Mortal Kombat and Injustice 2 - how this is done in fighting games.
Source Multiplayer Networking - how Counter-Strike multiplayer works.
Gaffer on Games - about network code in games in general.
UDP in Game Engines.
GDC Vault: I Shot You First networking - how multiplayer works in Halo: Reach.
