Parsing file formats: sound in some Unreal Engine games



    The culture of game modding goes back a long way. The earliest example I remember is Wolfenstein 3D (1992): if I'm not mistaken, you could draw your own maps, add new enemies, and replace textures and sounds. The main obstacle to modding is parsing unknown data formats. Let's leave the moral aspects of modding to other resources and focus on the technical difficulties that can arise along the way.

    I have accumulated quite a few stories of this kind, from the simplest, such as parsing a plain archive where many thousands of game files are stored in one file, to replacing 3D models and reverse-engineering and reimplementing non-standard sound codecs. I'll tell you one of them, of medium complexity.

    Suppose you want to replace certain phrases in a game, or even take a swing at a full voice-over in a language the developers had neither the energy nor the resources for. It would seem that you just need to record the sound, find where it lives in the game, and replace the right files. But this is not always easy. For example, the latest games in the Batman: Arkham series use the Wwise sound engine, which has been integrated into the Unreal Engine for quite some time.

    I have run into UE more than once, but, as you know, commercial developers can change any part of the engine code, so almost every game is unique in terms of its data structures, and each one is interesting to investigate.

    First, let's look at the sound files. They usually live in the audio folder and are assembled into one large package with the unexpected extension .WAD (hello, DOOM). If you wish, you can even extract all the sounds from it, but you will get several thousand nameless files, and the only way to find anything among them is to listen to every one by hand. I must say that things are often easier: developers, for their own convenience, leave a file with a list of phrases somewhere. Not this time.

    It is logical to assume that since the game itself somehow finds the right sounds and the subtitles for them, this information must be somewhere in the files, and we just need to find it. The texts are nowhere to be found in the localization folders, which means they are scattered across the individual levels of the game, as is often the case. So let's take one of the .upk files whose name looks like a level name and unpack it. Fortunately, there are tools for this, some even with source code.

    Inside, we fairly quickly find files of type .RDialogueEvent, in which the texts of the phrases in 11 languages are visible to the naked eye.



    The file names resemble the names of the original sounds. Wonderful: all that remains is to match them to the sound files. And this is where the problems begin. The sound package does, of course, contain identifiers. Each one is a 32-bit hash of the kind Wwise always uses for sounds, but unfortunately these hashes are nowhere to be found among the dialogue files. There are only some incomprehensible numbers, and nothing resembling a sound ID, which would be noticeable immediately. On the other hand, this is understandable: the engine is not that simple, and you can't just take a sound file and play it in the game. The sound sits in an audio bank and has many properties that apply various effects, and so on.

    And then it turns out that each dialogue folder contains an .akbank file, which is apparently a Wwise audio bank.



    It has plenty of identifiers inside. Trying them at random, we find that one of them (highlighted in green) appears in the sound package. If we extract the data for this identifier from the package, we get a segment made of several sounds glued together. We convert these sounds from the internal Wwise format to ordinary ogg. And indeed, in one of them Batman says, “I don't have time for this,” and in another someone answers him. The phrases match the texts of this particular dialogue.

    Not bad already! In principle, we could stop here: all the dialogues are sorted into folders, and each has a bank with a link to its sound segment. Of course, we don't know which file is which, but we could cut the segment into parts, listen to them, and put the few phrases (a dialogue usually has only 3-4) in place by hand.

    But we are not looking for easy ways. If we are going to figure it out, let's go all the way. Let's check, just in case: maybe the sounds simply go in order? Of course not, they are shuffled. Whatever you say, the link between the sounds in the segment and the text of the dialogue has to be stored somewhere. I dug through various files for quite some time hoping to find something, all in vain. Fine. If that's how it is, let's unpack every package in the game. That is several gigabytes, but so what, it's not the first time. Yet even an exhaustive search over all of the game's data turned up nothing. The only place that contains the sound identifiers is the audio bank, so the link must go through it and nothing else. There is nothing for it: we have to climb inside and figure out how it works.
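
    The exhaustive search mentioned above can be sketched roughly as follows. This is only an illustration, not the tool that was actually used: it assumes the identifiers are stored as little-endian 32-bit integers, and the names `find_id` and `scan_tree` are made up for this example.

```python
import struct
from pathlib import Path


def find_id(data: bytes, value: int) -> list[int]:
    """Return every offset at which `value` occurs as a little-endian
    32-bit integer (the byte order is an assumption)."""
    needle = struct.pack("<I", value)
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = data.find(needle, pos + 1)
    return hits


def scan_tree(root: str, value: int):
    """Yield (path, offsets) for every file under `root` containing the ID."""
    for path in Path(root).rglob("*"):
        if path.is_file():
            hits = find_id(path.read_bytes(), value)
            if hits:
                yield path, hits


# Synthetic demonstration; 0x12345678 is not a real sound ID.
sample = b"\x00" * 3 + struct.pack("<I", 0x12345678) + b"\xff" + struct.pack("<I", 0x12345678)
offsets = find_id(sample, 0x12345678)
```

    Reading each file whole is fine for a sketch; for several gigabytes of unpacked data it would be more sensible to read in chunks or use `mmap`.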

    Now, to be sure of our results, let's pick a dialogue in the game that can be checked quickly. After a spectacular introduction with a charming girl reporter and an armed raid, Hugo Strange captures Batman. He says a couple of phrases, starting with “I feel I should thank you”, then leaves, and the game begins. This is where the first save happens. This moment suits us perfectly.



    We look for the villain's phrase in the files. It turns up in the OW_E8_Ch1z_Anim package, which you wouldn't guess right away. Inside there is only one dialogue, which contains the entire beginning of the game. That is as many as 24 phrases, but maybe that's even a good thing: in a jumble of codes, the number 24 is easier to spot than 1 or 2. So, we were about to examine the contents of the .akbank file.

    The Wwise bank format has already been partially researched. Let's hope that information is enough for our purposes. Judging by the beginning of the .akbank file, it contains 5 audio banks at once, for 5 languages. The first is the INT bank (English), so that is the one we will look at.



    First, after the WRCS header, comes an incomprehensible table, then quite a lot of zeros (visible in the last picture), then the BKHD segment, and then the HIRC segment, which apparently contains descriptions of all the audio objects. In this case there are 79 of them (0x4F, highlighted in green). According to the description, the objects in the segment follow one another, and each begins with its type (1 byte), followed by a 32-bit length and an ID. The length and content of an object depend on its type.
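
    Walking such a list of objects might look like the sketch below. It takes only the layout described above (a 1-byte type, a 32-bit length, then an ID) and adds several assumptions: that all integers are little-endian, that the ID is 32 bits wide, and that the length counts the ID together with the rest of the object. The helper names `walk_hirc` and `make_object` are hypothetical, and the data is synthetic rather than taken from the game.

```python
import struct


def walk_hirc(body: bytes):
    """Yield (type, object_id, payload) for each object.

    `body` is assumed to start at the 32-bit object count.
    """
    (count,) = struct.unpack_from("<I", body, 0)
    pos = 4
    for _ in range(count):
        # 1-byte type followed by a 32-bit length, no padding
        obj_type, length = struct.unpack_from("<BI", body, pos)
        pos += 5
        (obj_id,) = struct.unpack_from("<I", body, pos)
        # Assumption: `length` covers the ID plus the type-specific payload
        payload = body[pos + 4 : pos + length]
        yield obj_type, obj_id, payload
        pos += length


def make_object(obj_type: int, obj_id: int, payload: bytes = b"") -> bytes:
    """Build one synthetic object in the assumed layout (for testing only)."""
    return struct.pack("<BII", obj_type, 4 + len(payload), obj_id) + payload


# Three made-up objects: a sound (2), an action (3) and an event (4)
demo = (
    struct.pack("<I", 3)
    + make_object(2, 0x1111, b"\xAA\xBB")
    + make_object(3, 0x2222)
    + make_object(4, 0x3333, b"\x01")
)
objects = list(walk_hirc(demo))
```

    Because every object carries its own length, a parser like this can skip over types it does not understand, which is what makes a partially documented format workable.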



    Type 2 objects are the sounds themselves. The type is highlighted in red and the length in yellow. Each sound has the ID of the object itself (green) and the ID of the sound file that contains it (purple). Below it you can see the beginning of the next object of the same type.



    Type 3 objects are sound actions. It seems that each of them means “play a sound” with some parameters unknown to us, but each has its own ID (gray) and the ID of the sound that actually needs to be played (green).

    Type 4 objects are sound events. These are very short entries that contain only the event ID (blue), an indication that the event consists of a single action, and the ID of that action (gray).

    So it looks like we have 24 chains of the following form:

    event -> action -> sound

    They are linked by identifiers and ultimately end in references to sound files. How do we find the files we need? Searching for these codes, we find them in that same table at the beginning of the bank. Apparently this table records where the individual sounds are located inside the sound segment. Indeed, it has exactly 24 entries, and each one gives the same file ID that we saw in the sound object, an offset from the beginning, and a length. Congratulations! Now we can trace the whole path from the audio events in the banks to the individual audio files:
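
    The chain just described can be followed with a few dictionary lookups. The sketch below is purely illustrative: every ID, the table contents and the function names are invented, and it assumes the objects and the file table have already been parsed into mappings.

```python
from typing import NamedTuple


class TableEntry(NamedTuple):
    offset: int  # offset from the start of the sound segment
    length: int  # size of the sound in bytes


def resolve_event(event_id, events, actions, sounds, file_table):
    """Follow event -> action -> sound -> file and return (file_id, entry)."""
    action_id = events[event_id]   # type 4 object: event -> its single action
    sound_id = actions[action_id]  # type 3 object: action -> sound to play
    file_id = sounds[sound_id]     # type 2 object: sound -> sound file ID
    return file_id, file_table[file_id]


def extract_sound(segment: bytes, entry: TableEntry) -> bytes:
    """Cut one sound out of the glued-together segment."""
    return segment[entry.offset : entry.offset + entry.length]


# Made-up data standing in for a parsed bank with two phrases
events = {0xE001: 0xA001, 0xE002: 0xA002}
actions = {0xA001: 0x5001, 0xA002: 0x5002}
sounds = {0x5001: 0xF001, 0x5002: 0xF002}
file_table = {0xF001: TableEntry(0, 4), 0xF002: TableEntry(4, 6)}
segment = b"AAAABBBBBB"

file_id, entry = resolve_event(0xE002, events, actions, sounds, file_table)
sound_bytes = extract_sound(segment, entry)
```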



    That is, our input data is a set of event IDs, one for each phrase in the dialogue, and for each of them we can find the sound file. But how do we now connect them to the dialogue itself?

    Let's try searching for these identifiers somewhere. Once again, they are not in the dialogue files. The folder also contains some very short .akevent files, and there are 24 of them too. Obviously, these are the audio event files. Inside are some small numbers, identical in every file, so they are of no use. The only thing that differs between the files is the ID of the audio event that we found in the bank.



    A dead end again: we have identifiers for all the events, but no link between them and the text of the dialogue! Just in case, let's run a test: change the ID in the relevant file and start the game. And indeed, Hugo opens his mouth but says nothing. So this is exactly the data the game uses to find the right sound. We also note that the subtitle is still displayed. This means that in our case the dialogue texts are primary, and the sound is looked up from them.

    And then I recall that the UE3 engine has a habit of referring to objects in a package by their ordinal number inside the package, that is, in the order in which they are packed. Let's look at the export file that is produced when we unpack the packages:



    The numbers here are decimal and start from zero, while in the game they start from 1, so the event files in the export correspond to the numbers 0x35-0x4C. Let's see whether these appear anywhere among the dialogues. We start looking and, what do you know, this number is right at the beginning of the file!
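
    The conversion between the two numbering schemes is simple arithmetic. In the sketch below, the zero-based positions 52 to 75 are derived backwards from the 0x35-0x4C range given above (they are not copied from the export listing), and the function name is made up for illustration.

```python
def export_position_to_reference(zero_based_position: int) -> int:
    """Convert a zero-based position in the export listing to the
    one-based number the game uses to refer to the object."""
    return zero_based_position + 1


# Zero-based decimal positions of the 24 event files (derived, see above)
positions = range(52, 76)
references = [f"0x{export_position_to_reference(p):02X}" for p in positions]
```

    Here `references` runs from `0x35` to `0x4C`, which gives 24 values, one per phrase.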



    Here is the last missing link. Nearby we also find 0x2C, which is the number of the bank file. So if a folder happens to contain several dialogues, they can be told apart as well. Now we know the whole path from the text of a dialogue to the corresponding sound.



    This is a rather complicated scheme of interaction. It seems that the developers decided not to worry about convenience and simply relied on the internal mechanisms of the engine, which led to this result. And, as I said, cases vary a lot. The file structure and the relationships between files can be completely different. Here we had a link from the text of the dialogue to the sound. But it can be the other way around: the sound is primary, and the text is looked up for it by identifier. Or a game script event is primary, and it links to both the sound and the text. Sometimes files are referenced not by name but by hash. In any case, they are all connected somehow, and all that remains is to find the connection.

    As a final touch, let's check our results. We find the dialogue file for the phrase “I feel I should thank you” and replace 4B with 4C in it. We start the game, and instead of that phrase our friend Hugo declares meaningfully: "It will be my legacy, a monument to your failure and if you try to stop me, I guarantee everyone will know your secret."

    Let's leave that to Batman; the investigation can be considered complete. In writing, the process looks quick, but in reality every stage can involve a long contemplation of hexadecimal digits without any hope that at some point they will line up into meaningful chains and you will understand what they mean. But sometimes that does happen.
