Reverse Engineering Star Wars: Yoda Stories

Original author: Zach Barth
  • Transfer
image

[Note Lane: Zach Bart from Zachtronics not only writes wonderful video puzzle games (SpaceChem, TIS-100, SHENZHEN I / O) and upgrades electronic squeakers , but also enjoys the reverse development of resources of old games. This article is about successfully hacking his childhood favorite game.]

Foreword


I don’t know why, but I was always attracted to reverse engineering of computer game data files. Decompiling the game code is a difficult task, and analyzing data files is often much easier (because they contain a lot of clearly visible content, for example, text and sprites) and allows you to modify the game if you understand the files well enough.

In 2006, a few days after the release of the Star Wars: Empire at War demo , I published a couple of rudimentary tools that allowed you to dump and repack the game data files, including a simple mod that allowed you to play for the Empire (this feature was disabled in the demo). Almost no trace of my work can be found on the netbut I managed to get a free Petroglyph game developer t-shirt, and that’s something.

Many years before, I had another Star Wars game called Star Wars: Yoda Stories . She was rather confused and received bad reviews, but that did not stop me. As a novice programmer and stubborn Star Wars fan, I tried to find game resources to make my own terrible Star Wars game. However, I managed to detect only sound effects and a small number of sprites that spread like desktop theme icons.


Fast forward to sixteen years: I’m studying my ancient collection of CDs looking for old games that you can play at a computer party in the 1990s theme. Having inserted the CD into the drive, I immediately realized that I had found a data file of about four megabytes in size, which was waiting for my knowledge to be used in college for hacking. Better later than never!

File Structure (Difficulty: Padawan)


I think that a hex editor will be enough for you to reverse engineer something similar, but later we will use a decompiler, a calculator and knowledge of the corresponding programs. I am a big fan of HxD , so I will use this editor.

If you want to experiment yourself, here is a link to the game data file: YODESK.DTA

It's time to open this file!


So, this is definitely not a text and is not intended for human reading, but it does not surprise us at all. I’m sure that sixteen years ago I opened the same file in Notepad and immediately closed it, without even having a close understanding of what I saw.

First, we see four-letter ASCII characters that seem to me like section identifiers. Dipping below, we see confirmation of this: SNDS, TILE, CHAR, PUZ2 and much more. The file ends with an ENDF section, and this implies that the general structure of the file is some kind of hierarchy or list of sections with labels.

The VERS identifier obviously begins the "version" section, which contains the following four bytes: 0x00, 0x02, 0x00, 0x00. I suspect this is version 2.0 of the file format, because Yoda Stories came out after the Indiana Jones game, which seemed to use the same engine. However, this is not particularly important, because we are not very interested in this piece of data.

Next comes the STUP (setup?) Section, which contains a lot of cryptic data:


Obviously, there is some kind of pattern here, but without a detailed knowledge of the game, we cannot understand what it is for. The following question seems to me more important: how to skip it? Although it can be assumed that this section has a fixed length, this is actually not the case.

If we look again at the beginning of the section (previous screenshot), we will see that four suspicious bytes follow the STUP identifier: 0x00, 0x44, 0x01, 0x00. If we check the data remaining in this section after these four bytes, we will see that they have a length of exactly 0x00014400 bytes. Coincidence? I don’t think so!

Obviously, these four bytes are an unsigned 32-bit integer indicating the amount of data making up the remainder of the STUP section. If it seems to you that the bytes are written in the reverse order, then you are right: they are stored inorder from lowest to highest (little-endian) , in which the least significant byte is stored first - a common practice for x86 and x86-64 processors. If we consider the length value, then we can skip the rest of the section without knowing anything about the data stored in it.

Reading binary files manually, even with a size of only 4 MB, is not a very productive way of working, so it is time to start working with a program that parses a file and / or dumps data dumps in the process of finding out how they are encoded. I prefer the C # programming language, so I will use it; Assuming the file format is not completely confused, I can create a fairly detailed BinaryReader class and get started. Here is the program that we have prepared so far:

static void Main(string[] args)
{
	using (BinaryReader binaryReader = new BinaryReader(File.OpenRead("YODESK.DTA")))
	{
		bool keepReading = true;
		while (keepReading)
		{
			string section = new string(binaryReader.ReadChars(4));
			switch (section)
			{
				case "VERS":
					uint version = binaryReader.ReadUInt32();
					break;
				case "STUP":
					uint length = binaryReader.ReadUInt32();
					byte[] data = binaryReader.ReadBytes((int)length);
					break;
				default:
					throw new Exception("Unknown section: " + section);
			}
		}
	}
}

Unfortunately, it will work only for the first two sections: as soon as we get to the third section (SNDS), it becomes obvious that we will have to process all the options that are in the file. This often happens when reverse engineering file formats, because they have many different values ​​that have one of many types; so we need to understand every type that we can meet. Fortunately, in almost all sections of this file, the section identifier is followed by its unsigned 32-bit length, that is, we can use the STUP section processing code to skip them.

static void Main(string[] args)
{
	using (BinaryReader binaryReader = new BinaryReader(File.OpenRead("YODESK.DTA")))
	{
		bool keepReading = true;
		while (keepReading)
		{
			string section = new string(binaryReader.ReadChars(4));
			switch (section)
			{
				case "VERS":
					uint version = binaryReader.ReadUInt32();
					break;
				case "STUP":
				case "SNDS":
				case "ZONE":
				case "TILE":
				case "PUZ2":
				case "CHAR":
				case "CHWP":
				case "CAUX":
				case "TNAM":
					uint sectionLength = binaryReader.ReadUInt32();
					byte[] sectionData = binaryReader.ReadBytes((int)sectionLength);
					break;
				case "ENDF":
					keepReading = false;
					break;
			}
		}
	}
}



It seems that in the ZONE section its length is indicated immediately after the identifier (0x00010292), but if we jump onto this number of bytes, we will end up in the void, that is, most likely we are doing something wrong. However, it is obvious that immediately after the start there are other identifiers: IZON, IZAX, IZX2, IZX3, IZX4 and from zero to several IACT after them. This set of identifiers is repeated once more below, and then occurs several times in the file. Here, knowledge about how the game works is useful to us: one of the important “features” of Yoda Stories is that a random card is generated with each new passage, but after several games it becomes obvious that the cards are actually made up of smaller cards connected together size. The fact that IACT sections contain script-like text confirms the idea that


If we cannot trivially skip the ZONE section, then we will have to learn more about its structure so that we can analyze it and continue parsing the remainder of the document. I succeeded as a result, despite the fact that this part was much more sophisticated than the previous ones, because data that was not related to the length was added next to the section identifiers.

There are many sections of IZON, so it is obvious that there is some data that allows you to move from one to another; however, it is possible that the program simply scans the data until it finds the next line of “IZON”, although this is a very unreliable approach: who knows, maybe sometime someone will add “IZON” in the NPC dialog line! By measuring the length from the beginning of the first IZON to the next IZON, we get a length of 0x064C, which is suspiciously close to four bytes after four bytes following the identifier ZONE (0x00000646). If we take these four bytes as an identifier for the length and look at the 0x646 bytes following them, we will begin to notice the pattern:


4 байта: "ZONE"
2 байта: ????
2 байта: ????
4 байта: длина (X)
X байт: данные zone
2 байта: ????
4 байта: длина (X)
X байт: данные zone
2 байта: ????
4 байта: длина (X)
X байт: данные zone
...
2 байт: ????
4 байта: длина (X)
X байт: данные zone
4 байта: "PUZ2" (начало следующего раздела)


Further study revealed facts of varying degrees of utility:

  • The two bytes following the ZONE identifier correspond to an unsigned 16-bit integer that defines the number of zones.
  • Two bytes before the length determinant of each zone can only have values ​​0x0001, 0x0002, 0x0003 and 0x0005.
  • The first two bytes of these zones is some kind of card ID number starting with 0x0000 and increasing by one each time.

If we update the program to read the ZONE section based on this information, we can read the entire file.

static void Main(string[] args)
{
	using (BinaryReader binaryReader = new BinaryReader(File.OpenRead("YODESK.DTA")))
	{
		bool keepReading = true;
		while (keepReading)
		{
			string section = new string(binaryReader.ReadChars(4));
			switch (section)
			{
				case "VERS":
					uint version = binaryReader.ReadUInt32();
					break;
				case "STUP":
				case "SNDS":
				case "TILE":
				case "PUZ2":
				case "CHAR":
				case "CHWP":
				case "CAUX":
				case "TNAM":
					uint sectionLength = binaryReader.ReadUInt32();
					byte[] sectionData = binaryReader.ReadBytes((int)sectionLength);
					break;
				case "ZONE":
					ushort count = binaryReader.ReadUInt16();
					for (int i = 0; i < count; i++)
					{
						ushort unknown = binaryReader.ReadUInt16();
						uint zoneLength = binaryReader.ReadUInt32();
						byte[] zoneData = binaryReader.ReadBytes((int)zoneLength);
					}
					break;
				case "ENDF":
					keepReading = false;
					break;
				default:
					throw new Exception("Unknown section: " + section);
			}
		}
	}
}

Using this code, we can read the entire file without encountering any exceptions that would indicate that we made incorrect assumptions about the general structure of the file format. And although we have not yet extracted anything from it, we are already on the right track.

Tiles (Difficulty: Jedi Knight)


Now that we can parse the entire file structure, it's time to dive deeper and start dumping what we came here for: a strange but exhaustive set of game tiles.


If we go down to the middle of the TILE section, then it will immediately become obvious to us that there is something interesting here: bitmap images! Game images are 32x32 pixel sprites, so if similar patterns start appearing in a 32-byte column, then there is a high probability that this is an uncompressed bitmap format with one byte per pixel.


If we move to the beginning of the TILE section, it is easy to notice a 1024-byte (32x32) piece of data corresponding to the ASCII character space, presumably being sprite data with a four-byte prefix. If we examine the range of values ​​of these four bytes in the entire data array, we notice that they are relatively random, but often repeated and look like an array of bit flags. Perhaps bit flags describe the properties of tiles? What's the difference! We are close to extracting images! It remains for us to expand the functionality of our program to dump the alleged image data into image files and ...

4 байта: флаги
1024 байта: данные изображения
4 байта: флаги
1024 байта: данные изображения
...
4 байта: флаги
1024 байта: данные изображения




case "TILE":
{
	Directory.CreateDirectory(@"Tiles");
	uint tileSectionLength = binaryReader.ReadUInt32();
	for (int i = 0; i < tileSectionLength / 0x404; i++)
	{
		uint unknown = binaryReader.ReadUInt32();
		Bitmap tile = new Bitmap(32, 32);
		for (int j = 0; j < 0x400; j++)
		{
			byte pixelData = binaryReader.ReadByte();
			Color pixelColor = Color.FromArgb(pixelData, pixelData, pixelData);
			tile.SetPixel(j % 32, j / 32, pixelColor);
		}
		tile.Save(string.Format(@"Tiles\{0}.png", i));
	}
	break;
}



Victory! Or something like that ...

Obviously, the game has color, so just assigning a pixel value to each RGB component will not solve the problem; we will never get anything but gradations of gray. Wikipedia tells us that the standard 8-bit true color scheme is 0bRRRGGGBB, so let's try it:

case "TILE":
{
	Directory.CreateDirectory(@"Tiles");
	uint tileSectionLength = binaryReader.ReadUInt32();
	for (int i = 0; i < tileSectionLength / 0x404; i++)
	{
		uint unknown = binaryReader.ReadUInt32();
		Bitmap tile = new Bitmap(32, 32);
		for (int j = 0; j < 0x400; j++)
		{
			byte pixelData = binaryReader.ReadByte();
			byte r = (byte)((pixelData & 0xE0) << 0);
			byte g = (byte)((pixelData & 0x1C) << 3);
			byte b = (byte)((pixelData & 0x03) << 6);
			Color pixelColor = Color.FromArgb(r, g, b);
			tile.SetPixel(j % 32, j / 32, pixelColor);
		}
		tile.Save(string.Format(@"Tiles\{0}.png", i));
	}
	break;
}


Obviously, this is also the wrong pixel format, so stop guessing, let's begin to analyze the numbers themselves. Take a screenshot of the game and compare the values ​​corresponding to the colors in the final sprite:


These shades of blue are very interesting: they are very close, but have some different shades. If we take the pixel values ​​from the image processing program, we get the following values: I was too optimistic that there was some bit conversion from the pixel value to the color components, but the fact that 0x1F somehow corresponded to black (all zeros) indicates to my mistake. If decreasing a pixel value from 0x93 to 0x92 magically adds two bytes of information to the color, then it is time to recognize the inevitable: the color palette is stored somewhere, and we have no idea where it is.

0x1F 0x00 / 0x00 / 0x00
0x92 0x0B / 0x53 / 0xFB
0x93 0x00 / 0x00 / 0xFB
0x94 0x00 / 0x00 / 0xCB
0x95 0x00 / 0x00 / 0x9F
0x96 0x00 / 0x00 / 0x6F




Of course, we can take a bunch of screenshots and automate the process of comparing the pixel values, but there should be a more optimal way; palette data must be stored somewhere. If I programmed this system, I would probably just keep an array of colors in which the array index is the value of the pixel data. For fun, let's look in the file for one of the colors used in the R2D2 sprite, namely 0x0B53FB ... Nope

, nothing.

Another popular pixel format - BGR, maybe 0xFB530B will do?

Also empty.

However, we did not search in one more place: in the executable file of the game. Usually a lot of gameplay information is stored as constants in the code, perhaps this is the case in our case. A search for 0x0B53FB in a binary file does not return results, but if we look for the value of BGR ...


Oh my god, this is a color palette! The next four bytes have the value 0xFB000000, that is, the color 0x93 in BGRA, so there is a high probability that we came across something useful. If we look for data in a disassembler, we can understand a little more:


It’s hard to understand from this image, but this is our palette data, right where we expect to find an array of compile-time constants in the decompiled program. Knowing that this is the color 0x92, we can start work from the end and find the beginning of the palette in order to copy it to our program for exporting images with the correct colors:

private static readonly byte[] PaletteData = new byte[] 
{ 
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x8B, 0x00, 0xC3, 0xCF, 0x4B, 0x00, 
	0x8B, 0xA3, 0x1B, 0x00, 0x57, 0x77, 0x00, 0x00, 0x8B, 0xA3, 0x1B, 0x00, 0xC3, 0xCF, 0x4B, 0x00, 
	0xFB, 0xFB, 0xFB, 0x00, 0xEB, 0xE7, 0xE7, 0x00, 0xDB, 0xD3, 0xD3, 0x00, 0xCB, 0xC3, 0xC3, 0x00, 
	0xBB, 0xB3, 0xB3, 0x00, 0xAB, 0xA3, 0xA3, 0x00, 0x9B, 0x8F, 0x8F, 0x00, 0x8B, 0x7F, 0x7F, 0x00, 
	0x7B, 0x6F, 0x6F, 0x00, 0x67, 0x5B, 0x5B, 0x00, 0x57, 0x4B, 0x4B, 0x00, 0x47, 0x3B, 0x3B, 0x00, 
	0x33, 0x2B, 0x2B, 0x00, 0x23, 0x1B, 0x1B, 0x00, 0x13, 0x0F, 0x0F, 0x00, 0x00, 0x00, 0x00, 0x00, 
	0x00, 0xC7, 0x43, 0x00, 0x00, 0xB7, 0x43, 0x00, 0x00, 0xAB, 0x3F, 0x00, 0x00, 0x9F, 0x3F, 0x00, 
	0x00, 0x93, 0x3F, 0x00, 0x00, 0x87, 0x3B, 0x00, 0x00, 0x7B, 0x37, 0x00, 0x00, 0x6F, 0x33, 0x00, 
	0x00, 0x63, 0x33, 0x00, 0x00, 0x53, 0x2B, 0x00, 0x00, 0x47, 0x27, 0x00, 0x00, 0x3B, 0x23, 0x00, 
	0x00, 0x2F, 0x1B, 0x00, 0x00, 0x23, 0x13, 0x00, 0x00, 0x17, 0x0F, 0x00, 0x00, 0x0B, 0x07, 0x00, 
	0x4B, 0x7B, 0xBB, 0x00, 0x43, 0x73, 0xB3, 0x00, 0x43, 0x6B, 0xAB, 0x00, 0x3B, 0x63, 0xA3, 0x00, 
	0x3B, 0x63, 0x9B, 0x00, 0x33, 0x5B, 0x93, 0x00, 0x33, 0x5B, 0x8B, 0x00, 0x2B, 0x53, 0x83, 0x00, 
	0x2B, 0x4B, 0x73, 0x00, 0x23, 0x4B, 0x6B, 0x00, 0x23, 0x43, 0x5F, 0x00, 0x1B, 0x3B, 0x53, 0x00, 
	0x1B, 0x37, 0x47, 0x00, 0x1B, 0x33, 0x43, 0x00, 0x13, 0x2B, 0x3B, 0x00, 0x0B, 0x23, 0x2B, 0x00, 
	0xD7, 0xFF, 0xFF, 0x00, 0xBB, 0xEF, 0xEF, 0x00, 0xA3, 0xDF, 0xDF, 0x00, 0x8B, 0xCF, 0xCF, 0x00, 
	0x77, 0xC3, 0xC3, 0x00, 0x63, 0xB3, 0xB3, 0x00, 0x53, 0xA3, 0xA3, 0x00, 0x43, 0x93, 0x93, 0x00, 
	0x33, 0x87, 0x87, 0x00, 0x27, 0x77, 0x77, 0x00, 0x1B, 0x67, 0x67, 0x00, 0x13, 0x5B, 0x5B, 0x00, 
	0x0B, 0x4B, 0x4B, 0x00, 0x07, 0x3B, 0x3B, 0x00, 0x00, 0x2B, 0x2B, 0x00, 0x00, 0x1F, 0x1F, 0x00, 
	0xDB, 0xEB, 0xFB, 0x00, 0xD3, 0xE3, 0xFB, 0x00, 0xC3, 0xDB, 0xFB, 0x00, 0xBB, 0xD3, 0xFB, 0x00, 
	0xB3, 0xCB, 0xFB, 0x00, 0xA3, 0xC3, 0xFB, 0x00, 0x9B, 0xBB, 0xFB, 0x00, 0x8F, 0xB7, 0xFB, 0x00, 
	0x83, 0xB3, 0xF7, 0x00, 0x73, 0xA7, 0xFB, 0x00, 0x63, 0x9B, 0xFB, 0x00, 0x5B, 0x93, 0xF3, 0x00, 
	0x5B, 0x8B, 0xEB, 0x00, 0x53, 0x8B, 0xDB, 0x00, 0x53, 0x83, 0xD3, 0x00, 0x4B, 0x7B, 0xCB, 0x00, 
	0x9B, 0xC7, 0xFF, 0x00, 0x8F, 0xB7, 0xF7, 0x00, 0x87, 0xB3, 0xEF, 0x00, 0x7F, 0xA7, 0xF3, 0x00, 
	0x73, 0x9F, 0xEF, 0x00, 0x53, 0x83, 0xCF, 0x00, 0x3B, 0x6B, 0xB3, 0x00, 0x2F, 0x5B, 0xA3, 0x00, 
	0x23, 0x4F, 0x93, 0x00, 0x1B, 0x43, 0x83, 0x00, 0x13, 0x3B, 0x77, 0x00, 0x0B, 0x2F, 0x67, 0x00, 
	0x07, 0x27, 0x57, 0x00, 0x00, 0x1B, 0x47, 0x00, 0x00, 0x13, 0x37, 0x00, 0x00, 0x0F, 0x2B, 0x00, 
	0xFB, 0xFB, 0xE7, 0x00, 0xF3, 0xF3, 0xD3, 0x00, 0xEB, 0xE7, 0xC7, 0x00, 0xE3, 0xDF, 0xB7, 0x00, 
	0xDB, 0xD7, 0xA7, 0x00, 0xD3, 0xCF, 0x97, 0x00, 0xCB, 0xC7, 0x8B, 0x00, 0xC3, 0xBB, 0x7F, 0x00, 
	0xBB, 0xB3, 0x73, 0x00, 0xAF, 0xA7, 0x63, 0x00, 0x9B, 0x93, 0x47, 0x00, 0x87, 0x7B, 0x33, 0x00, 
	0x6F, 0x67, 0x1F, 0x00, 0x5B, 0x53, 0x0F, 0x00, 0x47, 0x43, 0x00, 0x00, 0x37, 0x33, 0x00, 0x00, 
	0xFF, 0xF7, 0xF7, 0x00, 0xEF, 0xDF, 0xDF, 0x00, 0xDF, 0xC7, 0xC7, 0x00, 0xCF, 0xB3, 0xB3, 0x00, 
	0xBF, 0x9F, 0x9F, 0x00, 0xB3, 0x8B, 0x8B, 0x00, 0xA3, 0x7B, 0x7B, 0x00, 0x93, 0x6B, 0x6B, 0x00, 
	0x83, 0x57, 0x57, 0x00, 0x73, 0x4B, 0x4B, 0x00, 0x67, 0x3B, 0x3B, 0x00, 0x57, 0x2F, 0x2F, 0x00, 
	0x47, 0x27, 0x27, 0x00, 0x37, 0x1B, 0x1B, 0x00, 0x27, 0x13, 0x13, 0x00, 0x1B, 0x0B, 0x0B, 0x00, 
	0xF7, 0xB3, 0x37, 0x00, 0xE7, 0x93, 0x07, 0x00, 0xFB, 0x53, 0x0B, 0x00, 0xFB, 0x00, 0x00, 0x00, 
	0xCB, 0x00, 0x00, 0x00, 0x9F, 0x00, 0x00, 0x00, 0x6F, 0x00, 0x00, 0x00, 0x43, 0x00, 0x00, 0x00, 
	0xBF, 0xBB, 0xFB, 0x00, 0x8F, 0x8B, 0xFB, 0x00, 0x5F, 0x5B, 0xFB, 0x00, 0x93, 0xBB, 0xFF, 0x00, 
	0x5F, 0x97, 0xF7, 0x00, 0x3B, 0x7B, 0xEF, 0x00, 0x23, 0x63, 0xC3, 0x00, 0x13, 0x53, 0xB3, 0x00, 
	0x00, 0x00, 0xFF, 0x00, 0x00, 0x00, 0xEF, 0x00, 0x00, 0x00, 0xE3, 0x00, 0x00, 0x00, 0xD3, 0x00, 
	0x00, 0x00, 0xC3, 0x00, 0x00, 0x00, 0xB7, 0x00, 0x00, 0x00, 0xA7, 0x00, 0x00, 0x00, 0x9B, 0x00, 
	0x00, 0x00, 0x8B, 0x00, 0x00, 0x00, 0x7F, 0x00, 0x00, 0x00, 0x6F, 0x00, 0x00, 0x00, 0x63, 0x00, 
	0x00, 0x00, 0x53, 0x00, 0x00, 0x00, 0x47, 0x00, 0x00, 0x00, 0x37, 0x00, 0x00, 0x00, 0x2B, 0x00, 
	0x00, 0xFF, 0xFF, 0x00, 0x00, 0xE3, 0xF7, 0x00, 0x00, 0xCF, 0xF3, 0x00, 0x00, 0xB7, 0xEF, 0x00, 
	0x00, 0xA3, 0xEB, 0x00, 0x00, 0x8B, 0xE7, 0x00, 0x00, 0x77, 0xDF, 0x00, 0x00, 0x63, 0xDB, 0x00, 
	0x00, 0x4F, 0xD7, 0x00, 0x00, 0x3F, 0xD3, 0x00, 0x00, 0x2F, 0xCF, 0x00, 0x97, 0xFF, 0xFF, 0x00, 
	0x83, 0xDF, 0xEF, 0x00, 0x73, 0xC3, 0xDF, 0x00, 0x5F, 0xA7, 0xCF, 0x00, 0x53, 0x8B, 0xC3, 0x00, 
	0x2B, 0x2B, 0x00, 0x00, 0x23, 0x23, 0x00, 0x00, 0x1B, 0x1B, 0x00, 0x00, 0x13, 0x13, 0x00, 0x00, 
	0xFF, 0x0B, 0x00, 0x00, 0xFF, 0x00, 0x4B, 0x00, 0xFF, 0x00, 0xA3, 0x00, 0xFF, 0x00, 0xFF, 0x00, 
	0x00, 0xFF, 0x00, 0x00, 0x00, 0x4B, 0x00, 0x00, 0xFF, 0xFF, 0x00, 0x00, 0xFF, 0x33, 0x2F, 0x00, 
	0x00, 0x00, 0xFF, 0x00, 0x00, 0x1F, 0x97, 0x00, 0xDF, 0x00, 0xFF, 0x00, 0x73, 0x00, 0x77, 0x00, 
	0x6B, 0x7B, 0xC3, 0x00, 0x57, 0x57, 0xAB, 0x00, 0x57, 0x47, 0x93, 0x00, 0x53, 0x37, 0x7F, 0x00, 
	0x4F, 0x27, 0x67, 0x00, 0x47, 0x1B, 0x4F, 0x00, 0x3B, 0x13, 0x3B, 0x00, 0x27, 0x77, 0x77, 0x00, 
	0x23, 0x73, 0x73, 0x00, 0x1F, 0x6F, 0x6F, 0x00, 0x1B, 0x6B, 0x6B, 0x00, 0x1B, 0x67, 0x67, 0x00, 
	0x1B, 0x6B, 0x6B, 0x00, 0x1F, 0x6F, 0x6F, 0x00, 0x23, 0x73, 0x73, 0x00, 0x27, 0x77, 0x77, 0x00, 
	0xFF, 0xFF, 0xEF, 0x00, 0xF7, 0xF7, 0xDB, 0x00, 0xF3, 0xEF, 0xCB, 0x00, 0xEF, 0xEB, 0xBB, 0x00, 
	0xF3, 0xEF, 0xCB, 0x00, 0xE7, 0x93, 0x07, 0x00, 0xE7, 0x97, 0x0F, 0x00, 0xEB, 0x9F, 0x17, 0x00, 
	0xEF, 0xA3, 0x23, 0x00, 0xF3, 0xAB, 0x2B, 0x00, 0xF7, 0xB3, 0x37, 0x00, 0xEF, 0xA7, 0x27, 0x00, 
	0xEB, 0x9F, 0x1B, 0x00, 0xE7, 0x97, 0x0F, 0x00, 0x0B, 0xCB, 0xFB, 0x00, 0x0B, 0xA3, 0xFB, 0x00, 
	0x0B, 0x73, 0xFB, 0x00, 0x0B, 0x4B, 0xFB, 0x00, 0x0B, 0x23, 0xFB, 0x00, 0x0B, 0x73, 0xFB, 0x00, 
	0x00, 0x13, 0x93, 0x00, 0x00, 0x0B, 0xD3, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0xFF, 0x00, 
};
case "TILE":
{
	Directory.CreateDirectory(@"Tiles");
	uint tileSectionLength = binaryReader.ReadUInt32();
	for (int i = 0; i < tileSectionLength / 0x404; i++)
	{
		uint unknown = binaryReader.ReadUInt32();
		Bitmap tile = new Bitmap(32, 32);
		for (int j = 0; j < 0x400; j++)
		{
			byte pixelData = binaryReader.ReadByte();
			byte r = PaletteData[pixelData * 4 + 2];
			byte g = PaletteData[pixelData * 4 + 1];
			byte b = PaletteData[pixelData * 4 + 0];
			Color pixelColor = pixelData == 0 ? Color.Transparent : Color.FromArgb(r, g, b);
			tile.SetPixel(j % 32, j / 32, pixelColor);
		}
		tile.Save(string.Format(@"Tiles\{0}.png", i));
	}
	break;
}


It worked! We have extracted more than two thousand strange but charming sprites. It is worth noting that the 0x00 color is actually transparent, and this is pretty obvious when you look at the raw pixel data.

Although this is much more boring than dumping images, it’s not so difficult to decipher the flags if we export them as part of the image file name and look at them in general. The logic and careful study of small images and the numbers corresponding to them gives us the following: It is noteworthy that bits 16 and above are used with different values ​​depending on bits 0-8, and at first a little confusing if you expect that a particular bit should have only one value.

ТИПЫ ТАЙЛОВ:
bit0 = игровой объект
bit1 = без коллизий, отрисовывается позади игрока
bit2 = с коллизиями, средний слой
bit3 = блок, который можно толкать/тянуть
bit4 = без коллизий, отрисовывается перед игроком (для высоких объектов, за которыми может проходить игрок)
bit5 = тайл мини-карты
bit6 = оружие
bit7 = предмет
bit8 = персонаж

ФЛАГИ ДЛЯ ОРУЖИЯ:
bit16 = лёгкий бластер
bit17 = тяжёлый бластер ИЛИ термический детонатор (????)
bit18 = световой меч
bit19 = сила

ФЛАГИ ДЛЯ ПРЕДМЕТОВ:
bit16 = ключ-карта
bit17 = элемент головоломки
bit18 = элемент головоломки (редкий???)
bit19 = элемент головоломки (ключевой предмет???)
bit20 = локатор
bit22 = аптечка

ФЛАГИ ДЛЯ ПЕРСОНАЖЕЙ:
bit16 = игрок
bit17 = враг
bit18 = дружественный

ФЛАГИ ДЛЯ ДРУГИХ ТАЙЛОВ:
bit16 = дверь

ФЛАГИ ДЛЯ ТАЙЛОВ МИНИ-КАРТЫ:
bit17 = конкретный тайл мини-карты (дом)
bit18 = конкретный тайл мини-карты (головоломка, нерешённая)
bit19 = конкретный тайл мини-карты (головоломка, решённая)
bit20 = конкретный тайл мини-карты (шлюз, нерешённый)
bit21 = конкретный тайл мини-карты (шлюз, решённый)
bit22 = конкретный тайл мини-карты (верхняя стена, закрытая)
bit23 = конкретный тайл мини-карты (нижняя стена, закрытая)
bit24 = конкретный тайл мини-карты (левая стена, закрытая)
bit25 = конкретный тайл мини-карты (правая стена, закрытая)
bit26 = конкретный тайл мини-карты (верхняя стена, открытая)
bit27 = конкретный тайл мини-карты (нижняя стена, открытая)
bit28 = конкретный тайл мини-карты (левая стена, открытая)
bit29 = конкретный тайл мини-карты (правая стена, открытая)
bit30 = конкретный тайл мини-карты (задача)




Cards (Difficulty: Jedi Knight)


Although we have already achieved what we were striving for, zone maps also look like an interesting object for research. If we return to how the zones are described in the hex editor, we will see that there are several subsections in the zone entries. To analyze the zones, and not skip them, you need to know more about each of them:


It is difficult to see this without loading the file into the hex editor yourself, but the lengths of the IZON subsections are rather inconsistent. The first four bytes after IZON are similar in length, but if we track them, then in fact they will throw us either too close, or beyond the IZAX subsection identifier:


Very often this process is like playing a complicated puzzle (you can believe me, I write them), but in the end it dawned on me: there is a small heading that determines, among other things, the size of the card (9x9 or 18x18), then 6 bytes per grid square, then a 16-bit integer that determines the number of 12-byte structures following it.

Other subsections are just as confusing: IZAX defines its length plus six, the same with IZX2 and IZX3; The IZX4 has a constant length, and the frequently occurring IACT directly indicates its length. Put it all together and get the following:

4 байта: "IZON"
4 байта: неизвестно
2 байта: ширина карты (W)
2 байта: высота карты (H)
1 байт: флаги карты (неизвестные значения)
5 байтов: не используются (одинаковые значения для всех карт)
1 байт: планета (0x01 = пустынная, 0x02 = снежная, 0x03 = лесная, 0x05 = болотистая)
1 байт: не используется (одинаковые значения для всех карт)
W*H*6 байт: данные карты
2 байта: объём записи информации объекта (X)
X*12 байт: данные информации объекта
4 байта: "IZAX"
2 байта: длина (X)
X-6 байт: данные IZAX
4 байта: "IZX2"
2 байта: длина (X)
X-6 байт: данные IZX2
4 байта: "IZX3"
2 байта: длина (X)
X-6 байт: данные IZX3
4 байта: "IZX4"
8 байт: данные IZX4
4 байта: "IACT"
4 байта: длина (X)
X байт: данные действий
4 байта: "IACT"
4 байта: длина (X)
X байт: данные действий
...
4 байта: "IACT"
4 байта: длина (X)
X байт: данные действий


We know that there are six bytes per square grid, that is, there is a high probability that they correspond to what is written in the square of the grid when loading the map. Just by looking at the first square of the first card (0x00, 0x00, 0xFF, 0xFF, 0x01, 0x00), I was almost sure that these are three 16-bit integers, each of which represents a tile placed in this place (because tiles are much more than 256). Having expanded the functionality of the program to connect the extracted files based on this information, we get the following:


Cards! Our guess seems to be true. It turns out that the game has more than 600 cards, and each has its own script and scheme. It's amazing how much effort has been put into this seemingly simple game.



Also interesting is the length of the variable “object information” at the end of the IZON subsection. It seems that it defines additional properties of tiles located on the map. Studying the data as one giant fragment in a hex editor doesn’t help much, but if we tabulate it as 12-byte records and compare it with the map image, some patterns begin to appear:



Тип N/A X Y N/A Арг
09 00 00 00 04 00 04 00 01 00 01 00
09 00 00 00 09 00 09 00 01 00 05 00
09 00 00 00 0F 00 09 00 01 00 09 00
0F 00 00 00 09 00 0D 00 01 00 5D 00
06 00 00 00 0A 00 03 00 01 00 AE 04
06 00 00 00 0A 00 04 00 01 00 AE 04


If we process each record as six 16-bit integers, the first turns out to be a type flag, and the third and fourth - map coordinates. The second and fifth constantly have the same values ​​(0x0000 and 0x0001), and the latter is very messy, which makes it look like an argument. When we start looking at which tiles these records point to, each record of type 0x09 indicates a door, where the argument is the ID of the card that the door leads to in the game.

Finding out what different types of records correspond to is a pretty monotonous task, but by teaching our program to pause interaction when it finds a still unencrypted record type, we can systematically examine each data type in the data file. A bit of logic and guesswork, and we get the following:

Type idDescriptionArgument
0x00trigger location
0x01spawn point
0x02place related to the Force
0x03transport to an additional cardthe card onto which the transport carries the player
0x04transport to the main cardthe card onto which the transport carries the player
0x05the object that the player gives the locator
0x06item boxitem lying inside
0x07Puzzle NPCTile ID of the corresponding character
0x08weapon boxweapons lying inside
0x09door (entrance)card the door is connected to
0x0Adoor (exit)
0x0Bnot used
0x0Cblocking
0x0Dteleport
0x0EX-Wing from the planet Dagobathe card X-Wing transfers the player to
0x0FX-Wing to the planet Dagobathe card X-Wing transfers the player to

Scripts (Difficulty: Jedi Master)


Having dealt with the format of tiles and cards, it would be logical to begin to understand the scripting language that controls the gameplay logic of specific cards. It is filled with ASCII characters, it has fragments resembling commands, so it probably won't be too hard ... right?


Unfortunately, a lot of things happen in scripts that are difficult to decipher. Unlike tiles and cards, which are visually obvious after decoding them, scripts use variables and gameplay states that are not so easy to see and check. If I were very keen on the reverse development of this game, I would suggest that you can load levels and correlate the observed gameplay behavior with the data in the scripts, but I'm not ready for it. Before finishing, I want to try to do something with what we learned ...

Mod creation


The IZON subsection for card number 95 begins at offset 0x266512 in the data file. We can skip 20 bytes to get to the map data, and then skip another 6 * (18 * 9 + 12) bytes to go to the point (12, 11). If then, using the hex editor, we reset the values ​​of the 1985 and 1986 tiles to the “average” square value of each grid and overwrite the game data file, we will get the following:


I don’t quite understand why there is a sports car in the set of game tiles, but I’m sure that Yoda would go on dates with the best girls.

ATTACHMENT: user sehugg to Hacker News reported that in fact this machine from the movie 1978 " Summer in search of the" Corvette " " , which starred Mark Hamill (who played Luke Skywalker). So Easter Egg!

Finally


Obviously, with patience, much more can be done with this data; after reverse engineering most of the game's data file, it will be surprisingly easy to recreate it from scratch (now it works in a terribly small window) or write tools for exporting, editing and repackaging the contents of the game to create mods. I don’t know why you may need this, but there is definitely such an opportunity. At a minimum, I now have access to a cool and comprehensive set of weird looking Star Wars sprites. I hope you learned something interesting about reverse engineering data file formats of old computer games!

Also popular now: