Metroid Prime 3 Russian Translation (Rus to Eng)


I have already written an article about amateur game translation, in which I tried to describe the inner workings of the process. In my opinion, that attempt was not entirely successful: given how many approaches exist, describing them all is unrealistic, and excessive generalization does not reflect reality. So I decided to describe one specific case that I find quite illustrative.

Strictly speaking, this is not about our last translation at all. The idea for this article came to me when our work was in decline, and the word "last" really did mean our final project in this field. But after applying a professional approach to the project, along with experience gained from real industry practice, I realized that I could cope with it, and firmly decided that we must finish one more translation, which people have been waiting on for more than two years. So under the current circumstances, "last" means rather "the latest at the time of writing."

If this were about a PC game, it would be too boring: PC games lack the exotic romance of reverse engineering that comes with the innards of console games. This is about a game for the Nintendo Wii.

I apologize in advance for the overly long article and its boring second half but, as the saying goes, you can't drop a word from a song.


In search of resources


First of all, I had to get at the game files in order to reverse-engineer them. Any available means would do, so I turned to Google. The first thing that came to hand was WiiScrubber, a program designed to fill the unused areas of a game disc image with zeros so that the image compresses better (or rather, because of the encryption, so that it compresses at all). It can also open such images. This program went on to play a significant role in the project, but more on that later.

After opening the image and examining its contents, I first noticed the large files with the *.pak extension. They were certainly the prime suspects for storing game resources, and one of them was prepared for dissection in a hex editor.


At the beginning of the file I spotted a primitive header made up of the following fields:

struct PakHeader
{
    uint32 unknown1; // version?
    uint32 totalHeaderSize;
    uint64 unknown2; // hash?
};


The useful data that follows starts at offset 0x40, which suggested that data is aligned to 64-byte blocks. Several identifiers clearly stood out in this data: STRG, RSHD and DATA. No fortune teller was needed: the file is obviously divided into segments, or sections. Each identifier was followed by a multiple of 64, which left two obvious options: a size or an offset.


Since the next block of data was 0x800 bytes long, and the same value followed the identifier of the first segment, it became clear that these were sizes.

struct SectionRecord
{
    char   name[4];
    uint32 size;
};

With the sections sorted out, what remained was to determine their purpose. Here the identifiers helped: the abbreviations readily expanded into the words "strings", "resource headers" and, oddly enough, "data". After skimming through all of them, and considering that the STRG section is empty in some archives, I decided to examine the RSHD section first.


Again, it was self-evident that this section is a table with information about the files stored in the archive. Identifiers such as TXTR, SCAN and STRG were visible in the data, so the files are typed, which is good. Based on intuition and the structure of the section table, I assumed that the first 4 bytes hold the number of records. To check this, I did the usual thing: I calculated the record size from the distance between the starts of adjacent identifiers (24 bytes) and multiplied it by that count (0x1CA). I rounded the result (0x2AF0) up to a multiple of 0x40 and got 0x2B00. As expected, it matched the size given in the section table.
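The arithmetic of that check can be written down as a small sketch. The function names are hypothetical and only illustrate the rounding; they are not taken from the original tools.

```cpp
#include <cstdint>

// Round value up to the next multiple of alignment (alignment must be a power of two).
constexpr std::uint32_t alignUp(std::uint32_t value, std::uint32_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

// Size of a table of `count` fixed-size records, padded to the 0x40-byte block size.
constexpr std::uint32_t paddedTableSize(std::uint32_t count, std::uint32_t recordSize)
{
    return alignUp(count * recordSize, 0x40);
}

// 0x1CA records of 24 bytes give 0x2AF0, which pads to 0x2B00.
static_assert(paddedTableSize(0x1CA, 24) == 0x2B00, "matches the size in the section table");
```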

Now it remained to work out the format of the file record itself. The type identifier was clear: it indicates the type of the file. Besides it, I expected to find at least some identifier of the file, as well as its size and its offset in the archive.


Visual analysis revealed some patterns. The first number is always 0 or 1, so it is most likely a Boolean. Given that this is an archive, it could well be a compression flag. Next came the type identifier, which I already understood. After it came 8 bytes of chaotic data that looked very much like a hash and was apparently the file's identifier. Last came two values, both multiples of 0x40, the second of which grows from record to record. The conclusion suggested itself: file size and offset. Since the offset of the first file is zero, it is clearly relative to the start of the data segment.

Jumping to the given offsets and adding the sizes to them, I confirmed my guesses. All that was left was to write it down as a structure:
 
struct FileRecord
{
    uint32 packed;
    char   type[4];
    uint8  hash[8];
    uint32 size;
    uint32 offset;
};
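
A record like this cannot simply be cast from the file on a PC, because the Wii's PowerPC CPU is big-endian. As a rough sketch, one record could be decoded field by field as below. ParsedFileRecord and the helper names are hypothetical, and the layout is taken from the structure above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Read a big-endian 32-bit value byte by byte, independent of host endianness.
inline std::uint32_t readU32BE(const std::uint8_t* p)
{
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
}

// Hypothetical in-memory form of FileRecord.
struct ParsedFileRecord
{
    bool packed;
    std::string type;                 // e.g. "TXTR"
    std::array<std::uint8_t, 8> hash;
    std::uint32_t size;
    std::uint32_t offset;             // relative to the start of the data segment
};

constexpr std::size_t kFileRecordSize = 24;

// Decode one record; p must point to at least kFileRecordSize bytes.
inline ParsedFileRecord parseFileRecord(const std::uint8_t* p)
{
    ParsedFileRecord r;
    r.packed = readU32BE(p) != 0;
    r.type.assign(reinterpret_cast<const char*>(p + 4), 4);
    for (std::size_t i = 0; i < 8; ++i)
        r.hash[i] = p[8 + i];
    r.size   = readU32BE(p + 16);
    r.offset = readU32BE(p + 20);
    return r;
}
```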


Next I turned to the file data itself. The positions and sizes of the files were known, so I quickly sketched a simple extractor and pulled out the contents of a few files as they were. After reviewing a couple of them, I saw that they were indeed compressed. I already had a guess at the algorithm: many years of this kind of work had paid off, and one glance at the hex editor window was enough. All that remained was to make sure my experience was not deceiving me, so I fed one of the files to STUNS, a small program that is good at detecting common compression algorithms. I was not mistaken: it was LZO.

Before writing a decompressor, I needed not only to identify the compression algorithm but also to work out the data structure. I took one of these files and started digging. Thanks to STUNS I knew the position of the compressed stream, but the compressed file contained other information besides it.


First of all, it was obvious that all of these files had a signature, CMPD. It was followed by a value that occurred in three variants: 1, 2 and, very rarely, 3. I took it for some type or version of the format and did not attach any importance to it at first. The next byte held a then-unknown value, and the three bytes after it held the file size minus 16. That gave me reason to believe that the first 16 bytes of the file are a header. After analyzing the values of the unknown byte, I saw that they often differed by only one or two bits and decided that these were flags. Four more bytes followed, equal to the size of the file that STUNS had produced for me. The conclusion was obvious: this is the size of the original data.

The 16 bytes ended there and, according to STUNS, the compressed stream itself began two bytes later. I was sure this was one of the classic schemes: two bytes give the size of a chunk, followed by the chunk of packed data itself, and so on until the stream ends. Indeed, the value of these two bytes matched the size of the stream.

After processing this information, I arrived at the header structure.

struct CompressedFileHeader
{
    char   signature[4];
    uint32 type;
    unsigned flags    : 8;
    unsigned lzoSize : 24;
    uint32   dataSize;
};
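
The layout of bit-fields is implementation-defined in C and C++, so when reading such a header on a PC it is safer to split the 32-bit word by hand. A minimal sketch follows. The names are hypothetical, and it assumes the word has already been converted from big-endian, so that the flags byte (first in the file) is the most significant one.

```cpp
#include <cstdint>

struct StreamInfo
{
    std::uint8_t  flags;    // top 8 bits
    std::uint32_t lzoSize;  // low 24 bits
};

// Split a 32-bit header word into an 8-bit flag field and a 24-bit size.
constexpr StreamInfo splitFlagsAndSize(std::uint32_t word)
{
    return StreamInfo{ static_cast<std::uint8_t>(word >> 24), word & 0x00FFFFFFu };
}
```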

Now it was clear how to read these files, and I reached for good old miniLZO. Thanks to its small size and my extensive use of home-made stream abstractions, implementing the LZO stream unpacking took neither much time nor many lines of code.


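while (streamSize > 0)
{
    // Each chunk is prefixed with its compressed size (16 bits).
    const uint16 chunkSize = input->readUInt16();
    streamSize -= 2;

    input->read(inBuf, chunkSize);
    streamSize -= chunkSize;

    // Unpack the chunk and append the result to the output stream.
    uint32 size = 0;
    lzo_decompress(inBuf, chunkSize, outBuf, &size);
    output->write(outBuf, size);
}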


Having promoted my primitive extractor to the rank of unpacker, I first made sure that I had correctly understood the purpose of the Boolean field in the archive's file table. It turned out that files in which it is false are indeed not compressed. Reassured, I accounted for this in the code, set the program loose on one of the large files and settled in to watch the progress with satisfaction...

But the program was not destined to finish successfully. The attempt to unpack the first TXTR file ended in a crash. It was then that I realized that the version-like field in these files was there for a reason. Opening one of the compressed textures in a hex editor, I began to look for differences.


Of course, the type here was no longer 1. The first thing that struck me was the difference in header size: it was 20 bytes larger. The header now contained two values of 12 and another 12 new bytes. I had only one guess, so I went looking for an example of an uncompressed texture. Having found one, I looked at its beginning and saw that it was very similar to those 12 bytes that had been added to the header along with the rest.

Everything became clear: a certain amount of information can be read from compressed files of this type without unpacking them, because part of the data is simply stored in its original form right before the stream. In the case of textures, the header was left unpacked, so the width, height and type of a texture could be found without extra computation. Very resourceful, it must be said. I was puzzled, though, that the size of this data was specified twice. Having combed through all the available compressed textures, I saw no other variant and concluded that it does not matter, but for myself I assumed that the second number is either the size of the data taking 4-byte alignment into account, or the position in the output stream from which unpacking should start.

I started the unpacker again, and this time it coped, but because of the large number of files it took quite a long time. Then I added an option for selective unpacking by file type and began to extract all the files. I was expecting one more surprise because of the remaining type of compressed file, but it turned out that none of the selected files was packed that way, so I decided not to waste time on unnecessary research. There was plenty of work ahead as it was.


enum CompressedFileType
{
    typeCommon  = 1,
    typeTexture = 2,
    typeUnknown = 3
};

struct StreamHeader
{
    unsigned flags    : 8;
    unsigned lzo_size : 24;
    uint32   file_size;
};

// Pseudocode: some fields are present only for textures.
struct CompressedFileHeader
{
    char   signature[4];
    uint32 type;
    if (type == typeTexture)
    {
        uint32 uncompressedChunkSize;
        uint32 uncompressedChunkSize2; // always equals uncompressedChunkSize
    }
    StreamHeader streamHeader;
    if (type == typeTexture)
    {
        uint8 uncompressedData[uncompressedChunkSize];
    }
};


That left only the STRG section to deal with. It began with the value 0x4C, followed by a repeating pattern: a null-terminated string, a type identifier and a hash that is present in the file table. Of course, when I counted the repetitions, I got 0x4C. Naturally, these were the names of some files, perhaps of those that are requested by name rather than by hash.
 
struct NameRecord
{
    char  name[]; // C-string
    char  type[4];
    uint8 hash[8];
};

In effect, this whole section is a serialized mapping of hashes (or [hash, type] pairs) to file names.

I now had access to all the resources I needed, and it was time to explore them.
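
Because the name is a C string, the records have variable length and have to be walked one by one. The sketch below shows one way to do that. It assumes the count is a big-endian 32-bit value and that records follow each other without padding; neither detail is stated above, and all names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory form of NameRecord.
struct NameEntry
{
    std::string name;
    std::string type;                // four characters, e.g. "STRG"
    std::vector<std::uint8_t> hash;  // 8 bytes
};

// Parse the name section: a record count followed by {C string, type[4], hash[8]} records.
inline std::vector<NameEntry> parseNames(const std::vector<std::uint8_t>& data)
{
    std::vector<NameEntry> out;
    if (data.size() < 4)
        return out;
    const std::uint32_t count = (std::uint32_t(data[0]) << 24) | (std::uint32_t(data[1]) << 16) |
                                (std::uint32_t(data[2]) << 8)  |  std::uint32_t(data[3]);
    std::size_t pos = 4;
    for (std::uint32_t i = 0; i < count; ++i)
    {
        std::size_t end = pos;
        while (end < data.size() && data[end] != 0)
            ++end;                               // find the terminating zero
        if (end + 13 > data.size())
            break;                               // truncated record: stop
        NameEntry e;
        e.name.assign(data.begin() + pos, data.begin() + end);
        e.type.assign(data.begin() + end + 1, data.begin() + end + 5);
        e.hash.assign(data.begin() + end + 5, data.begin() + end + 13);
        out.push_back(e);
        pos = end + 13;
    }
    return out;
}
```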




In pursuit of the text


The first priority was to extract the text so that Dangaard could start translating it while I continued with the technical side. Even before I had figured out the archives, I had tried to find signs of text in the TXTR files but, as it turned out, TXTR does not mean "text resource"; it comes from the word "texture". The text, of course, is stored in files of type STRG.

I had seen plenty of text serialization techniques, so when I opened one of these files in a hex editor I was ready for anything. And not in vain, as it turned out, because the case proved non-trivial.


My eye immediately caught the string "ENGLJAPNGERMFRENSPANITAL", a list of the six languages supported by the game. The text in the second half of the file showed that the file really does contain strings in all of these languages. Everything before the language identifiers looked very much like a header, and the picture was the same in other files, so I settled on this assumption: the first 24 bytes are a header.

Then I started matching the facts known to me against the values in this header. For example, it was obvious that the file contains two strings in each of six languages, so the meaning of the values 6 and 2 raised no questions. The first four bytes were equally obvious: a signature is a common thing. The next 4 bytes looked like a harmless format version, since they had no connection with any other data in the file, but after the experience with compressed files they still aroused my suspicion. Just in case, I looked through a couple of dozen files, saw the same value of 3 in all of them and left it at that. The header ended with two zero values, which I wrote off as reserved fields, quite common in binary formats.

 

struct Header
{
    uint32 signature;
    uint32 version;
    uint32 langCount;
    uint32 stringCount;
    uint32 reserved1;
    uint32 reserved2;
};


The list of language identifiers was clear. Logically, it should be followed by a table of string offsets and then by the strings themselves. Well, this needed to be checked. The file has six languages and two strings, yet the table contains not 12 but 18 values, so each language has one additional value besides the offsets of its two strings.

The cyclically ascending values show that the offsets for each language are grouped together rather than, say, interleaved with the offsets of other languages. Each group is preceded by an unexplained value, followed by the string offsets relative to the start of the first string. The strings are stored much as in Delphi: first the length of the string, then its data, which is also null-terminated. Returning to the unknown value, I could not resist checking one guess and summed the lengths of all the strings of the first language. Sure enough, it was the total length of that language's strings! Now everything was clear.

 

struct OffsetTable
{
    uint32 totalStringsLength;
    uint32 stringOffsets[header.stringCount];
};

 

It was time to write a deserializer. From the text in the European languages it was obvious that UTF-8 was used. At that time I had only recently moved from Delphi to C++ and, because I had some existing Delphi code and knew the VCL well but the STL only superficially, I often sinned by writing part of the toolkit in Delphi. Since I had an old, proven Delphi library for serializing text into a readable and easily editable format, I decided to use it here as well. To my shame, although I tried to use OOP and other benefits of civilized coding, I often lapsed into spaghetti code, so while writing this article I had trouble remembering some of the details.

In short, I studied the part of the WinAPI documentation needed for decoding UTF-8, wrote the deserializer and, after a short debugging session, tested it on the dissected file. Fine, time to extract the game script! I extracted all the STRG files from the archives and set the deserializer on them... Of course, I was expecting a catch, and there was one. I had missed something and caught another crash.

I opened the problem file and saw that I had been a little hasty in labelling some header fields as reserved.


choice0, choice1... Why, these are string IDs! This means that some strings can be assigned identifiers like these so that they can be accessed not by index but by identifier. The first "reserved" field turned out to be the number of these identifiers, and it was followed by the size of the section that holds them.

 
struct Header
{
    uint32 signature;
    uint32 version;
    uint32 langCount;
    uint32 stringCount;
    uint32 idCount;
    uint32 idsSectionSize;
};


At the beginning of the section was a table, which turned out to be a simple array of identifier offsets paired with the indices of the strings they refer to. The table was followed by the identifiers themselves, as null-terminated strings.
 
struct IdRecord
{
    uint32 offset;
    uint32 index;
};


It was time to update the deserializer with the new data and start the process again. This time the problem files were swallowed without trouble and progress continued. I was already preparing to celebrate, but my satisfaction was cut short by yet another crash... Well, nothing new there. I repeated the standard procedure, opening the problem file in the hex editor.


So I had not been wrong about the header field holding the format version after all. I just had not expected the game to actually use other versions. I had to take this version apart as well; fortunately, it was fairly easy. The first 16 bytes of the header were identical to the format examined earlier. They were immediately followed by information about the languages, but in a slightly different form.

struct LangSectionRecord
{
    char   langId[4];
    uint32 offset;
    uint32 size;
};

After that, however, came exactly those last 8 bytes of the header and the identifier section in the same format as before. So in the other format, too, these 8 bytes were not really part of the header. Last came the sections with language data. Each section begins with a table of string offsets. This time the strings were stored in Unicode and were simply null-terminated, with no length specified.

struct LangData
{
    uint32 offsetTable[header.langCount];
    uint32 size;
};

 

I added support for this format as well and relaunched the deserialization... for the last time. Now everything went smoothly, and the extracted game script was in my hands.


Fonts


The texts were sent to Dangaard for review, and it was the fonts' turn. I had already noticed a couple of files of type FONT and had no doubt that they held fonts.


Fonts typically consist of two components: a raster and information on how to display it. But looking at the file, I saw that it contained no raster. So the raster had to be stored separately, for example in a texture, and I needed to look for some reference to it. I started by visually analyzing the file's contents. Near the beginning, the font name, "Deface", is easy to see, with a hash following it. I searched for a file with the same hash and indeed came across a texture. Splendid: the raster was now at hand, but first I had to deal with this file. The header was tricky, so I left it for later. A little further on, after the font name and texture hash seen earlier, there was data with a visible periodicity. Looking closely, I adjusted the width of the hex editor window to match and saw that it was an array of structures.





Judging by the first field, which undoubtedly held character codes, this was information about glyphs. As is classic, the array was preceded by its size. I began looking for characteristic elements in these structures, and the first thing to note was four clearly distinguishable float values in the standard IEEE 754 format. They are recognizable by their structure: the bytes 3D, 3E, 3F and similar values recurring at 4-byte intervals. Such a concentration of floating-point numbers is hard to miss. The values ranged from 0 to 1, so these were evidently texture coordinates. I could not determine the meaning of the other values without seeing the texture, so I set about decoding the raster. Putting off the analysis of the TXTR structure, I decided simply to locate the raster and work with it. In such cases I always first check whether the raster is an ordinary array of colors, or of indices in the case of indexed-color images. Having opened the file in a tile editor and adjusted the colors, I saw an interesting picture.




The image was four-bit, but there were obviously two pictures in it. Since the chosen width and height matched the values found at the beginning of the file, this was roughly how the texture was supposed to look, and the conclusion was not far to seek: the texture cunningly stored two layers. That is, instead of one four-bit value, each pixel stored two different colors, two bits each. The layers themselves were accordingly four-color. Each layer was a texture atlas: it held many glyphs, and the glyph records gave the UV coordinates of the corresponding characters.

I had long had the skeleton of a font editor in Delphi lying around, inherited back in 2006 from a "colleague". I quickly sketched a raster decoder and began trying to load the font, interpreting the fields of the glyph structures in different ways. After a few days of experiments I had some understanding and even loaded a font quite successfully.
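
The layer trick can be illustrated with a short sketch. It assumes that the pixels have already been unpacked to one 4-bit value per byte, and that the first layer lives in the low two bits and the second in the high two bits; the actual bit assignment is not stated above, and the names are hypothetical.

```cpp
#include <cstdint>
#include <vector>

struct Layers
{
    std::vector<std::uint8_t> layer0;  // values 0..3
    std::vector<std::uint8_t> layer1;  // values 0..3
};

// Split 4-bit pixel values (one per byte) into two independent 2-bit layers.
inline Layers splitLayers(const std::vector<std::uint8_t>& pixels4bpp)
{
    Layers out;
    out.layer0.reserve(pixels4bpp.size());
    out.layer1.reserve(pixels4bpp.size());
    for (std::uint8_t p : pixels4bpp)
    {
        out.layer0.push_back(p & 0x3);         // assumed: low two bits
        out.layer1.push_back((p >> 2) & 0x3);  // assumed: high two bits
    }
    return out;
}
```

The four-layer variant with binary layers would follow the same pattern with one bit per layer.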






As I converted the fonts one by one, I came across another type of texture. This one stored as many as four layers instead of two, but on the same principle. The difference was that the layers were not four-color but binary. Overall the picture gradually cleared up, and I learned almost everything about the glyph record.




struct Rect
{
    float left;
    float top;
    float right;
    float bottom;
};

struct CharRecord
{
    wchar_t code;
    Rect    glyphRect;
    uint8   layerIndex;
    uint8   leftIdent;
    uint8   ident;
    uint8   rightIdent;
    uint8   glyphWidth;
    uint8   glyphHeight;
    uint8   baselineOffset;
    uint16  unknown;
};



With this knowledge I returned to the unjustly forgotten header. Comparing it against the facts known to me, I could not work out the purpose of most of its fields, but it did not matter: the font could be edited without them.
 


struct Header
{
    uint32 maxCharWidth;
    uint32 fontHeight;
    uint32 unknown1;
    uint32 unknown2;
    uint32 unknown3;
    uint32 unknown4;
    uint16 unknown5;
    uint32 unknown6;
    char   fontName[];
    uint64 textureHash;
    uint32 textureColorCount;
    uint32 unknown7;
};

It seemed to be time to work on saving the font, but then I noticed a strange block of data at the end of the file that I had not yet accounted for.


Some pairs of characters were mapped to a value, often 0, 1 or -1... Connoisseurs of typography have probably already guessed what this was, but I did not understand right away and even raised the question on a specialized forum. Soon, though, it began to dawn on me, and a bit of googling led me to the term I needed... Yes, it was kerning, in pixels.
 


struct KerningRecord
{
    wchar_t first;
    wchar_t second;
    int32   kerning;
};

All of these records were sorted in ascending order of character codes. Now the purpose of the last field in the glyph record became known: it was the index of the first kerning record for that character.

It was now possible to implement font saving. I did not want to take risks by increasing the size of the texture atlases, so the main problem was to place all the glyphs in a limited area. In the kerning thread I asked whether anyone knew a simple algorithm for this, and in reply I got a link to an article on gamedev.ru about packing lightmaps. It was just what I needed, and once the algorithm was implemented, fonts could be saved. But after modifying a font it turned out that even with this algorithm it was not so easy to fit all the characters into the available area: some glyphs just did not fit. A solution was found once I bolted on an optimization that used the same part of the texture for characters with identical glyphs. After some tinkering, I removed the insignificant differences between certain characters, such as look-alike Cyrillic and Latin letters, so that they shared glyphs, and the fonts began to save successfully. So another format fell to my hands, and I could get to grips with studying the textures.
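
As an illustration of how a kerning table of this shape can be queried, here is a minimal sketch. It relies on the records being sorted by character code and on the glyph record holding the index of its first kerning record. The names are hypothetical, char16_t stands in for the 16-bit character code, and returning 0 for a missing pair is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Kerning
{
    char16_t     first;
    char16_t     second;
    std::int32_t kerning;  // in pixels
};

// Look up the kerning for (first, second), starting at the index stored in
// the glyph record of `first`. Returns 0 when the pair is not listed.
inline std::int32_t findKerning(const std::vector<Kerning>& table, std::size_t firstIndex,
                                char16_t first, char16_t second)
{
    for (std::size_t i = firstIndex; i < table.size() && table[i].first == first; ++i)
    {
        if (table[i].second == second)
            return table[i].kerning;
    }
    return 0;
}
```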

 


Relationship with textures



The game does not contain many textures that need translating, but at the time I did not know that and worked with the worst-case scenario in mind. After a little digging in the extracted TXTR files, I realized that font textures are a degenerate case and that most textures do not use indexed color.


Parsing the 12-byte header was not difficult. The first value is a texture type of some kind, since it was different for the indexed-color textures. It was followed by the values 640 and 480, very familiar numbers, aren't they? So I had no doubt that this was the resolution, but the purpose of the last value was not so obvious. Still, I could have sworn it was the number of mipmap levels, which are so common in such cases.
 
struct TextureHeader
{
    uint32 type;
    uint16 width;
    uint16 height;
    uint32 mipmapCount;
};

The header had been studied, but the main problem was to determine the format of the raster. Fortunately, the structure looked very familiar: it was definitely an algorithm from the S3TC family, known to many from the DDS container and the DXTn algorithms. These formats are designed for keeping compressed textures in video memory and are supported in hardware by both the Wii and the GameCube. So, despite the association with DirectX, there was nothing surprising here. Identifying the specific algorithm was simple: all of these formats have a strictly fixed compression ratio, and with a ratio of uncompressed raster size to compressed size of 8:1 (or 6:1 without the alpha channel), it could only be DXT1. It was time to test the guess, so I created a DDS image with the same format and resolution and replaced its raster with the one borrowed from the texture. But what I got on opening the resulting file was not quite a meaningful image...
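
The ratio argument is easy to reproduce. DXT1 stores every 4x4 block of pixels in 8 bytes, so for a 640x480 texture the numbers work out as in the sketch below (the function names are hypothetical):

```cpp
#include <cstdint>

// Size of a DXT1-compressed image: every 4x4 pixel block takes 8 bytes.
constexpr std::uint32_t dxt1Size(std::uint32_t width, std::uint32_t height)
{
    return ((width + 3) / 4) * ((height + 3) / 4) * 8;
}

// Size of the same image as an uncompressed raster.
constexpr std::uint32_t rawSize(std::uint32_t width, std::uint32_t height,
                                std::uint32_t bytesPerPixel)
{
    return width * height * bytesPerPixel;
}

static_assert(dxt1Size(640, 480) == 153600, "19200 blocks of 8 bytes");
static_assert(rawSize(640, 480, 4) / dxt1Size(640, 480) == 8, "8:1 against 32-bit RGBA");
static_assert(rawSize(640, 480, 3) / dxt1Size(640, 480) == 6, "6:1 against 24-bit RGB");
```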

 

It was time to put my brain to work. Since the output remotely resembled a picture, it was obviously still DXT1, but modified. Two factors came to mind that could affect the format: byte order and a tiled texture layout. To test this I needed control over most of the decoding procedure, so I started looking for a library. Luckily, I came across a wonderful open-source library, NVIDIA Texture Tools. Not without problems, I managed to adapt its code to my needs and began experimenting. The first guess turned out to be correct, although it was not bytes that were in a different order but whole words, as well as the bits within them. After some permutations, the image took on a more reasonable form.

 

I realized that I had not been wrong about the tiled structure either. But given that DXT1 itself encodes 4x4 blocks, I was only partly right: tile swizzling was used here, that is, the tiles were rearranged according to a certain matrix. And, of course, the image was flipped. I was too lazy to strain my brain and reconstruct the matrix, so I took the long-trodden path of trying to find a formula for translating the coordinates. Frankly, I do not know how I managed it; I acted on intuition, guided by regularities in this matrix that my subconscious could sense without me, and in the end it worked. And then, finally, I got the original image.


Writing the encoding procedure was not a problem: copy-pasting the decoding function and, in effect, swapping r-values for l-values did the trick. With that, all the basic formats had been taken apart and I could breathe a little more freely.
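
The exact formula found at this step is not given in the text. Purely as an illustration of what such a coordinate translation can look like, the sketch below assumes the block order commonly documented for the GameCube/Wii compressed texture format: the image is split into 8x8-pixel tiles stored row by row, and each tile holds four 4x4 blocks in the order top-left, top-right, bottom-left, bottom-right. The names are hypothetical, and the flip mentioned above is not handled.

```cpp
#include <cstdint>

// Position of a 4x4 block inside the image, in units of blocks.
struct BlockPos
{
    std::uint32_t x;
    std::uint32_t y;
};

// Map the index of an 8-byte block in the file to its position in the image.
// Assumes the image width is a multiple of 8 pixels.
constexpr BlockPos blockPosition(std::uint32_t index, std::uint32_t widthPixels)
{
    return BlockPos{
        ((index / 4) % (widthPixels / 8)) * 2 + (index & 1),        // tile column, then left/right
        ((index / 4) / (widthPixels / 8)) * 2 + ((index >> 1) & 1)  // tile row, then top/bottom
    };
}
```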




The first sample


I could not wait to test something in action, so I began to think about how to replace resources in the game. The first thing to consider, of course, was repacking the archives. It would be foolish to waste CPU time recompressing unmodified files, so I wrote a rebuild function that packs only the data that needs it. The gist was as follows: the archive is opened and all of its files are iterated over. For each one, a modified version is looked up in a list of directories given as input. If one exists, it is packed and written to the output file; otherwise the original compressed file is copied unchanged.

Incidentally, while implementing this I noticed an interesting thing: not only are the same files often present in several archives, they are frequently present more than once within a single archive! I was not surprised, though, because this is easily explained: it reduces the number of head movements over the surface of the disc. When looking for a file to read, the game picks the copy closest in the direction of head travel, which wears the drive out less and speeds up loading. Other games, for example, simply pack all the resources of a level into a separate file and use only that file within the level. This does lead to large amounts of duplicated data, but it would be wrong to call it redundant, because it is more than justified.

When the functionality was ready, I decided to try my luck by feeding the game an archive with changed data. Taking the previously mentioned WiiScrubber, I rebuilt one of the smallest archives and tried to replace it. But I was disappointed: the size of the original file was exceeded. The last thing I wanted was to deal with rebuilding the disc image, so I started to look for workarounds. And I found one: the solution turned out to be similar to what I had used for the fonts. Since I was not interested in the data for other languages, I added to the rebuild function the ability to map some files onto others.





That is, if a file being processed was flagged as mapped, its contents were not written, and only its entry in the file table was affected: the offset and size were changed to those of the target file. In this way, for example, I mapped all the language variants of the title screen image onto a single one, the English one, which was destined to become bilingual. I rebuilt the archive again, and this time it fit. So I made the first test run of the game with modified resources... I cannot say that everything worked right away and without freezes: first I had used the wrong byte order, then I had completely forgotten about splitting the data into chunks when packing with LZO, and there were problems with the fonts and the text too. But in the end everything was sorted out and the game finally swallowed the rebuilt archive with no visible problems. Hooray, it worked: I saw Russian text. But this was not the end of the road; there was much more to do.
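
The mapping idea can be sketched as a pass over the file table. This is only an illustration under simplifying assumptions: entries are identified by a hypothetical string id instead of the type and hash pair, the table is assumed to already hold the final offsets of the files that are actually written, and the names are invented for the example.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Simplified file table entry (hypothetical).
struct Entry
{
    std::string   id;
    std::uint32_t offset;
    std::uint32_t size;
};

// For every entry listed in `mapping`, point it at the data of its target entry
// instead of its own data. Entries with an unknown target are left untouched.
inline void applyMappings(std::vector<Entry>& table,
                          const std::map<std::string, std::string>& mapping)
{
    std::map<std::string, Entry> original;  // snapshot taken before any entry is redirected
    for (const Entry& e : table)
        original[e.id] = e;

    for (Entry& e : table)
    {
        const auto m = mapping.find(e.id);
        if (m == mapping.end())
            continue;                       // not mapped: keeps its own data
        const auto target = original.find(m->second);
        if (target == original.end())
            continue;                       // unknown target: leave as is
        e.offset = target->second.offset;
        e.size   = target->second.size;
    }
}
```

Several table entries then share one copy of the data, which is what frees up space in the rebuilt archive.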






Organization of the process


It was important to set up the translation process and create at least some semblance of a civilized approach. At the time I knew version control systems only by hearsay, so we decided to use Dropbox, creating a shared folder for the project. Meanwhile, Dangaard finished the first version of the glossary, and it was time to think about automating the insertion of data into the game. The main problem was replacing files in the disc image. Using a script that calls external programs was out of the question: I knew of no command-line programs for this, and aesthetically I would not have accepted such an approach for this particular operation. Besides, from the very beginning I had planned to create an executable patch for the image, so a programmatic implementation would be needed in any case. And if so, I had to look for open-source code as close as possible to the required functionality. At first my thoughts fell on Dolphin, a wonderful and the only existing emulator of the GameCube and Wii. Of course it knows how to work with images, so I downloaded its source and began to study it. But the architecture was rather complex, and I decided to look for alternatives. Suddenly I had an idea and typed "WiiScrubber sources" into Google... Bingo!



They do exist! 

I must say, this was the beginning of one of the most difficult stages in the history of the project. Had I known that, I would rather have sat down and written it all myself. There is nothing worse than adapting very, very narrowly specialized code to your needs. That is when I fully felt in practice how important it is to follow patterns and keep a culture of clean code. I literally had to cut the functionality out of the graphical user interface surgically. The code was written with MFC, and the logic was built into the GUI, which in addition stored some of the data: for example, file offsets were parsed from the text of the TreeCtrl nodes. I had to substitute fake control classes just to make it all somehow compile and work. Combined with code that was not very clean to begin with, this turned into a bloody mess of crutches, dirty hacks and antipatterns, which sooner or later would have to be cleaned up. Nevertheless, the goal was achieved. The code was bleeding, but it served its purpose: I was able to replace files in the game programmatically.

Now I had to write an internal tool for quickly inserting data into the game. Since I would not be the only one using it, it needed a simple and intuitive graphical interface. I did not really want to deal with all kinds of frameworks or climb into the jungle of MFC or WinAPI, so I decided to reuse my knowledge of the VCL. I took Borland C++ Builder 6.0 and began to write. I must say, I got few positive emotions from this IDE along the way. I do not know what was to blame, but without a complete rebuild of the project the program would not even start. On top of that, constant "schrödinbugs" made it impossible to localize errors, so I kept having to review all the horror I had created earlier. I also had to rework the code so that Borland could understand it.

Having dealt with the compilation problems, I went back to functionality. To make file replacement as fast and as free of redundancy as possible, I needed a way to select the specific files to replace. From the contents of the archives I worked out which of them were used in which locations, and created a list of files with the corresponding marks. Now checking the text in the game menu no longer required a complete rebuild of everything. Next, I sketched a script that serialized resources into the game formats and put them into a folder, which was then given as input to the newly written program. Now the resources in the game could be replaced at any time without excessive effort. One way or another, the "assembly" of the translation was automated.

Now I had to make life easier for the translator. I noticed that many files with different names (or rather, hashes) had the same content, that is, the text was duplicated very often. So in the output text files I "collapsed" all such duplicates into one, giving them composite names. During serialization such a file simply produced a bunch of files instead of one. Then I noticed the abundance of technical data in the text. It was simply unreadable to the human eye because of the huge number of tags:



&push;&main-color=#E67E00FF;Energy Cell ID:&pop;
SN-3871S-7

&push;&main-color=#E67E00FF;Status:&pop;
&if=hasitem:Fuse7Used;Used&else;&if=hasitem:Fuse7;Acquired&else;Unknown&endif;&endif;&if=scan:SCAN,0x6479E69556A56AC8;
&if=(hasitem:Fuse7Used)|(hasitem:Fuse7);
&push;&main-color=#E67E00FF;Previous Coordinates:&pop;&else;
&push;&main-color=#E67E00FF;Coordinates:&pop;&endif;
&if=mapped:PirateCommand;04P-MET, Pirate Homeworld&else;04P-MET, Unknown&endif;

&just=left;Data indicates Energy Cell is connected to &push;&main-color=#FF6705B3;processing&pop; containment core.&endif;


Fortunately, I had solved such problems before, and I sketched an XML file with a syntax description for Notepad++. It was very helpful that Dangaard did the translation in it, because Word and other word processors, with their AutoCorrect and the like, can cause a lot of problems. After these manipulations the text became very easy to read, and the problem was solved.
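What a syntax description has to recognize can be illustrated with a small tokenizer. This is only a sketch, not the Notepad++ definition itself, and it assumes (from the sample above alone) that every tag starts with `&` and ends at the next `;`. The names `Token`, `tokenize` and `visibleText` are made up for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Token {
    bool        isTag;  // true for "&...;" markup, false for visible text
    std::string text;
};

// Splits a line into markup tags and runs of plain text.
std::vector<Token> tokenize(const std::string& line)
{
    std::vector<Token> tokens;
    std::size_t pos = 0;
    while (pos < line.size()) {
        if (line[pos] == '&') {
            const std::size_t end = line.find(';', pos);
            if (end != std::string::npos) {
                tokens.push_back({true, line.substr(pos, end - pos + 1)});
                pos = end + 1;
                continue;
            }
        }
        // Plain text runs until the next '&' or the end of the line.
        // An '&' with no closing ';' is kept as ordinary text.
        std::size_t next = line.find('&', pos + 1);
        if (next == std::string::npos) next = line.size();
        tokens.push_back({false, line.substr(pos, next - pos)});
        pos = next;
    }
    return tokens;
}

// Concatenates only the visible text of a line.
std::string visibleText(const std::string& line)
{
    std::string out;
    for (const Token& t : tokenize(line))
        if (!t.isTag) out += t.text;
    return out;
}
```

Note that `visibleText` does not evaluate the `&if=...;` conditions, so for conditional text it returns all branches one after another. It only separates markup from words, which is roughly what highlighting does for the eye.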


Later I had to fuss with the text a little more. Dangaard came to me with a request: it was incredibly tedious to maintain the glossary of names and locations and apply it everywhere in the translation by hand, so I was asked to automate this. I sketched a program that read the glossary file, parsed the text files, and replaced every line that fully matched a glossary entry with its translated version. The translation went at full speed, and Dangaard took over the bulk of the workload, translating large volumes of technical text.
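The glossary pass described above can be sketched like this. It is a minimal illustration, assuming the glossary is already loaded into a map and the text is already split into lines; the function name `applyGlossary` and the file handling around it are not taken from the actual tool.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Replaces every line that matches a glossary key exactly with its
// translation. Lines that merely contain a key are left untouched.
// Returns the number of replaced lines.
std::size_t applyGlossary(std::vector<std::string>& lines,
                          const std::map<std::string, std::string>& glossary)
{
    std::size_t replaced = 0;
    for (std::string& line : lines) {
        const auto it = glossary.find(line);
        if (it != glossary.end()) {
            line = it->second;
            ++replaced;
        }
    }
    return replaced;
}
```

Matching only whole lines keeps the pass safe: a name inside a sentence may need a different grammatical form in Russian, so it is left for the translator.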






Imaginary finish


Time went on, and progress did not stand still either. At some point Dangaard reported that he had finished translating the logbook, the most complex and voluminous part of the text: it contains descriptions of almost all game objects and an extensive collection of mini-articles that examine them in more detail. This undoubtedly tipped the progress toward completion of the translation. Around this time I started looking for ways to create an executable patch. The main problem was finding a way to develop a trouble-free GUI application while keeping the executable small, self-contained and decent-looking. Along the way I met a lot of criticism: many believed that nowadays one should not pay so much attention to file size, and that it would be better simply to distribute the patched image via torrents. But for me it was a matter of principle. The fact that the patch's user interface would take up more space than the patch data itself painfully offended my sense of aesthetics, so it was a challenge. The legal side also played an important role: I did not want to violate anyone's copyright by distributing an illegal copy of the game. For the same reason I even bought a licensed disc.

A little more time passed, and a not-so-distant end of the work came into view. Pretty tired of the project, we decided to announce the recruitment of testers in advance. We had found a problem: our translated text did not always fit into the screen space allotted to it, so we decided that the sooner we began to find and fix such places, the better for everyone. The more time passed, the less enthusiasm we had left. It was frustrating that only one person responded to the call. Nothing discourages like knowing that your work is not really in demand, and the work continued with less pleasure.

The situation was made worse by the need to refactor the problematic code for replacing files in the game image: it was good enough for internal use, but my conscience would not let me release something like that. Another restraining factor was the need to create a patcher that met the requirements. Nobody else responded, and under the weight of circumstances we gave up and everything stalled, as so often happens. There were many attempts to get the translation going again, but the enthusiasm never lasted long. Once, dreaming of voice acting, I even worked out the format in which speech audio was stored: it was one of the container versions of the FMOD library. However, I never got it fully working, not because of the complexity but because there was no need: I knew from the start that good voice acting cannot be done on a shoestring.












Second wind


Since then, two years have passed. During this time I got a job, graduated and gained a lot of experience in my specialty. Unfortunately, the carefree days were gone, and there was too little time and energy left for hobbies after work. But all this time my conscience was tormented by the thought that we had unfinished projects once promised to the community. The thought "I must" lived in my head, but as soon as it came to action, it became clear that the thought alone was not enough. Dangaard kept translating something little by little and poking me with a stick, which also made me worry. At some point I looked back at our achievements and decided that something had to change. And I began to change everything, trying to bring it in line with the requirements of any adequate business process.

First of all, the reorganization touched our way of synchronizing data. Dropbox was no good for this purpose at all, and I looked toward version control systems. Whatever people say, I like Subversion, so I created an SVN repository on one of the hosting sites and began to move the data there. A problem surfaced immediately. The text was stored as "Unicode" rather than UTF-8, because Notepad++ did not always handle UTF-8 properly, which periodically corrupted some characters. Perhaps because of the BOM, SVN recognized the files as binary, so I had to set the text content type manually. After that the text was accepted without a problem, but more problems kept arising now and then: the diff utility did not understand Unicode, and when conflicts arose, the conflict markers were written into the files as ANSI text. But it was too late to change anything, and such problems did not come up often, so everything was left as it was. Once the migration to SVN was complete, I was surprised to find that I had enjoyed the process.

Working with ordered data at a higher level of organization turned out to be pure pleasure compared to the previous experience, so I decided not to stop and turned my attention to the tools. I decided to rewrite everything that could not be refactored. During a year and a half of industrial practice I had grown very fond of Qt, so I decided to use it here. All the necessary projects were created, and the work got going. At first I even covered everything I rewrote with tests, but then I got tired of it and decided to focus on the correctness of the output rather than on the correct operation of the code. All the tools were rewritten except for the font editor and the text serializer: they worked, and worked correctly, and there had been no need yet to use their functionality anywhere else. Gritting my teeth, I also had to sort out the bloody mess I had created to make replacing files in the image possible. Within a couple of commits this code took on at least some decent shape, and I could afford to use it for the patcher. The data inserter was also completely rewritten using Qt instead of the VCL.

 

I decided to use Qt for the patcher too, and killed a lot of time trying to build it in a static configuration with the smallest possible libraries. The Qt build system does support «fine tuning», but in practice either my hands were crooked or the feature simply did not work. A thought began to visit me: could I not simply cut the code and data I did not need out of the libraries? So I asked the question here, on Habr. My hopes were justified: MikhailEdoshin saved me by explaining that MSVC has flags that compile the code in a special way and tell the linker to cut out unneeded data. After testing this method I saw that it worked fine. I created the patcher project and started developing it. Two years earlier I had wanted it to look like something resembling a hacker's crack, but now I wanted a solid installer that shielded the authors from potential legal problems as much as possible. So I took QWizard and started building the patcher on it in the best traditions of installers.


 
It seemed that everything was going well, but soon a nasty puzzle turned up...




Unexpected hang


At some point the game simply started hanging after the main menu. It was terrible, because everything had worked before, and the cause of the hang could have been anything. Replacing the archives through WiiScrubber gave the same effect, so I could not blame the patcher specifically. A lot of time was spent searching for the faulty file by bisection, but in the end it turned out that it varied from one attempt to the next: with any change, the game hung on a different file. And then I decided to check a hunch that had been in my head from the very beginning but had been dismissed as unlikely. To explain it, I first need to say a little about how a Wii disc is organized. While a usual DVD-ROM is divided into sectors of 2048 bytes, a Wii disc is divided into clusters of 0x8000 bytes. Each cluster consists of a header of 0x400 bytes and data of 0x7C00 bytes. Clusters are assembled into subgroups of eight, which are in turn collected into groups of eight subgroups. The cluster header stores hashes, hashes of hashes, hashes of hashes of hashes, and... well, let us take it in order.

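#include <cstdint>

typedef std::uint8_t uint8;
typedef uint8 Hash[20];        // one SHA-1 digest

struct ClusterHeader           // 0x400 bytes in total
{
    Hash  h0[31];              // hashes of the 31 data blocks of this cluster
    uint8 padding0[20];
    Hash  h1[8];               // hashes of the H0 tables in the subgroup
    uint8 padding1[32];
    Hash  h2[8];               // hashes of the H1 tables in the group
    uint8 padding2[32];
};

struct Cluster                 // 0x8000 bytes in total
{
    ClusterHeader header;
    uint8         data[0x7C00];
};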
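The sizes quoted above can be cross-checked at compile time, and the grouping turned into simple index arithmetic. The sketch below uses its own constants so that it compiles on its own. The `locate` function and the `ClusterPosition` structure are hypothetical helpers, and the sketch assumes that offsets are counted over the data areas only, with the headers skipped.

```cpp
#include <cstdint>

constexpr std::uint64_t kClusterSize = 0x8000;  // header + data
constexpr std::uint64_t kHeaderSize  = 0x400;
constexpr std::uint64_t kDataSize    = 0x7C00;
constexpr std::uint64_t kBlockSize   = 0x400;   // one H0 hash covers one block
constexpr std::uint64_t kHashSize    = 20;      // SHA-1 digest

static_assert(kHeaderSize + kDataSize == kClusterSize,
              "header and data must fill the cluster");
static_assert(kDataSize / kBlockSize == 31,
              "31 data blocks, hence 31 entries in H0");
static_assert(31 * kHashSize + 20 + 8 * kHashSize + 32 + 8 * kHashSize + 32
                  == kHeaderSize,
              "H0, H1, H2 and padding must fill the header");

struct ClusterPosition {
    std::uint64_t group;         // group of 8 subgroups (64 clusters)
    std::uint64_t subgroup;      // 0..7 inside the group
    std::uint64_t cluster;       // 0..7 inside the subgroup
    std::uint64_t offsetInData;  // byte offset inside the data area
    std::uint64_t h0Block;       // which H0 entry covers this byte
};

// Maps an offset in the data stream to the cluster that holds it.
ClusterPosition locate(std::uint64_t dataOffset)
{
    const std::uint64_t index = dataOffset / kDataSize;  // absolute cluster index
    ClusterPosition p;
    p.group        = index / 64;
    p.subgroup     = (index / 8) % 8;
    p.cluster      = index % 8;
    p.offsetInData = dataOffset % kDataSize;
    p.h0Block      = p.offsetInData / kBlockSize;
    return p;
}
```

Knowing which cluster, subgroup and group a modified byte belongs to shows which hash entries have to be recomputed after a file is replaced.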

First, the header stores a table of 31 SHA-1 hashes called H0. It holds the hash of each 1024-byte block of data in the cluster. The second table, H1, holds the hash of the H0 table of each cluster in the subgroup. Since this table is present in the header of every cluster, its contents are the same for all clusters within one subgroup. There is also the H2 table, which holds the hash of the H1 table of every subgroup in the group. It, in turn, is therefore identical for all clusters in the group. Finally, each partition on the disc has a global hash table, which stores the hash of the H2 table of each group of clusters. This table, along with the rest of the partition header, is protected by a digital signature, which through the chain effectively protects the tables of the other levels as well, and ultimately the cluster data too. The clusters, by the way, are also encrypted.

Thus, after any change it is necessary to recalculate all the hashes and re-sign the global hash table. But the private key for signing is known to no one but Nintendo, so a workaround is used, known as the signing bug. A bug was discovered in the Wii firmware: when the digital signature was verified, the hashes were compared using strncmp. This function is intended for strings and has a peculiarity: if both compared byte sequences begin with a null byte, they are treated as empty strings and, as a result, are considered equal. Therefore, to run modified content on the Wii, a pre-prepared digital signature is used whose hash starts with a zero. To make the content hash start with a zero as well, the data is adjusted by brute force, changing reserved fields.

Knowing this, I wrote a check function that computed the hashes and compared them with the stored ones. And indeed, at some point the check failed because of an incorrect hash. The hang turned out to be caused by a bug in the WiiScrubber source: in some cases the hash of the last processed cluster was not updated. After the bug was fixed, everything started working, which pleased me greatly. Given this unpleasant experience, I built mandatory validity checks of the image itself and of the relinked archives into the internal patch build. If anything is wrong, the patch simply is not built.
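The effect of comparing hashes with strncmp, as in the signing bug described above, is easy to reproduce. The snippet below is only a demonstration of the C library behaviour, not the firmware code; the function names are invented for the example.

```cpp
#include <cstddef>
#include <cstring>

constexpr std::size_t kHashSize = 20;  // SHA-1 digest length

// Flawed comparison: strncmp treats its arguments as C strings and stops
// at the first null byte, so two hashes that both start with 0x00 compare
// as equal regardless of the remaining 19 bytes.
bool hashesEqualBuggy(const unsigned char* a, const unsigned char* b)
{
    return std::strncmp(reinterpret_cast<const char*>(a),
                        reinterpret_cast<const char*>(b), kHashSize) == 0;
}

// Correct comparison: memcmp always compares all 20 bytes.
bool hashesEqual(const unsigned char* a, const unsigned char* b)
{
    return std::memcmp(a, b, kHashSize) == 0;
}
```

For two different 20-byte buffers that both begin with a zero byte, `hashesEqualBuggy` returns true while `hashesEqual` returns false. That difference is exactly what makes a signature with a zero-leading hash pass the check.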


















Testing and release


The hang was fixed, the patch build process was automated, the translation was now really nearing completion, and we decided to recruit volunteers for testing once again. This time enough people responded, although in the end only three of them really took part in the testing, including me. I handed out the patches, and the process began... Before that I had tried to prevent most technical errors by adding a couple of tests to the patch build process. If they failed, the patch simply was not built. But problems still poured in. I barely had time to process bug reports before new ones arrived. People reported not only errors but also any phrases that grated on the ear, so the testing soon turned into a thorough reworking of the text. It turned out that the problem of text not fitting was on a much larger scale than we could have imagined. Due to the nature of the Russian language, the translation comes out longer and simply does not fit on the screen. So, gritting my teeth, I sat down and wrote a game script editor with a text preview.


 

One of the testers, a person with the nickname SonyLover, had previously been a co-translator on one of our projects. He offered to help bring the translation to completion, and together we began to edit literally every sentence. Meanwhile a nasty problem with the fonts was discovered: some characters seemed taller than others, creating the effect of "jumping" text. Looking closely, I saw the same problem in the original, but there it was less pronounced and therefore less noticeable.

 
The reason was simple, and something similar has already been described on Habr: glyphs adjacent to the edges of the texture were not interpolated on the adjacent side, and therefore looked sharper than characters located further from the edge. Visually this created the effect of a difference in character height. The problem was solved simply by adding a border of transparent pixels, so that no character touched the edges of the texture any more.

At this pace we worked for a month and a half, polishing the text and arguing over every detail. And then we were very close to release. At the last minute we still had to make some rather global changes, such as changing a couple of glossary entries, and we discovered untranslated system messages embedded in the game executable, but we no longer had the strength to check them.

And finally the solemn day came. On January 26, 2013 I made the last edits, branched the repository, built a patch in the release configuration and announced the release of the translation. To say that a mountain had been lifted from my shoulders is to say nothing. Of course, I did not feel indescribable joy, but my conscience had one less reason to pester me. Of course, the translation did not come out perfect: there is no limit to perfection, and immediately after the release a couple of unpleasant but not critical issues with text formatting showed up. But releasing a new version is a matter of five minutes, so it did not really upset me. The main thing is that we were satisfied with our work, as was our target audience.






Afterword


Returning to the title, I want to talk a little about our plans. Until recently I was sure that I would not come back to translation work, but obviously I was wrong. Then I thought that this translation would surely be my last, but later decided that we should complete at least one more long-running project, which hundreds of people have been waiting for from us for two years: the translation of Silent Hill: Shattered Memories . After that we will see whether my heart is still in amateur translation. It is no wonder, after all, that many people leave the scene as they grow up. Frankly, this no longer satisfies my aesthetic needs, and with other concerns appearing it is no longer enough. Perhaps it is time for something bigger. For example, to fulfill the dream of my childhood and youth by taking up game development. Who knows, maybe someday I will manage to release my own little masterpiece. With that, thank you for your attention; my story ends here.