A tale of coursework

Hello. I want to talk about my coursework, or about where curiosity leads.


For a long time now, out of boredom, I have been writing programs for Symbian. And from time to time I ran into oddities in the build. Everything pointed to the elf2e32 utility. Its task is to convert an input binary in ELF format into another format specific to Symbian: the E32 image. I had long been curious how this utility works at all and why it is sometimes buggy. A little later another question began to pester me: the topic of my coursework =) I decided to combine business with pleasure and downloaded its source code. And so it began ...


The first commit does not build.


In the second one we enable the non-standard gcc extensions and add the missing classes, functions and constants from the sources. The patient happily builds and crashes. Progress, however. We run it under the debugger: the debugger enters a class that only initializes another one, which initializes the next ... Hurray! Here is the function! We step in. Oops. Where are we?!!! Stop the debugger! Surgeon! Scalpel! Alcohol! Pickle! Appendix classes, into the furnace! Give me nullptr instead of NULL! We have C++14! Wow, what an awesome constructor, it initializes everything with zeros! And another one, and another, and again, but we have C++14 with its default member initializers for classes! How laconic everything is now ...


Okaay. We fix as much as possible in one go. I figured out why the debugger jumps around the sources like crazy: the author overdosed on abstraction and grew a level-80 inheritance tree from the UseCaseBase class :) Then, apparently, sparks flew from his eyes in the form of constructors of static instances for the Message and ParameterManager classes. The Meyers singleton? No, never heard of it. Into the furnace with abstraction! Viva la revolución!!! Viva POD!!!
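
For anyone who has not met the Meyers singleton: instead of a static instance built by a constructor somewhere at startup, the instance is a function-local static that is created on first use. A minimal sketch only; the class name MessageLog and its members are hypothetical and are not the real elf2e32 code:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a class like Message; not the real elf2e32 code.
class MessageLog
{
public:
    // Meyers singleton: the instance is a function-local static,
    // constructed the first time Instance() is called.
    static MessageLog& Instance()
    {
        static MessageLog instance;
        return instance;
    }

    void Report(const std::string& text) { iMessages.push_back(text); }
    std::size_t Count() const { return iMessages.size(); }

    // Copying would defeat the "single instance" idea.
    MessageLog(const MessageLog&) = delete;
    MessageLog& operator=(const MessageLog&) = delete;

private:
    MessageLog() = default;
    std::vector<std::string> iMessages;
};
```

Every caller writes MessageLog::Instance().Report("...") and gets the same object, with no dependence on the order in which static constructors run.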


Wow! What an interesting tree grew here. The main work is done by the BuildAll() function. If all parameters are specified, the function builds the import library, the file that lists the names of functions and variables and the order in which they are available in the import library, and the binary itself. All descendants of UseCaseBase changed this algorithm by overriding it. Sometimes the descendants prepared auxiliary data, but more often they simply turned off the creation of some files. For example, the file name for building something is not specified: a new class is created. Idiots. It is enough to skip that step of the build function when necessary. My actions are easy to understand B-)
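
The idea of "skip the step instead of creating a subclass" can be sketched roughly like this. It is an illustration under my own assumptions: the BuildParams structure, its field names and the string placeholders are hypothetical, and the real BuildAll() does actual file generation:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical parameter set; the field names are illustrative only.
struct BuildParams
{
    std::string defOutput;  // file listing exported names and their order
    std::string libOutput;  // import library
    std::string e32Output;  // the E32 image itself
};

// One function instead of a subclass per use case: a step is skipped
// when its output file name was not supplied.
std::vector<std::string> BuildAll(const BuildParams& p)
{
    std::vector<std::string> built;
    if (!p.defOutput.empty())
        built.push_back("def:" + p.defOutput);  // placeholder for the real work
    if (!p.libOutput.empty())
        built.push_back("lib:" + p.libOutput);  // placeholder for the real work
    if (!p.e32Output.empty())
        built.push_back("e32:" + p.e32Output);  // placeholder for the real work
    return built;
}
```

With only e32Output set, the function reports a single built artifact and silently skips the other two steps, which is all the overriding descendants were doing.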


We continue deleting empty classes, replacing NULL with nullptr and iterator loops with range-based for (auto x: *).


We fix errors in the handling of command-line parameters.


The code needs to be checked with a static analyzer. Where to begin? Hmm, on Windows XP the choice is small: cppcheck, and Code::Blocks supports it out of the box. Wow, what a catch! There is even a plain delete for a char[]! Damn, now I know where half a gig of free RAM went =)


So, we add the files generated from the ELF file libcrypto.dll, along with the file that describes the command-line parameters used to create them.


Oops. cppcheck was wrong ... It should be (a || b) ...


I'll try building in Visual Studio 15, and it would be nice to poke Win10 with a stick as well. We install it in a virtual machine. Done; we download and run the Studio online installer. What? It doesn't want to save the download to the folder shared with the host?! Oh, choke on it! Download to wherever you were taught to ... And now we move the download to the shared folder and run the installation. What? It ignores the shared folder again??! Oh, choke on it! Install to wherever you were taught to ...


In principle, Windows 10 purrs along fine on one core and 3 gigs of RAM. Studio, on stage! I marveled, but not for long. I open my project in Studio. Again it complains about the folder ... How long can this go on? Oh, choke on it ... We build, and it complains about the non-standard STL extension hash_set. Removed. Removed??? Time to turn on the brain =)
Wow, what spicy code:


int ElfFileSupplied::UnWantedSymbolp(const char * aSymbol)
{
  static hash_set<const char*, hash<const char*>, eqstr> aSymbolSet;
  int symbollistsize=sizeof(Unwantedruntimesymbols)/sizeof(Unwantedruntimesymbols[0]);
  static bool FLAG=false;
  while(!FLAG)
  {
    for(int i=0;i<symbollistsize;i++)
    {
        aSymbolSet.insert(Unwantedruntimesymbols[i]);
    }
    FLAG=true;
  }
  hash_set<const char*, hash<const char*>, eqstr>::const_iterator it
    = aSymbolSet.find(aSymbol);
  if(it != aSymbolSet.end())
    return 1;
  else return 0;
}

Let's think a little ... And voila:


int ElfFileSupplied::UnWantedSymbolp(const char * aSymbol)
{
    int symbollistsize = sizeof(Unwantedruntimesymbols) / sizeof(Unwantedruntimesymbols[0]);
    for (int i = 0; i < symbollistsize; i++)
    {
        // Note: strstr() also matches when aSymbol is only a substring of an
        // entry; strcmp(...) == 0 would give an exact match.
        if (strstr(Unwantedruntimesymbols[i], aSymbol))
            return 1;
    }
    return 0;
}

My precious ...


So why does the program throw an exception if this flag is set incorrectly or not set at all? Why are you so cruel, beautiful faraway ... Let's just reset this flag to a safe value. And it would be nice to do the same for this flag ... And this one, and this one, and these. Or maybe it is better to make a separate function? A good idea! Let's call it ParameterManager::CheckOptions()!
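
Roughly what "reset the flag to a safe value" means, as a sketch. The Options structure, its fields, the valid range and the defaults are all invented for illustration; the real ParameterManager::CheckOptions() works with the actual elf2e32 options:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical option set; the real ParameterManager has different fields.
struct Options
{
    int priority = -1;       // -1 means "not set on the command line"
    std::string targetType;  // empty means "not set"
};

// Replace missing or invalid values with safe defaults and return
// warnings for the caller to print, instead of throwing an exception.
std::vector<std::string> CheckOptions(Options& opts)
{
    std::vector<std::string> warnings;
    if (opts.priority < 0 || opts.priority > 10)
    {
        warnings.push_back("priority missing or out of range, using 5");
        opts.priority = 5;
    }
    if (opts.targetType.empty())
    {
        warnings.push_back("target type not set, using exe");
        opts.targetType = "exe";
    }
    return warnings;
}
```

All the checks live in one place, run once after the command line is parsed, and the rest of the code can assume the options are sane.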


A step to the left: a crash; a step to the right: an unhandled exception; a jump on the spot: thanks that at least it isn't a BSOD =)


Gloom ... Glitches and crooked code ...


Oh là là!!! Emulating Symbian's CleanupStack with the STL?
In principle, nothing special:


std::vector<char*> cleanupStack;

Cleanup:


std::vector<char*>::iterator aPos;
char *aPtr;
aPos = cleanupStack.begin();
while( aPos != cleanupStack.end() )
{
    aPtr = *aPos;
    delete[] aPtr;
    ++aPos;
}
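
The same idea can be written so that the cleanup loop is not needed at all. This is an alternative sketch, not the code from the repository: the CleanupStack class below is hypothetical and relies on std::unique_ptr<char[]> (with C++14 std::make_unique) to call delete[] for us:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical wrapper: owns every buffer allocated through it. All
// buffers are released with delete[] when the stack is cleared or destroyed.
class CleanupStack
{
public:
    // Allocates a zero-initialized buffer, keeps ownership, and returns
    // a raw pointer for the caller to use (but not to delete).
    char* Allocate(std::size_t size)
    {
        iItems.push_back(std::make_unique<char[]>(size));
        return iItems.back().get();
    }

    std::size_t Size() const { return iItems.size(); }

    // Frees all buffers; unique_ptr<char[]> uses delete[] internally.
    void Clear() { iItems.clear(); }

private:
    std::vector<std::unique_ptr<char[]>> iItems;
};
```

The price is that buffers must be allocated through the stack rather than pushed onto it after the fact, so it fits new code better than a mechanical port.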

Some bright mind used l / r instead of left / right. Thank you, cppcheck.


Eh, I'm too lazy to sit in front of the monitor sorting through cppcheck logs ... What can GitHub offer us? ... Codacy ... We connect the project ... It thinks for a bit and it's ready! Now I can read the reports about my success in fighting errors while lying on the couch ^^


So, it doesn't seem to be buggy ... Let's build something, for example libcrypto.dll. It works, although the uncompressed file is a hundred-odd bytes larger than the one created by the utility from the SDK. From here on, the binaries created by this version of the utility and by the one from the SDK will be compared constantly. The command-line parameters are, of course, identical.
Right, where do I get an analog of diff for binary files? Hmm, I write a script in Python. Too much information: I need something much simpler. A DLL for recognizing pdf / djvu, AlternateReaderRecog.dll, is a good option: the output is less than 4 kilobytes. Right, the offsets differ in the import section. I open both files in a hex editor. The beginning is the same; my version has extra garbage right after the point where the section ends in the original version. And in my version the next section starts 100 bytes later. The file sizes differ by exactly the same number of bytes! The offsets further on point to the correct addresses ... The binary is correct!!! Ahhhh!!!


A month later. So where did those hundred bytes come from?
Well, since it is not clear how it works, we start breaking the algorithm that creates the E32Image. We keep tormenting AlternateReaderRecog.dll. Increasing the size of the output binary: no effect; overwriting the section with memset: no effect; reducing the size of the binary: no effect. Grrrr. What the?!!! I am breaking the output in the release build while running the debug build?!!! Oh well, start over ... Sooo, the section is wiped: good! The size of the binary is increased: good!!! Reduce the size of the import section! Got it!!! Byte-for-byte identical to the same section in the output of the utility from the SDK!
We look into the code that creates this section. "sizeof (char *)": I recalled something from the articles by Andrey Karpov, one of the developers of PVS-Studio, about types occupying different amounts of memory. So how much space does this one take? MinGW: 8 bytes, Visual Studio: 4!!! We halve those 8 bytes, and that's that. Done! And what about the code section? This DLL has no global variables. No global variables, no section either ... Let's take something heavier: libcrypto.dll.
Now the output file of my utility is 100+ bytes smaller ... What the??? The import section is byte-for-byte identical: good. The code section: no?!!
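
The sizeof (char *) trap from above, as a small sketch. I assume here that an entry in the file occupies 4 bytes, which is what the fix in the text implies; the function names are hypothetical and this is not the code from elf2e32:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Size computed from a host pointer type: 4 bytes per entry on a 32-bit
// build, 8 on a 64-bit build, so the result depends on the compiler target.
std::size_t ImportSectionSizeFragile(std::size_t entries)
{
    return entries * sizeof(char*);
}

// Size computed from a fixed-width type: 4 bytes per entry on any target.
std::size_t ImportSectionSizePortable(std::size_t entries)
{
    return entries * sizeof(std::uint32_t);
}
```

Using a fixed-width type instead of dividing the pointer size by two keeps the output the same no matter which compiler or bitness builds the tool.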


I can't be bothered to compare such a wall of text by eye ... I'll go look for a diff tool for byte-wise comparison ...
After a couple of days of playing with Google, I finally found one. vbindiff is a console utility with a Norton Commander style interface that shows the difference between two files in two horizontal panels. To jump to the next difference, press Enter. Good! You can drag two files onto its icon and the program will open them for comparison! Excellent!!!
We compare: sooo, in the header the CRC and the creation time differ. That's nothing. This byte is different, then another hundred ... Wow!!! Tens, hundreds, thousands of differing bytes?!!! Okaay, let's see which section they belong to ... We look at the offsets ... Aha, the data section ...
We pull the same trick as with the import section ... Wipe it with memset: done. Increase the section size ... It crashes ... Increase it again. The debugger offers me its hand and heart ... Damn. We open the function that creates the section: a mush of function calls ... Grr.
... Eh, tomorrow ... For now I'll fix something else ...


For example, add tests. But the code is such a mess that the program cannot be split into small modules. You can't insert tests directly into the code either: nobody would be able to make sense of it afterwards. Idea! Repeatedly launching the program with different arguments is how I have been testing it all along ... But let's do it properly and put this into a separate Python script. Yes, a great idea, just great. When a test fails, the script should keep running, reporting the error but not crashing. That's it!


Back to our sheep ... This function calls that one, then that one, we go here ... So, where did it go? Ugh, I'm lost ... Eh, tomorrow ... For now I'll fix something else ...


And so two months passed ...


Damn, where is this section generated? I had to take academic leave, so I'll at least sort you out!!! Okaay. This is where the symbols for the section arrive ... What will printf show? Not everything fits into the console buffer ... Let's save the output to a file ... So, nothing special so far ... Stop! Identical lines!!! Lots of identical lines!!! Where from?! I add printf to each data source (my patience lasted for 3 out of five, ha). Nothing. We look at one of the remaining function calls ... Okaay. The iterator is incremented after the loop??? With a TODO about a Codacy warning??? I move the increment into the loop. Run!!! The sizes match! The bytes match!!! Fixed!!! git blame refuses to name the hero ... I check the original: I did not introduce this. Or was it a "bomb" for non-Nokia developers? Grrr.
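
To show what kind of bug this is, here is a hypothetical reconstruction. It is not the actual elf2e32 code: the function names and the use of strings are my own, and only the shape of the mistake (an increment placed after the loop instead of inside it) comes from the story:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Buggy shape: the iterator advances only after the loop, so the loop
// copies the first symbol over and over.
std::vector<std::string> CollectBuggy(const std::vector<std::string>& symbols)
{
    std::vector<std::string> out;
    auto it = symbols.begin();
    for (std::size_t i = 0; i < symbols.size(); ++i)
    {
        out.push_back(*it);
    }
    if (it != symbols.end())
        ++it;  // too late: the loop is already over
    return out;
}

// Fixed shape: the increment lives inside the loop.
std::vector<std::string> CollectFixed(const std::vector<std::string>& symbols)
{
    std::vector<std::string> out;
    auto it = symbols.begin();
    for (std::size_t i = 0; i < symbols.size(); ++i)
    {
        out.push_back(*it);
        ++it;
    }
    return out;
}
```

Both versions produce output of the same length, which is why only a byte-by-byte comparison (or a dump full of identical lines) gives the bug away.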


I carefully check the test output and compare the files byte by byte. Everything works as it should! Off to release!


Oh là là! The time has come for the Great Purge!!! Time to tear up the UseCaseBase tree by the roots!!!
Most of the descendants have already been eliminated; we move their useful functions into the parent class. Only UseCaseBase and its descendant ElfFileSupplied remain. UseCaseBase is a wrapper around the class that processes command-line parameters, and it declares several pure virtual functions for the ElfFileSupplied class. In short, the violinist is not needed ... How blue the sky is, though ... One more hour ... I'll deal with this class and then I can go for a walk ... Get some air, stretch my legs ... Let's go! So, comment out this function. Build! Sooo, I need to think about how to redo it nicely ... Done!!! Next function! Done! Next! Done! Done! Yes! Yes! Yes! The last function ... Phew. We run it after the build ... A sevenfold speedup?!!! The output is correct ... Funny. The debug build also shrank by 2 megabytes?!!! Wow!!! Now I can go for a walk. It's night?!!! Hoow??? Where did my day go?!!!


Let me rewrite something now ... Oh, the class that works with the functions and variables accessible from the outside looks scary. How it works: read from a file, parse the lines and save to a file. A whole class of choice C spaghetti has been devoted to parsing the lines ... Sooo ... Let's think ... What a beauty came out:
read a line with std::getline(), trim the whitespace from both ends of the line, and parse it.
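
The "read, trim, parse" part can be sketched like this. The helper names Trim and ReadLines are my own and the actual parsing of each line is left out, so treat it as an outline rather than the code from the repository:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Remove whitespace from both ends of a line.
std::string Trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return "";  // the line is empty or whitespace only
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Read a stream line by line, trim each line and keep the non-empty ones.
// The caller then parses the cleaned lines.
std::vector<std::string> ReadLines(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line))
    {
        line = Trim(line);
        if (!line.empty())
            lines.push_back(line);
    }
    return lines;
}
```

Because ReadLines takes any std::istream, it can be fed a std::ifstream in the tool and a std::istringstream in tests.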


To be continued ... The source code is at https://github.com/fedor4ever/elf2e32

