“Hello, World!” in C with the array int main[]
I would like to talk about how I implemented “Hello, World!” in C. As a warm-up, here is the code right away. If you are curious how I got there, read on.
#include <stdio.h>
const void *ptrprintf = printf;
#pragma section(".exre", execute, read)
__declspec(allocate(".exre")) int main[] =
{
0x646C6890, 0x20680021, 0x68726F57,
0x2C6F6C6C, 0x48000068, 0x24448D65,
0x15FF5002, &ptrprintf, 0xC314C483
};
Foreword
So, I started by finding this article. Inspired by it, I began to think about how to do the same on Windows.
In that article, screen output was implemented with a syscall, but on Windows I found no comparable documented syscall interface to rely on, so I used the printf function. Maybe I'm wrong, but I haven't found anything else.
Plucking up my courage and opening Visual Studio, I began to experiment. For some reason I spent a long time trying to override the entry point in the build settings, but, as it turned out later, the Visual Studio compiler does not even issue a warning when main is an array rather than a function.
The main problems I had to face:
1) The array is in the data section and cannot be executed
2) Windows does not have a syscall we can use, so the output must be implemented using printf
Let me explain why a function call is a problem here. Normally the call address is filled in by the compiler from the symbol table, if I'm not mistaken. But we have an ordinary array, into which we must write the address ourselves.
The solution to the problem of "executable data"
The first problem I ran into, as expected, was that a plain array is stored in the data section and cannot be executed as code. After a bit of digging through Stack Overflow and MSDN, I found a way out. The Visual Studio compiler supports the section pragma (#pragma section), and a variable can be declared so that it is placed in a section with execute permission.
After checking, I was convinced that this works: the main array quietly executes the ret opcode and does not cause an “Access violation” error.
#pragma section(".exre", execute, read)
__declspec(allocate(".exre")) char main[] = { 0xC3 };
A bit of assembler
Now that I could execute the array, I needed to write the code it would execute.
I decided to store the message “Hello, World!” inside the assembler code itself. I must say right away that my knowledge of assembler is rather poor, so please go easy on me, although criticism is welcome. This answer on Stack Overflow helped me understand which assembler code to insert without calling unnecessary functions.
I took Notepad++ and used Plugins -> Converter -> "ASCII -> HEX" to get the character codes.
Hello, World!
48656C6C6F2C20576F726C6421
Next, we need to split the string into 4-byte groups and push them onto the stack in reverse order, not forgetting to convert each group to little-endian.
Add a terminating zero to the end.
48656C6C6F2C20576F726C642100
Divide from the end into 4-byte hex numbers.
00004865 6C6C6F2C 20576F72 6C642100
Convert each number to little-endian and reverse the order.
0x0021646C 0x726F5720 0x2C6F6C6C 0x65480000
At first I went wrong by trying to call printf directly and then store that address in the array. It only worked once I had saved a pointer to printf in a variable. It will become clear later why.
#include <stdio.h>
const void *ptrprintf = printf;
void main() {
    __asm {
        push 0x0021646C ; "ld!\0"
        push 0x726F5720 ; " Wor"
        push 0x2C6F6C6C ; "llo,"
        push 0x65480000 ; "\0\0He"
        lea eax, [esp+2] ; eax -> "Hello, World!"
        push eax ; push the pointer to the start of the string onto the stack
        call ptrprintf ; call printf
        add esp, 20 ; clean up the stack
    }
}
We compile and look at the disassembly.
00A8B001 68 6C 64 21 00 push 21646Ch
00A8B006 68 20 57 6F 72 push 726F5720h
00A8B00B 68 6C 6C 6F 2C push 2C6F6C6Ch
00A8B010 68 00 00 48 65 push 65480000h
00A8B015 8D 44 24 02 lea eax,[esp+2]
00A8B019 50 push eax
00A8B01A FF 15 00 90 A8 00 call dword ptr [ptrprintf (0A89000h)]
00A8B020 83 C4 14 add esp,14h
00A8B023 C3 ret
From here we need to take the bytes of code.
To avoid stripping the assembler text by hand, you can use regular expressions in Notepad++.
Regular expression for the sequence after the code bytes:
{2} *. *
The beginning of each line can be removed with the TextFX plugin for Notepad++:
TextFX -> "TextFX Tools" -> "Delete Line Numbers or First Word", with all lines selected.
After that we have an almost ready-made code sequence for the array.
68 6C 64 21 00
68 20 57 6F 72
68 6C 6C 6F 2C
68 00 00 48 65
8D 44 24 02
50
FF 15
00 90 A8 00 ; the 4 bytes after FF 15 must be the address of the called function
83 C4 14
C3
Calling a function with a “pre-known” address
I thought for a long time about how to leave the address from the function table in the finished sequence when only the compiler knows it. After asking a few programmers I know and experimenting, I realized that the address used by the call can be obtained by applying the address-of operator to a variable that holds a pointer to the function. Which is what I did.
#include <stdio.h>
const void *ptrprintf = printf;
void main() {
void *funccall = &ptrprintf;
__asm {
call ptrprintf
}
}
The funccall pointer contains exactly the same address that the call instruction uses. Exactly what is needed.
Putting it all together
So, we have a sequence of bytes of assembler code, and somewhere in it we need to leave an expression that the compiler will translate into the address needed to call printf. The address is 4 bytes long (because we are writing code for a 32-bit platform), which means the array must consist of 4-byte values, arranged so that the element right after the bytes FF 15 is the one where we put our address.
Using simple substitutions, we obtain the desired sequence.
We take the byte sequence of our assembler code obtained earlier. Since the 4 bytes after FF 15 must form a single value, we align the layout around them and fill the missing byte with a nop instruction, opcode 0x90.
90 68 6C 64
21 00 68 20
57 6F 72 68
6C 6C 6F 2C
68 00 00 48
65 8D 44 24
02 50 FF 15
00 90 A8 00 ; address for calling printf
83 C4 14 C3
Again we build 4-byte little-endian values. For moving columns around, multi-line selection in Notepad++ with Alt+Shift is very handy:
646C6890
20680021
68726F57
2C6F6C6C
48000068
24448D65
15FF5002
00000000 ; address for calling printf, will later be replaced by an expression
C314C483
Now we have a sequence of 4-byte numbers and an address for calling the printf function, and we can finally fill in our main array.
#include <stdio.h>
const void *ptrprintf = printf;
#pragma section(".exre", execute, read)
__declspec(allocate(".exre")) int main[] =
{
0x646C6890, 0x20680021, 0x68726F57,
0x2C6F6C6C, 0x48000068, 0x24448D65,
0x15FF5002, &ptrprintf, 0xC314C483
};
To trigger a breakpoint in the Visual Studio debugger, replace the first element of the array with 0x646C68CC (the nop byte 0x90 becomes int 3, opcode 0xCC).
We run it and look.
Done!
Conclusion
I apologize if the article seemed “for beginners” to anyone. I tried to describe the process in as much detail as possible while omitting the obvious things. I wanted to share my own experience of this small piece of research. I would be glad if the article turns out to be interesting to someone, and possibly useful.
I’ll leave all the links here:
Article “main usually a function”
Description of the section pragma on MSDN
Some explanation of the assembler code on stackoverflow
And just in case, I’ll leave a link to the 7z archive with the project for Visual Studio 2013.
I also do not rule out that the printf call could be shortened further by using a different call encoding, but I did not get around to investigating this.
I would be happy for your feedback and comments.