
Manual cloning of a thread. When Assembler + C# = Love
I'll get straight to the point. The task: at any point in the code, a call to a special method should create a second thread that starts executing from the point where that method was called in the parent thread, while preserving the ability to debug and the values of all local variables at every level of the call chain.
The implementation does not depend on the target platform (.NET / Java), since it is written in C++ / Asm; the user code, however, is in C#, because that is what I write in.

Now that I have finally stabilized the example for 32-bit systems, I have worked up the courage to show it to the public as fully finished. And yes, I repeat: once adapted, it will work on any platform.

To begin with, the full list of articles in this series published on Habr:
Making unloadable assemblies: interaction between domains without marshalling.
Getting a pointer to a .NET object.
Manual cloning of a thread. When Assembler + C# or Java = Love
Changing the code of system assemblies, or a "leak" of .NET Framework 5.0
How decompilation works in .NET or Java, using .NET as an example
Continuing to dissect the CLR: a pool of .NET objects outside the SOH / LOH heaps
Taking a dump of objects from the memory of a .NET application

Goals
The goal of this work is to build thread-related functionality that the operating system does not provide. As the model we took the Linux fork() call, adjusted to the realities of Windows.
So, suppose we have a method Original that calls Fork.CloneThread() somewhere in its body. That call should produce a second thread of execution that starts at the point where Fork.CloneThread() was called and finishes when Original returns, with all local variables of the original thread keeping their values in the second thread. In other words, a call to CloneThread() splits the current thread in two.
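For comparison, the fork() call that served as the model has the same shape at the process level: after the call, two flows of execution continue from the same point, and both see the local variables as they were at that moment. The sketch below shows only that analogue, not the library's code. The function name value_seen_by_fork_child is hypothetical, and the example assumes a POSIX system.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks, lets the child read `local`, and reports what
// the child saw (plus one) back to the parent through the exit status.
int value_seen_by_fork_child(int local) {
    pid_t pid = fork();        // both processes continue from this point
    if (pid < 0) return -1;    // fork failed

    if (pid == 0) {
        // Child: `local` holds the value it had in the parent at the fork.
        _exit((local + 1) & 0xFF);  // an exit status carries only 8 bits
    }

    // Parent: wait for the child and decode its exit status.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling value_seen_by_fork_child(41) in the parent should yield 42, because the child started with its own copy of the parent's local. CloneThread() aims for the same effect inside a single process, so there the copy has to be made by hand, from the stack.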
What is required from the reader
- No fear of reading assembler. It's simple =) Where something is unclear, use Google
- Understanding that each thread has exactly one stack, and what that stack is for
Materials for preparation:
Cloning a thread
What do we have to start with? There is our thread. We can also create a new thread or schedule a task on the thread pool and run our code there. We also know that information about nested calls is stored on the call stack and that, if we wish, we can manipulate it (for example, using C++/CLI). Moreover, if we follow the calling conventions and place on top of the stack the return address for the ret instruction and the value of the EBP register, and then reserve space for locals (if needed), we can simulate a method call.
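To make the frame layout concrete, here is a minimal sketch that models it in plain C++ instead of touching the real stack. Everything in it is an assumption made for illustration: the ModelStack type, the use of word indices as "addresses", and the enter/leave names. It only mirrors what call plus the usual prologue, and the epilogue plus ret, do on 32-bit x86.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of a 32-bit x86 stack: "addresses" are word indices
// into `mem`, and the stack grows toward lower indices.
struct ModelStack {
    std::vector<std::uint32_t> mem;
    std::uint32_t esp;  // index of the current top of the stack
    std::uint32_t ebp;  // index of the current frame base

    explicit ModelStack(std::uint32_t words)
        : mem(words, 0), esp(words), ebp(words) {}

    void push(std::uint32_t value) { mem[--esp] = value; }
    std::uint32_t pop() { return mem[esp++]; }

    // What `call` and the standard prologue do:
    //   push <return address>; push ebp; mov ebp, esp; sub esp, <locals>
    void enter(std::uint32_t return_address, std::uint32_t local_words) {
        push(return_address);
        push(ebp);           // saved EBP links this frame to the previous one
        ebp = esp;
        esp -= local_words;  // reserve space for locals
    }

    // What the standard epilogue and `ret` do:
    //   mov esp, ebp; pop ebp; ret
    std::uint32_t leave() {
        esp = ebp;
        ebp = pop();
        return pop();        // address where execution would continue
    }
};
```

After two nested enter calls, mem[ebp] of the inner frame holds the base of the outer frame. This chain of saved EBP values is what the cloning code walks and later repairs.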
What needs to be done to clone a thread?
- Preservation
  - Inside the CloneThread (C#) method, we take the address of any local variable.
  - We call the C++ method, passing it this address. At this stage the call stack contains, in call order, the user method, CloneThread (C#), the proxy calls generated by the JIT compiler, and the C++ method.
  - Inside the C++ method we read EBP, the pointer to our own call frame, and walk up the chain by dereferencing it until we reach the CloneThread frame, comparing the current EBP with the address of the local variable in CloneThread. This is needed to get past all the proxy calls between C# and C++ that the JIT compiler generates.
  - We go one frame further to leave the CloneThread frame and reach the code that called our library function. Everything between that address and ESP is the chain of calls starting from user code. We save it to a buffer, create a thread (or take one from the pool), and pass it the address of this buffer, which holds a copy of the stack.
- Recovery. For the new thread to continue from the point where the copy was taken in the parent, we have to simulate the CloneThread() call from the user method as if that method had been called in the new thread (which nobody actually did). To do this, we add the saved piece of the parent thread's stack on top of our own call stack, fix the EBP chain that links the stack frames, and run the code.
  - Initially, when our code has just started running in the second thread, its call stack holds only the frames of the new thread itself.
  - We read the current value of ESP.
  - We push the address of a location in the body of the current method: this is the return address for the user method whose call we are simulating.
  - We push EBP to keep the chain of frames intact. At this point we have the new thread's stack plus the copy of the parent's stack on the heap.
  - We fix the saved EBP chain inside the copy of the stack before copying it over (this cannot be done in place).
  - With push instructions we place the copy of the parent thread's stack onto the current stack. This simulates a call to the user method that called CloneThread, which in turn called a number of proxies and finally the C++ method.
  - We make a far JMP into the C++ CloneThread method, at the point where it returns.
  - This returns into CloneThread (C#), which returns into the user method.
  - Voila: in both threads the code executes from the same point. The thread has been forked.
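The preservation and recovery steps above can be sketched on a model stack, without inline assembler. This is an illustration under stated assumptions, not the library's implementation: the stack is an array of 32-bit words, "addresses" are word indices, and the names SavedStack, ThreadStack, save_call_chain and restore_call_chain are hypothetical. The final far JMP is represented only by setting the model's EBP.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model: word indices play the role of addresses, and the
// stack grows toward lower indices, as it does on x86.
using Word = std::uint32_t;

struct ThreadStack {
    std::vector<Word> mem;
    Word esp;
    Word ebp;
};

struct SavedStack {
    std::vector<Word> words;  // copy of the parent words [low, low + size)
    Word low;                 // parent address of words[0] (ESP at save time)
    Word innermost_ebp;       // parent EBP of the innermost (native) frame
};

// Preservation: follow saved-EBP links until the frame base lies above the
// address of the local variable; that frame owns the local (CloneThread).
// One more link leads to the user method that called it.
SavedStack save_call_chain(const ThreadStack& parent, Word local_addr) {
    Word frame = parent.ebp;
    while (frame < local_addr) frame = parent.mem[frame];
    const Word user_frame = parent.mem[frame];
    return SavedStack{
        std::vector<Word>(parent.mem.begin() + parent.esp,
                          parent.mem.begin() + user_frame),
        parent.esp, parent.ebp};
}

// Recovery: rebuild the saved chain on top of the new thread's stack.
void restore_call_chain(ThreadStack& t, SavedStack saved, Word return_address) {
    t.mem[--t.esp] = return_address;  // where the simulated user method returns
    t.mem[--t.esp] = t.ebp;           // saved EBP keeps the frame chain intact
    const Word new_user_frame = t.esp;
    const Word old_user_frame =
        saved.low + static_cast<Word>(saved.words.size());
    const Word delta = new_user_frame - old_user_frame;  // wraps modulo 2^32

    // Fix the EBP chain inside the copy, before it is pushed.
    Word frame = saved.innermost_ebp;
    while (frame != old_user_frame) {
        Word& link = saved.words[frame - saved.low];
        frame = link;   // next frame up, still a parent address
        link += delta;  // the same frame's address in the new stack
    }

    // Push the copy, highest word first, so the layout matches the parent's.
    for (auto it = saved.words.rbegin(); it != saved.words.rend(); ++it)
        t.mem[--t.esp] = *it;
    t.ebp = saved.innermost_ebp + delta;
}
```

In this model the parent's stack is never modified: the links are rewritten only in the copy, which is taken by value. After restore_call_chain, following the saved EBP values from the new thread's EBP leads through the copied frames to the frame header pushed by the new thread itself, so returns unwind into the new thread's own code.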
Why do it
The main reason for doing this is to deepen our understanding of how everything works: once you know how something works, you can start manipulating it.
Resources
- GitHub project DotNetEx: the project in it is called AdvancedThreadingLibrary; to run it, use RocketScience / 01-forkingThread. By the way, the same library has examples with sizeof(ReferenceType), IoC with an unloadable assembly, and an object pool in its own heap.