asArtem April 10, 2010 at 10:36

Features of the CLR in the .NET framework

When I started learning the C# language and the .NET Framework, I couldn't understand how the CLR works. I found either huge articles that could not be digested in one evening, or descriptions that were too short and rather confusing (as in H. Schildt's book).
Some time ago, I decided that it would be nice to collect knowledge from books, "features" and commonly used techniques in one place. Otherwise new information quickly settles in the head but is just as quickly forgotten, and a few weeks later you have to rummage through hundreds or thousands of lines of text again to find the answer to a question. While reading each new programming book, I made brief notes of what seemed most important to me. Sometimes I described a process in language I could understand, with a made-up example, and so on. I do not claim that the material presented is absolutely correct. This is just my understanding of the process, with my own examples and the information I considered key to understanding it. Having worked through some of the material, I decided to save it for anyone who might find it useful. Those who already know it can simply refresh their memory.

Note that a "type" roughly corresponds to a class in C#. But since .NET supports not only C# but also other languages, the term "type" is used rather than the more familiar "class". This article also assumes that the reader is already familiar with the basics of .NET, and concentrates on the details of specific things and processes.

As an example, here is the source code of a program that displays the age of an object:

    using System;

    namespace ConsoleApplication_Test_Csharp
    {
        public class SomeClass
        {
            int age;

            public int GetAge()
            {
                age = 22;
                return age;
            }
        }

        public sealed class Program
        {
            public static void Main()
            {
                System.Console.Write("My age is ");
                SomeClass me = new SomeClass();
                int myAge;
                myAge = me.GetAge();
                System.Console.WriteLine(myAge);
                Console.ReadLine();
            }
        }
    }

So, let's get started:

What is a CLR?

The CLR (Common Language Runtime) is a runtime environment shared by all .NET languages. It provides language integration: thanks to a standard set of types and metadata, objects created in one language are "equal citizens" in code written in another.

In other words, the CLR is the mechanism that executes a program in the order we need, calling functions and managing data, and it does so for different languages (C#, Visual Basic, Fortran). The CLR really does control the execution of commands (machine code, if you like) and decides, while the program is running, which piece of code (function) to fetch and where to substitute it. The compilation process is shown in the figure.

IL (Intermediate Language) - code in a special language that resembles assembler but is written for .NET. Code from high-level languages (C#, Visual Basic) is converted to it. This is where the dependence on the chosen language disappears, since everything is converted to IL (with some reservations about conformance to the Common Language Specification, CLS, which is outside the scope of this article).
Here is what it looks like for the SomeClass::GetAge() function:

[Figure: IL code of SomeClass::GetAge()]

In addition to the IL code, the compiler creates complete metadata.

Metadata - a set of data tables describing what is defined in the module. There are also tables indicating what the managed module refers to (for example, imported types and their members). Metadata extends the capabilities of older technologies such as type libraries and Interface Definition Language (IDL) files. Metadata is always associated with the file containing the IL code; in fact, it is embedded in the same *.exe or *.dll.
In other words, metadata is a set of tables whose fields record that a given method is in a given file and belongs to a given type (class).
Here is the metadata for my example (the metadata tables were dumped with the ILdasm.exe disassembler; they are actually part of the program's *.exe file):

[Figure: metadata tables of the example program]

TypeDef is an entry for each type defined in the module.

For example, TypeDef #1 describes the SomeClass class: it lists Field #1 (Field Name: age), the method GetAge (MethodName: GetAge) and the constructor (MethodName: .ctor). The TypeDef #2 entry describes the Program class.
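One way to see that this metadata really travels with the compiled file is to read it back at run time through reflection. The sketch below is only an illustration: it redeclares a SomeClass like the one in the example and lists the field age, the method GetAge and the constructor .ctor, roughly the information found in the TypeDef #1 entry. MetadataDemo and Describe are names invented for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class SomeClass
{
    int age;
    public int GetAge() { age = 22; return age; }
}

// Hypothetical helper, not part of the original program.
public static class MetadataDemo
{
    // Returns "kind: name" lines for the members declared directly in the type.
    public static List<string> Describe(Type type)
    {
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Static |
                                   BindingFlags.Public | BindingFlags.NonPublic |
                                   BindingFlags.DeclaredOnly;
        List<string> lines = new List<string>();
        foreach (FieldInfo f in type.GetFields(flags))
            lines.Add("Field: " + f.Name);
        foreach (MethodInfo m in type.GetMethods(flags))
            lines.Add("Method: " + m.Name);
        foreach (ConstructorInfo c in type.GetConstructors(flags))
            lines.Add("Method: " + c.Name);   // constructors are named .ctor
        return lines;
    }

    public static void Main()
    {
        Console.WriteLine("TypeDef: " + typeof(SomeClass).FullName);
        foreach (string line in Describe(typeof(SomeClass)))
            Console.WriteLine("  " + line);
    }
}
```

Reflection is a higher-level view than the raw tables shown by ILdasm.exe, but it is built from the same metadata embedded in the module.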

Having covered the basic concepts, let's see what a managed module consists of (here, simply our ConsoleApplication_Test_Csharp.exe file, which displays the age of the object on the screen):

PE32 or PE32+ header - shows what type of processor the program will run on: PE32 (for 32-bit and 64-bit OS) or PE32+ (for 64-bit OS only).
CLR header - contains the information that makes this a managed module (flags, the CLR version, the Main() entry point).
Metadata - 2 types of metadata tables:
1) types and members defined in the source code;
2) types and members referenced in the source code.
IL code - the code generated by the compiler when compiling the C# code. At run time the CLR (more precisely, the JIT compiler) converts the IL to processor instructions (0001 0011 1101 ...).
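Some of this header information can be queried from inside a running program. The following is a minimal sketch, assuming the standard System.Reflection API: Module.GetPEKind reports whether the module is IL-only, whether it is PE32+, and which machine it targets. PeKindDemo and DescribeModule are hypothetical names used only here.

```csharp
using System;
using System.Reflection;

// Hypothetical helper for illustration.
public static class PeKindDemo
{
    // Asks the runtime how the module containing the given type was built.
    public static string DescribeModule(Type type)
    {
        PortableExecutableKinds peKind;
        ImageFileMachine machine;
        type.Module.GetPEKind(out peKind, out machine);
        return "PE kind: " + peKind + ", machine: " + machine;
    }

    public static void Main()
    {
        Console.WriteLine(DescribeModule(typeof(PeKindDemo)));
    }
}
```

The exact values printed depend on the compiler's platform setting, so no particular output is assumed here.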

JIT work

So, what happens when a program is started?
First, the header is analyzed to find out which kind of process to start (32-bit or 64-bit). Then the matching version of the MSCorEE.dll file is loaded (C:\Windows\System32\MSCorEE.dll on a 32-bit system).
Then a method located in MSCorEE.dll is called; it initializes the CLR, loads the assembly, and calls the entry point of our program, the Main() function.

    static void Main()
    {
        System.Console.WriteLine("Hello ");
        System.Console.WriteLine("Goodbye");
    }

To execute a method, for example System.Console.WriteLine("Hello"), its IL must be converted to machine instructions (those same zeros and ones). This is done by the JIT (just-in-time) compiler, also called the JITter.

First, before executing Main(), the CLR finds all the types referenced in its code (for example, the Console type).
Then it creates an internal data "structure" for each such type, with one record for each method defined in the type (here, Console).
Each record contains the address at which the method's implementation can be found. Initially every record points to the JIT compiler, so the first call to the WriteLine function invokes the JIT compiler. The JIT compiler knows which method is being called and which type defines it.

The JIT compiler looks up the method's IL code in the metadata of the corresponding assembly (here, the code of the WriteLine(string str) method).
Then it verifies the IL and compiles it into machine code (native instructions), storing the result in dynamically allocated memory.
After that, the JIT compiler goes back to the internal data "structure" of the type (Console) and replaces the address in the called method's record with the address of the memory block that holds the native processor instructions.
Finally, it jumps to that code: WriteLine(string str) executes and returns control to the Main() method. When Main() calls WriteLine(string str) a second time, the code has already been compiled, so the call bypasses the JIT compiler.
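The "replace the address after the first call" idea can be imitated in ordinary C#. What follows is a toy model only, not how the CLR is implemented: a dictionary plays the role of the type's internal structure, each entry starts out as a stub that pretends to compile the method, and the stub then overwrites its own entry with the real code. JitStubDemo, Register, Call and CompileCount are all invented for this sketch.

```csharp
using System;
using System.Collections.Generic;

// Toy model of the per-type method table; all names are hypothetical.
public static class JitStubDemo
{
    // Method name -> code that currently runs when the method is called.
    static readonly Dictionary<string, Action<string>> table =
        new Dictionary<string, Action<string>>();

    public static int CompileCount;   // how many times the pretend JIT ran

    // Registers a method whose first call goes through a compiling stub.
    public static void Register(string name, Action<string> body)
    {
        table[name] = delegate(string arg)
        {
            CompileCount++;           // stands in for "verify and compile the IL"
            table[name] = body;       // patch the record with the "native code"
            body(arg);                // then execute the method itself
        };
    }

    public static void Call(string name, string arg)
    {
        table[name](arg);
    }

    public static void Main()
    {
        Register("WriteLine", Console.WriteLine);
        Call("WriteLine", "Hello");     // goes through the stub
        Call("WriteLine", "Goodbye");   // goes straight to the patched entry
        Console.WriteLine("Compiled " + CompileCount + " time(s)");
    }
}
```

The second call never reaches the stub, so the counter stays at 1. This matches the behaviour described above, where only the first call pays for compilation.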

It follows from this description that a function "works slowly" only on its first call, when the JIT compiler translates its IL code into processor instructions. On all later calls the code is already in memory, optimized for this processor. However, if the program is launched again in another process, the JIT compiler will be called again for the same method. For applications running on x86, the JIT compiler generates 32-bit instructions; on x64 or IA64, it generates 64-bit instructions.
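To check which kind of instructions the JIT compiler is generating for the current process, it is enough to look at the pointer size. This is a small sketch; BitnessDemo and ProcessBits are hypothetical names.

```csharp
using System;

// Hypothetical helper for illustration.
public static class BitnessDemo
{
    // IntPtr.Size is 4 in a 32-bit process and 8 in a 64-bit process.
    public static int ProcessBits()
    {
        return IntPtr.Size * 8;
    }

    public static void Main()
    {
        Console.WriteLine("Running as a " + ProcessBits() + "-bit process");
    }
}
```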

Code optimization. Managed and Unmanaged Code

The compiler can optimize the IL it produces; for example, NOP (no-operation) instructions are removed from it. This is controlled by compiler switches.

A Debug build is compiled with the switches: /optimize- /debug:full
A Release build is compiled with the switches: /optimize+ /debug:pdbonly

What is the difference between managed and unmanaged code?

Unmanaged code is compiled for a particular processor and is simply executed when called.

In a managed environment, compilation is performed in 2 stages:

1) the compiler translates the C# code to IL;
2) at run time, the IL code has to be translated into the processor's machine code, which costs additional dynamic memory and time (this is exactly the JIT compiler's job).

Interaction with unmanaged code:

- managed code can call an unmanaged function from a DLL using P/Invoke (for example, CreateSemaphore from Kernel32.dll).
- managed code can use an existing COM component (server).
- unmanaged code can use a managed type (server). You can implement COM components in managed code, and then you do not need to do interface reference counting.
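As an illustration of the first case, here is a minimal P/Invoke sketch that calls CreateSemaphore from Kernel32.dll. It runs only on Windows. The parameter types are my own mapping of the Win32 signature onto managed types, and PInvokeDemo is a hypothetical name.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

// Hypothetical demo class; Windows only.
public static class PInvokeDemo
{
    // Unmanaged functions exported by Kernel32.dll.
    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    static extern IntPtr CreateSemaphore(IntPtr lpSemaphoreAttributes,
                                         int lInitialCount,
                                         int lMaximumCount,
                                         string lpName);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    static extern bool CloseHandle(IntPtr hObject);

    public static void Main()
    {
        // Unnamed semaphore, default security, initial count 0, maximum 1.
        IntPtr handle = CreateSemaphore(IntPtr.Zero, 0, 1, null);
        if (handle == IntPtr.Zero)
            throw new Win32Exception(Marshal.GetLastWin32Error());

        Console.WriteLine("Semaphore created through P/Invoke");
        CloseHandle(handle);   // release the unmanaged handle
    }
}
```

In real managed code the System.Threading.Semaphore class would normally be used instead; the direct call is shown only to make the mechanism visible.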

The /clr switch lets the Visual C++ compiler compile code into managed IL methods (unless the code contains inline assembler (__asm), functions with a variable number of arguments, or intrinsic routines (__enable, _ReturnAddress)). Where that is not possible, the code is compiled into standard x86 instructions. Note that even when C++ code is compiled to IL, its data is not managed: no metadata is generated for the types, and their objects are not tracked by the garbage collector.

Type system

In addition, I want to talk about the CTS, the type system adopted by Microsoft.

CTS (Common Type System) is the common type system of the CLR (a type is roughly the analogue of a C# class). It is a formal specification that describes how types are defined and how they behave. It also defines the rules for inheritance, virtual methods, and object lifetime. Microsoft submitted the CTS, along with other parts of the .NET Framework, to ECMA for standardization; the resulting standard is called the CLI (Common Language Infrastructure).

- The CTS supports only single inheritance (unlike C++)
- All types inherit from System.Object (Object is the name of the type, the root of all other types; System is the namespace)

According to the CTS specification, any type contains 0 or more members.

Key members:

Field - a variable that is part of the object's state. It is identified by name and type.
Method - a function that performs an action on an object. It has a name, a signature (the number of parameters, their order and types, and the return type) and modifiers.
Property - to the implementer it looks like a method (get/set), and to the caller like a field (=). Properties let the type that implements them validate input values and check the state of the object.
Event - provides a notification mechanism between objects.
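The four kinds of members can be shown together in one small type. This is an invented example rather than code from the book; Person, AgeChanged and MembersDemo are hypothetical names.

```csharp
using System;

// Invented example type showing all four kinds of members.
public class Person
{
    private int age;                        // field: part of the object's state

    public event EventHandler AgeChanged;   // event: notifies other objects

    public int Age                          // property: used like a field,
    {                                       // but runs code and can validate
        get { return age; }
        set
        {
            if (value < 0)
                throw new ArgumentOutOfRangeException("value", "Age cannot be negative.");
            age = value;
            EventHandler handler = AgeChanged;
            if (handler != null) handler(this, EventArgs.Empty);
        }
    }

    public string Describe()                // method: performs an action
    {
        return "My age is " + age;
    }
}

public static class MembersDemo
{
    public static void Main()
    {
        Person me = new Person();
        me.AgeChanged += delegate { Console.WriteLine("Age changed"); };
        me.Age = 22;                        // looks like a field assignment
        Console.WriteLine(me.Describe());   // prints "My age is 22"
    }
}
```

Assigning a negative value to Age throws an exception, which is the kind of input check a plain public field could not perform.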

Access modifiers:

Public - the method can be called by any code from any assembly
Private - the method can be called only from within the same type
Family (protected) - the method can be called by derived types, regardless of the assembly
Assembly (internal) - the method can be called by any code from the same assembly
Family or Assembly (protected internal) - the method can be called by derived types from any assembly and also by any types from the same assembly.
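The correspondence between the C# keywords and the CLR names can be checked through reflection, since MethodInfo exposes the accessibility stored in the metadata. This is a sketch with invented names (Sample, AccessDemo, Accessibility).

```csharp
using System;
using System.Reflection;

// Invented type with one method per access modifier.
public class Sample
{
    public void A() { }
    private void B() { }
    protected void C() { }
    internal void D() { }
    protected internal void E() { }
}

public static class AccessDemo
{
    // Translates a method's metadata flags into the CLR accessibility name.
    public static string Accessibility(string methodName)
    {
        MethodInfo m = typeof(Sample).GetMethod(methodName,
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);
        if (m.IsPublic) return "Public";
        if (m.IsPrivate) return "Private";
        if (m.IsFamily) return "Family";
        if (m.IsAssembly) return "Assembly";
        if (m.IsFamilyOrAssembly) return "Family or Assembly";
        return "Other";
    }

    public static void Main()
    {
        foreach (string name in new[] { "A", "B", "C", "D", "E" })
            Console.WriteLine(name + ": " + Accessibility(name));
    }
}
```

Running it reports Public, Private, Family, Assembly and Family or Assembly for methods A to E respectively.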

CLS (Common Language Specification) is a specification released by Microsoft. It describes the minimum set of features that compiler vendors must support so that the types their compilers generate can be used from other CLS-compliant languages running on the CLR. The CLR/CTS supports more features than the CLS defines. IL assembler supports the full range of CLR/CTS features. Other languages (C#, Visual Basic) support a subset of the CLR/CTS features (including the CLS minimum).
An example is shown in the figure.

Example of a CLS compliance check

The [assembly: CLSCompliant(true)] attribute tells the compiler to check every externally accessible type for constructs that are not valid in other languages.

    using System;

    [assembly: CLSCompliant(true)]

    namespace SomeLibrary
    {
        // Warnings are issued because the type is public
        public sealed class SomeLibraryType
        {
            // Warning: the return type is not CLS-compliant
            public UInt32 Abc() { return 0; }

            // Warning: the identifier abc() differs from the previous one
            // only in case, which is not CLS-compliant
            public void abc() { }

            // No warning: the method is private
            private UInt32 ABC() { return 0; }
        }
    }

First warning: UInt32 Abc() returns an unsigned integer. Not all languages can work with such values (older versions of Visual Basic, for example).
Second warning: the two public methods Abc() and abc() have the same name and differ only in letter case and return type. Visual Basic cannot call both methods.

If you remove public and leave just sealed class SomeLibraryType, both warnings disappear, because SomeLibraryType is then internal by default and is not visible from outside the assembly.

P.S. The article is based on material from J. Richter's book "CLR via C#. Programming on the Microsoft .NET Framework 2.0 in C#".
