CXXI: A bridge between the C# and C++ worlds

Original author: Miguel de Icaza
  • Translation
The Mono runtime has many tools for interoperating with code written in non-.NET languages, but there has never been anything sane for interoperating with C++ code.

That is about to change, thanks to the work of Alex Corrado, Andreia Gaita and Zoltan Varga.

In short, the new technology lets C#/.NET developers:

  • Easily and transparently use C++ classes from C# or any other .NET language
  • Create instances of C++ classes from C#
  • Call methods of C++ classes from C# code
  • Call inline C++ methods from C# code (provided that the library is compiled with the -fkeep-inline-functions flag, or that you build an additional library containing their implementations)
  • Inherit from C++ classes in C#
  • Override virtual methods of C++ classes with C# methods
  • Use instances of such mixed C++/C# classes from both C# and C++ code

CXXI (translator's note: pronounced "sexy") is the result of two months of work under the auspices of Google's Summer of Code, aimed at improving Mono's interoperability with C++ code.

Alternatives


As a reminder, Mono provides several mechanisms for interoperating with code in non-.NET languages, mostly inherited from the ECMA standard. These mechanisms include:
  1. Bidirectional Platform Invoke (P/Invoke), which lets managed code (C#) call functions in unmanaged libraries and lets the code in those libraries call back into managed code.
  2. COM Interop, which lets code running in Mono transparently call unmanaged C or C++ code, as long as that code follows a few COM conventions. These conventions are fairly simple: a standard vtable layout, implementations of the AddRef, Release and QueryInterface methods, and use of a standard set of types that can be marshalled between Mono and the COM library.
  3. A general-purpose call interception mechanism, which lets you intercept a method call on an object and decide for yourself what to do with it.
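
The COM conventions from item 2 can be sketched in portable C++. This is only an illustration under stated assumptions: `IUnknownLike`, `ILogger`, `Logger`, `CreateLogger` and the integer interface IDs are hypothetical names, not the real Windows declarations (which use GUIDs and HRESULT values). The sketch shows just the reference counting and interface query pattern that a COM-style object has to follow.

```cpp
#include <cstdio>

// Hypothetical stand-in for COM's IUnknown; real COM uses GUIDs and HRESULTs.
struct IUnknownLike {
    virtual long QueryInterface(int iid, void **out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IUnknownLike() = default;   // objects are destroyed only through Release
};

const int IID_UNKNOWN = 0;       // assumed interface IDs for this sketch
const int IID_LOGGER = 1;

struct ILogger : IUnknownLike {
    virtual void LogMessage(const char *msg) = 0;
};

class Logger final : public ILogger {
    unsigned long refs = 1;      // the creator holds the first reference
    ~Logger() = default;
public:
    long QueryInterface(int iid, void **out) override {
        if (iid == IID_UNKNOWN || iid == IID_LOGGER) {
            *out = static_cast<ILogger *>(this);
            AddRef();            // a successful query hands out a new reference
            return 0;            // success
        }
        *out = nullptr;
        return -1;               // interface not supported
    }
    unsigned long AddRef() override { return ++refs; }
    unsigned long Release() override {
        unsigned long left = --refs;
        if (left == 0) delete this;   // last reference gone: free the object
        return left;
    }
    void LogMessage(const char *msg) override { std::printf("log: %s\n", msg); }
};

// Factory function, needed because the destructor is private.
ILogger *CreateLogger() { return new Logger(); }
```

Every such class needs this boilerplate, which is part of why wrapping an existing C++ API in COM objects gets tedious.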


But when it comes to using C++ objects from C#, the options are not very encouraging. For example, suppose you want to use the following C++ class from C#:

class MessageLogger {
public:
	MessageLogger (const char *domain);
	void LogMessage (const char *msg);
};


One way to expose this class to C# code is to wrap it in a COM object. That may work for some high-level objects, but the wrapping process is very tedious and repetitive. You can see what this dull work looks like here.

Another option is to write glue wrappers that can then be called through P/Invoke. For the class above, they would look something like this:

/* bridge.cpp, compiled into bridge.so */
/* extern "C" keeps the names undecorated so that P/Invoke can find them */
extern "C" MessageLogger *Construct_MessageLogger (const char *msg)
{
	return new MessageLogger (msg);
}
extern "C" void LogMessage (MessageLogger *logger, const char *msg)
{
	logger->LogMessage (msg);
}

The C# side looks like this:

using System;
using System.Runtime.InteropServices;

class MessageLogger {
	// Opaque pointer to the native MessageLogger instance
	IntPtr handle;
	[DllImport ("bridge")]
	extern static IntPtr Construct_MessageLogger (string msg);
	public MessageLogger (string msg)
	{
		handle = Construct_MessageLogger (msg);
	}
	[DllImport ("bridge")]
	extern static void LogMessage (IntPtr handle, string msg);
	// Forwards the call to the native object through the bridge
	public void LogMessage (string msg)
	{
		LogMessage (handle, msg);
	}
}
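
The native half of such a bridge can be tried on its own. Below is a minimal self-contained sketch, assuming a stand-in `MessageLogger` that simply stores its messages. The `Count` method, `MessageLogger_Count` and `Destroy_MessageLogger` are hypothetical additions for illustration: the bridge shown above has no way to free the object, and a real binding would probably need one.

```cpp
#include <string>
#include <vector>

// Stand-in for the library class; the real one would come from the library's header.
class MessageLogger {
    std::string domain;
    std::vector<std::string> lines;
public:
    explicit MessageLogger(const char *domain) : domain(domain) {}
    void LogMessage(const char *msg) { lines.push_back(domain + ": " + msg); }
    int Count() const { return static_cast<int>(lines.size()); }  // assumed helper
};

// extern "C" gives the wrappers plain, undecorated symbol names.
extern "C" {

MessageLogger *Construct_MessageLogger(const char *domain) {
    return new MessageLogger(domain);
}

void LogMessage(MessageLogger *logger, const char *msg) {
    logger->LogMessage(msg);
}

// Hypothetical additions, not part of the bridge shown in the article.
int MessageLogger_Count(const MessageLogger *logger) { return logger->Count(); }

void Destroy_MessageLogger(MessageLogger *logger) { delete logger; }

}
```

Each public method of each class needs one such wrapper plus a matching DllImport declaration, which is where the tedium comes from.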

Spend half an hour writing wrappers like these and you will want to kill the author of the library, the compiler, the creators of C++ and C#, and then destroy this mortal and imperfect world altogether.

Our PhyreEngine# was a .NET binding to the C++ API of Sony's PhyreEngine. Writing that code was very tedious, so we hacked together something like a code generator.

Moreover, the approaches above do not let you override methods of a C++ class with C# code. More precisely, you can, but it requires writing a lot of code by hand, handling a pile of special cases and a great many callbacks. The bindings very quickly become practically unmaintainable (we ran into this ourselves while working on the PhyreEngine bindings).
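
To give an idea of the hand-written glue that overriding requires, here is a sketch of the callback approach on the C++ side. All names (`Control`, `ControlProxy`, `DrawCallback`, `Control_CreateProxy` and so on) are hypothetical. The assumption is that managed code passes a function pointer (for example a marshalled delegate) which the proxy subclass calls in place of the native virtual method.

```cpp
// Hypothetical library class with one virtual method.
class Control {
public:
    virtual ~Control() = default;
    virtual int Draw() { return 1; }     // native implementation
    int Render() { return Draw(); }      // library code that calls the virtual method
};

// Function pointer type that the managed side would supply.
using DrawCallback = int (*)(void *user_data);

// Subclass that forwards the virtual call to the callback when one is set.
class ControlProxy : public Control {
    DrawCallback on_draw;
    void *user_data;
public:
    ControlProxy(DrawCallback cb, void *data) : on_draw(cb), user_data(data) {}
    int Draw() override {
        return on_draw ? on_draw(user_data) : Control::Draw();
    }
};

// Flat entry points that P/Invoke could bind to.
extern "C" {
Control *Control_CreateProxy(DrawCallback cb, void *data) { return new ControlProxy(cb, data); }
int Control_Render(Control *c) { return c->Render(); }
void Control_Destroy(Control *c) { delete c; }
}
```

One proxy class, one callback type and one slot per virtual method are needed for every class that should be overridable, and that is before argument marshalling and object lifetime are taken into account.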

The ordeals described above prompted the creation of CXXI.

How it works


Accessing C++ classes poses a whole set of problems. I will briefly describe the C++ implementation details that matter most for CXXI:
  • Object layout: the binary representation of an object in memory, which may vary between platforms.
  • VTable layout: the list of pointers to virtual method implementations that the compiler uses to find a method's address. It depends on the virtual methods of the class and of its parents.
  • Decorated names: non-virtual methods are not placed in the vtable. Instead, the compiler generates ordinary C-style functions whose names are computed from the method's name and parameter types (and, with some compilers, its return type). The decoration scheme depends on the compiler.


For example, suppose we have these classes:

class Widget {
public:
	void SetVisible (bool visible);
	virtual void Layout ();
	virtual void Draw ();
};
class Label : public Widget {
public:
	void SetText (const char *text);
	const char *GetText ();
};


For these methods the C++ compiler will generate the following names (translator's note: this applies to compilers such as GCC and the Intel C++ Compiler for Linux; Visual Studio produces something unreadable like ?H@@YAXH@Z; with GCC you can decode the names with the c++filt utility):
__ZN6Widget10SetVisibleEb
__ZN6Widget6LayoutEv
__ZN6Widget4DrawEv
__ZN5Label7SetTextEPKc
__ZN5Label7GetTextEv
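
Names like these can also be decoded programmatically. The sketch below uses `abi::__cxa_demangle` from `<cxxabi.h>`, which is available with GCC and Clang only; the `demangle` helper itself is a hypothetical name. It strips one leading underscore from names starting with `__Z`, on the assumption that the extra underscore is a platform symbol prefix (as on macOS).

```cpp
#include <cxxabi.h>   // GCC/Clang only: abi::__cxa_demangle
#include <cstdlib>
#include <string>

// Turns an Itanium-ABI decorated name into a readable signature.
// Returns an empty string if the name cannot be demangled.
std::string demangle(std::string name) {
    // Some platforms prefix every symbol with an extra underscore.
    if (name.compare(0, 3, "__Z") == 0) name.erase(0, 1);
    int status = 0;
    char *out = abi::__cxa_demangle(name.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) return "";
    std::string result(out);
    std::free(out);   // the returned buffer is allocated with malloc
    return result;
}
```

For example, `demangle("__ZN6Widget10SetVisibleEb")` gives `Widget::SetVisible(bool)`. Note that the return type is not part of the decorated name in this scheme.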




The following C++ code:

	Label *l = new Label ();
	l->SetText ("foo");
	l->Draw ();	


It will be compiled into something similar to this (represented as C code):

	Label *l = (Label *) malloc (sizeof (Label));
	_ZN5LabelC1Ev (l);   // Decorated name of the Label constructor
	_ZN5Label7SetTextEPKc (l, "foo");
	// This line calls Draw, the second virtual method (vtable slot 1)
	(l->vtable [1]) (l);
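
The same lowering can be imitated by hand to make the mechanics visible. The sketch below models the vtable as a struct of function pointers. All names (`WidgetVTable`, `Label_Draw`, `make_and_draw` and so on) are made up for illustration, the `draw_count` field exists only to make the effect observable, and a real compiler's vtable carries extra entries (such as type information) that are omitted here.

```cpp
#include <cstdlib>
#include <cstring>

struct Label;                        // forward declaration

// Hand-written vtable: one slot per virtual method, in declaration order.
struct WidgetVTable {
    void (*Layout)(Label *self);     // slot 0
    void (*Draw)(Label *self);       // slot 1
};

// Object layout: the vtable pointer comes first, then the fields.
struct Label {
    const WidgetVTable *vtable;
    const char *text;
    int draw_count;                  // only here so the demo is observable
};

// Plain functions standing in for the decorated method implementations.
static void Label_Layout(Label *) {}
static void Label_Draw(Label *self) { self->draw_count++; }
static void Label_SetText(Label *self, const char *text) { self->text = text; }

static const WidgetVTable label_vtable = { Label_Layout, Label_Draw };

// Plays the role of the constructor: sets the vtable pointer and the fields.
static void Label_ctor(Label *self) {
    self->vtable = &label_vtable;
    self->text = "";
    self->draw_count = 0;
}

// Equivalent of: Label *l = new Label (); l->SetText (text); l->Draw ();
Label *make_and_draw(const char *text) {
    Label *l = static_cast<Label *>(std::malloc(sizeof(Label)));
    Label_ctor(l);
    Label_SetText(l, text);          // non-virtual: direct call by name
    l->vtable->Draw(l);              // virtual: indirect call through slot 1
    return l;                        // the caller releases it with std::free
}
```

A binding generator has to reproduce exactly these two kinds of call: a lookup by decorated name for non-virtual methods, and an indexed load from the vtable for virtual ones.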


For CXXI to support all this, it needs to know the exact position of each method in the vtable, know where and how each method is implemented, and know how to reach it by its decorated name.

The diagram below shows how a C++ library is made available to C# and other .NET languages.


In effect, your C++ code is compiled twice: the C++ compiler produces the unmanaged library, and the CXXI toolchain produces the bindings.

Strictly speaking, CXXI only needs the header files of your C++ code, and only those you want to wrap for use from C#. So even if all you have is a proprietary library and its header files, CXXI can still generate bindings.

The CXXI toolchain creates an ordinary .NET library (translator's note: it really is a pure .NET library, containing MSIL and nothing else, with no unmanaged code) that you can safely use from C# and other .NET languages. This library exposes C# classes with the following properties:

  • When you create an instance of the C# class, its constructor creates an instance of the corresponding C++ class.
  • These classes can serve as base classes for other C# classes, and every method marked virtual can be overridden in C# code.
  • Multiple inheritance of C++ classes is supported: the generated C# class implements a set of conversion operators that give access to each of the C++ base classes.
  • Overridden methods can use the C# base keyword to call the methods of the C++ base class.
  • You can override any of the class's virtual methods, including in the multiple inheritance case.
  • There is also a constructor that accepts an IntPtr, for when you want to use an instance of the C++ class that someone else has already created.
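
One reason explicit conversion operators are needed for multiple inheritance: a pointer to a C++ object and a pointer to one of its base parts are, in general, different addresses. A small sketch with hypothetical classes (`Drawable`, `Clickable`, `Button`) shows the adjustment that the C++ compiler performs silently and that a binding has to reproduce:

```cpp
#include <cstddef>

struct Drawable  { virtual ~Drawable() = default;  int d = 1; };
struct Clickable { virtual ~Clickable() = default; int c = 2; };

// A class with two C++ bases: each base subobject sits at its own offset.
struct Button : Drawable, Clickable { int b = 3; };

// Byte distance between the start of the object and its Clickable part.
std::ptrdiff_t clickable_offset(Button *obj) {
    Clickable *as_clickable = obj;   // implicit conversion adjusts the pointer
    return reinterpret_cast<char *>(as_clickable) - reinterpret_cast<char *>(obj);
}

// Going back needs the opposite adjustment, which static_cast performs.
Button *from_clickable(Clickable *c) { return static_cast<Button *>(c); }
```

On common ABIs `clickable_offset` is non-zero, so handing the raw object pointer to a method of the second base class would be wrong.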


The CXXI pipeline consists of three components, shown in the diagram on the right.

The GCC-XML compiler is used to parse your C++ code and extract the necessary information from it. The generated XML is then processed by the CXXI tools to produce a set of partial C# classes that contain the actual bridges to the C++ classes.

This is then combined with any additional code you want to add (for example, a few overloaded methods to improve the API, a ToString implementation, async methods and so on).

The output is a .NET assembly that works with the native library.

It is worth noting that this assembly does not encode the in-memory layout of objects. Instead, the CXXI binder works the layout out at run time from the ABI in use and its layout rules. As a result, you only need to build the bindings once and can then use them on different platforms.
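
To illustrate what deriving a layout from ABI rules involves, here is a sketch that computes field offsets from sizes and alignments using the common sequential, naturally aligned placement rule. `Field`, `Layout` and `compute_layout` are hypothetical names, and this is not CXXI's actual algorithm: real ABIs add rules for vtable pointers, base classes, bit-fields and more.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of one field: its size and required alignment.
struct Field { std::size_t size; std::size_t align; };

struct Layout {
    std::vector<std::size_t> offsets;   // offset of each field, in order
    std::size_t size = 0;               // total size including tail padding
    std::size_t align = 1;              // alignment of the whole object
};

// Rounds n up to the next multiple of a (a must be non-zero).
static std::size_t round_up(std::size_t n, std::size_t a) {
    return (n + a - 1) / a * a;
}

// Places the fields one after another, padding each to its alignment.
Layout compute_layout(const std::vector<Field> &fields) {
    Layout out;
    std::size_t cursor = 0;
    for (const Field &f : fields) {
        cursor = round_up(cursor, f.align);
        out.offsets.push_back(cursor);
        cursor += f.size;
        if (f.align > out.align) out.align = f.align;
    }
    out.size = round_up(cursor, out.align);
    return out;
}
```

With these rules a struct holding a one-byte field followed by an eight-byte field occupies 16 bytes if the second field is aligned to 8, but 12 bytes if the platform aligns it to 4. Baking offsets into the assembly would therefore tie it to a single platform.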

Examples


The project's code on GitHub contains various tests and a number of examples. One of them is a minimal set of Qt bindings.

What is left to implement


Unfortunately, the CXXI project is not finished yet, but it is already a good start towards a tangible improvement in interoperability between .NET and C++ code.

At the moment CXXI does all of its work at run time, generating adapters through System.Reflection.Emit as needed, which makes it possible to determine dynamically the ABI used by the compiler of the C++ library.

We are also going to add support for static compilation, which will make this technology available to those writing C# for the PS3 and the iPhone.

CXXI currently supports the GCC ABI and has initial support for the MSVC ABI. We would be glad to get help with implementing ABI support for other compilers and with completing the MSVC support.
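
The two ABI families are easy to tell apart from the symbols a library exports, because their decoration schemes start differently: Itanium-style names (used by GCC) begin with `_Z`, while MSVC-decorated C++ names begin with `?`. The helper below is a hypothetical heuristic built on that fact, not the way CXXI itself selects an ABI.

```cpp
#include <string>

enum class Abi { Itanium, Msvc, Unknown };

// Guesses which C++ ABI produced a symbol from its decoration prefix.
Abi guess_abi(const std::string &symbol) {
    // "__Z" covers platforms that add an extra leading underscore.
    if (symbol.compare(0, 2, "_Z") == 0 || symbol.compare(0, 3, "__Z") == 0)
        return Abi::Itanium;
    if (!symbol.empty() && symbol[0] == '?')
        return Abi::Msvc;
    return Abi::Unknown;             // e.g. a plain C function name
}
```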

At present CXXI can only delete objects that it created itself. All other objects are assumed to belong to the unmanaged world. Support for the delete operator on such objects would also be useful.
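
This ownership rule can be pictured as a flag stored next to the native pointer. The sketch below is an assumption about how such a rule might be expressed, not CXXI's implementation; `Native` and `NativeHandle` are hypothetical names, and the instance counter exists only to make the behavior observable.

```cpp
// Hypothetical native class that counts its live instances.
struct Native {
    static int live;
    Native() { ++live; }
    ~Native() { --live; }
};
int Native::live = 0;

// Wrapper that deletes the native object only if it created it.
class NativeHandle {
    Native *ptr;
    bool owns;
public:
    // The wrapper creates the object itself, so it owns it.
    NativeHandle() : ptr(new Native()), owns(true) {}
    // Wraps an object created elsewhere (the analogue of the IntPtr constructor).
    explicit NativeHandle(Native *existing) : ptr(existing), owns(false) {}
    NativeHandle(const NativeHandle &) = delete;
    NativeHandle &operator=(const NativeHandle &) = delete;
    ~NativeHandle() { if (owns) delete ptr; }
    Native *get() const { return ptr; }
};
```

Objects wrapped through the second constructor outlive the wrapper and must be released by whoever created them.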

We also want to document the pipeline and the runtime API better, and to improve the bindings themselves.

From the translator


This approach compares favorably with writing tons of glue code in C++/CLI: here all the work is done for you, and the result is even cross-platform. It is also worth noting that an article about a similar way of calling C++ class methods appeared on Habr last year, although much of the work there was done by hand. According to its author, however, using wrappers generated on the fly turned out to be one and a half times faster than COM Interop (on Microsoft's runtime).
One more thing. It is not mentioned in the article, but judging by the test cases on GitHub, you can also access the fields of C++ objects.
How usable is it? In theory, you can take any C++ library right now and generate bindings for it (on Windows you will need to build it under Cygwin). It will work fine as long as the library has no methods that return freshly created object instances, because at the moment those cannot be deleted; in Qt, however, QObject has a deleteLater() slot, so there should be no problems. In practice, the generator crashed while trying to generate bindings for Irrlicht, and GCC-XML crashed on OGRE, choking on something from std::tr1. Generally speaking, it would be worth dropping GCC-XML in favor of clang, since GCC-XML is updated very rarely and, as it turned out, works unreliably. Still, the examples include working bindings for some QtGui classes (incomplete ones: nobody has yet built the QObject infrastructure with all its meta-information and signals and slots).
