UniSharping: converting C# code to Java and Python

    Introduction


    Simplified English has been under development since the 1970s. Its purpose is to define a subset of the language that a wide range of non-native speakers can understand; it is recommended, for example, for technical documentation. Machine translation of text written in such a subset is naturally more accurate and, ideally, produces text that needs no manual proofreading.
    The same approach can be applied to C# for the task of automatically converting code into other programming languages: select a subset of language constructs, system libraries and technologies that can be translated into a wide range of other languages. The conversion is then not a one-time migration but a continuous process that extends the integration options of a C# project, so that working code in another language can be obtained at any time without manual editing.


    Introducing UniSharping


    The restricted form of C# .NET that solves this problem is called U# (Universal Sharp), and the conversion process and its tool are called UniSharping. Executables, settings and documentation are published on GitHub; the system is free for non-commercial use (Non-Commercial Freeware).


    For cross-platform purposes, Microsoft has already restricted the .NET Framework in terms of libraries and technologies: the result is .NET Core. That was a first step in the right direction; U# takes a second step, towards “cross-programmability”.


    U# imposes only a few restrictions on language constructs: the goto and goto case relics, and yield, which cannot be modeled adequately in an automatic way. Using struct is possible but not recommended, and there are some nuances with names; all of this is described in detail in a separate document. The U# parser reports errors and warnings, and to guarantee correct generation you should fix the C# source until, ideally, none remain. If the original variant must be kept, you can use the preprocessor directives #if JAVA || PHP ... #else ... #endif. These restrictions, like the list of supported languages, are fixed at the level of the U# engine and cannot be changed from outside.


    Restrictions at the level of system libraries, by contrast, are not rigidly fixed: they are configured externally through special text files that define how a given class and its members are translated into each target language. If a direct analogue exists, it is specified; in more complicated situations, either a code fragment in the target language is written or a special (service) class is provided that solves the task. In very complicated cases the translation has to be hardcoded at the engine level, but such situations are rare (about a dozen). The procedure for configuring system classes and their members is described in a separate document. The list of C# classes and members supported in the current version, with their Java and Python counterparts, is available on the site, along with an online demo.
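
    The three kinds of rule mentioned above (direct analogue, target-language code fragment, service class) can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the RULES layout, the translate_member function and the Utils.is_null_or_empty helper are hypothetical and do not reproduce UniSharping's actual settings format, which is described in the project's own documentation.

```python
# Hypothetical rule table: NOT the real UniSharping settings format.
# Each C# member maps to one of the three rule kinds described in the text.
RULES = {
    # direct analogue: the member has a one-to-one Python counterpart
    "System.String.ToUpper": ("direct", "upper"),
    # code fragment: a target-language template, {0} stands for the receiver
    "System.String.Length": ("fragment", "len({0})"),
    # service class: delegate to a hand-written helper (the name is invented)
    "System.String.IsNullOrEmpty": ("service", "Utils.is_null_or_empty"),
}


def translate_member(member: str, receiver: str) -> str:
    """Return Python source text for using `member` on `receiver`."""
    if member not in RULES:
        # An unmapped member is reported instead of being guessed at.
        raise KeyError(f"no rule for {member}; add one to the settings")
    kind, target = RULES[member]
    if kind == "direct":
        return f"{receiver}.{target}()"
    if kind == "fragment":
        return target.format(receiver)
    if kind == "service":
        return f"{target}({receiver})"
    raise ValueError(f"unknown rule kind: {kind}")
```

    Keeping such a table outside the engine is what allows new classes to be supported without touching the converter itself; a member with no rule is surfaced as an error rather than translated incorrectly.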


    As for technologies, the list is currently limited to console applications and unit tests (UnitTest). Lib projects, as a special case, are also translated into the corresponding constructs of the target language.


    For a successful translation, the original C# project (solution) must contain some runnable part that verifies that the code works within C# itself. Ideally this is an extensive suite of automated tests (standard UnitTest in its various implementations, or home-grown ones), but at a minimum there must be a console application that runs correctly when started, without any user intervention. The reason is obvious: once the code in the target language has been generated, you can immediately check that it works. Ideally, all tests should behave the same as in C#.
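
    As an illustration of what such a minimal, non-interactive check might look like on the target-language side, here is a short Python sketch. The check names and the run_checks and main functions are placeholders invented for this example; a real project would exercise its own API instead.

```python
def run_checks(checks):
    """Run each (name, callable) pair and return the names that failed."""
    failed = []
    for name, check in checks:
        try:
            ok = check()
        except Exception:
            ok = False  # a crash counts as a failure
        if not ok:
            failed.append(name)
    return failed


# Placeholder checks; a real project would call its own converted code here.
CHECKS = [
    ("upper", lambda: "abc".upper() == "ABC"),
    ("split", lambda: "a,b".split(",") == ["a", "b"]),
]


def main() -> int:
    """Run all checks without user input and return a process exit code."""
    failed = run_checks(CHECKS)
    for name in failed:
        print(f"FAILED: {name}")
    print(f"{len(CHECKS) - len(failed)} of {len(CHECKS)} checks passed")
    return 1 if failed else 0  # non-zero signals failure to the caller
```

    Ending the script with sys.exit(main()) (after import sys) turns the result into the process exit code, so the same run can be compared against the C# original by a build script without anyone reading the output.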


    Project history


    The idea of such a converter has been around for a long time. My main project, the Pullenti SDK for natural language processing, is an ideal candidate for conversion: a large amount of complex, constantly evolving code. To integrate it with Java, we had to resort to wrappers such as web services, TCP servers and so on.
    Last summer I found the time and energy to create the first version. It translated the Pullenti project into Java, and also translated itself into Java.
    Over the next six months the converter was developed further on several internal company projects, mainly by extending the set of supported system classes.
    In the spring of 2018 the idea arose of supporting Python as well, and this was implemented by the summer. But the initial version had not been designed for a second language, and the result was clumsy. Over the summer I had to rework the engine completely to allow for multiple target languages. The settings for system classes were also moved out of the hardcode into external text files. I hope this set will grow, not without your help.


    Further plans are as follows:


    • Bring Python up to the level of Java. Python is currently supported at the level required by Pullenti, whereas Java has moved far ahead of it thanks to other projects.
    • Support PHP, at least at the level required by the Pullenti project.
    • Support C++. Yes, I realize this is very difficult, because it is unclear when to free memory: which pointer is merely a reference, and which one needs delete. But there are ideas ...

    Who may find it useful


    Mostly those who develop potentially cross-platform SDKs in C#. With the UniSharping converter, their SDKs can also become “cross-programmable”, which widens the circle of potential users.
    Open-source software has recently gained ground in Russia, where it has become mandatory in most government agencies and some large companies. Explaining that .NET Core is also open source does not always work, because "Microsoft". Suppose a company develops its information system in C#. To introduce the product into an open-source-only organization, it can separate out the logical part of the project (back-end), convert it automatically as releases come out, and build the visual part (front-end) on open-source technology. In other words, development continues in C#, and only the front-end is written in Java.


    I do not rule out that converting web projects is possible in principle (with limitations, of course), but I lack the necessary skills and knowledge. If anyone sees such an opportunity, it could well be implemented in UniSharping.


    Note that for a real, complex C# project, supporting Java or another language will take some effort: modifying the code, isolating the portable part of the project and covering it with unit tests. Configuring system classes and methods that are not yet supported, and fixing bugs in UniSharping itself (with my help), is also real work. But the process converges, and at the end of it the project gains the “cross-programmability” bonus.

