CPrompt - C language interpreter
Since June 2009, I have been developing a C interpreter. (I already mentioned it in the article on function calls.)
Quite a few constructs are already implemented: loops, selection statements, expression evaluation, function calls (both user-declared and standard), includes, and more.
The program is named CPrompt, by analogy with the command-line prompt (D:>, root@comp#, ...).
A little about the work of the interpreter
Starting to interpret a source file is simple:
$ cprompt /path/to/file.c
The interpretation is divided into 3 stages:
1) Preprocessing
2) Building the execution tree
3) Calling main()
The first stage, preprocessing, is important but boring: it handles includes, defines, comments, and more.
The output is clean code, without directives or extra text.
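To make one small part of the preprocessing stage concrete, here is a minimal sketch in C++ of comment removal. It is not CPrompt's actual code: the function name strip_comments and its behaviour (a block comment becomes a single space, string and character literals are copied untouched) are assumptions made for illustration only.

```cpp
#include <string>

// Hypothetical helper: removes /* ... */ and // comments from C source.
// String and character literals are copied verbatim.
std::string strip_comments(const std::string& src) {
    std::string out;
    std::size_t i = 0;
    while (i < src.size()) {
        char c = src[i];
        if (c == '"' || c == '\'') {
            // Copy a literal as-is, honouring backslash escapes.
            char quote = c;
            out += src[i++];
            while (i < src.size() && src[i] != quote) {
                if (src[i] == '\\' && i + 1 < src.size()) out += src[i++];
                out += src[i++];
            }
            if (i < src.size()) out += src[i++];  // closing quote
        } else if (c == '/' && i + 1 < src.size() && src[i + 1] == '/') {
            // Line comment: skip up to the newline, which is kept.
            while (i < src.size() && src[i] != '\n') ++i;
        } else if (c == '/' && i + 1 < src.size() && src[i + 1] == '*') {
            // Block comment: skip to the closing */ and leave one space.
            i += 2;
            while (i + 1 < src.size() && !(src[i] == '*' && src[i + 1] == '/')) ++i;
            i = (i + 1 < src.size()) ? i + 2 : src.size();
            out += ' ';
        } else {
            out += src[i++];
        }
    }
    return out;
}
```

Includes and defines would need separate passes of their own; they are left out of this sketch.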
The second stage builds the execution tree. The code is turned into a graph (a tree) whose root is an APPLICATION object that holds information about the file being run. Functions live in its subtrees, and each token becomes an element of the tree. Each element has its own "type": a number that indicates its purpose.
Functions are of 2 kinds: declared by the user and "external". The latter are, for example, the standard functions from libc or functions from other shared libraries.
The output is a tree similar to this:
...
Tree was built:
t0; APPLICATION ;;;
| t0; FUNCTIONS ;;;
| | t14; double; cos ;;
| | t14; double; floor ;;
| | t14; int; isdigit ;;
...
| | t2; double; round ;;
| | | t5; floor (value + 0.5) ;;;
| | t14; int; printf ;;
| | t2; int; main ;;
| | | t1; int; l; round (7.2);
...
(This is part of the log that the interpreter prints.)
t0, t2, t14 and so on are the element types:
0 - no type
1 - expression (assignment is treated as one of the operations, along with +, -, * ... but with a different priority)
2 - function
5 - “return”
14 - “external function”
There are others, for different actions.
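The note on type 1, that assignment is just another operation with its own priority, can be illustrated with a small recursive-descent evaluator in C++. This is only a sketch under my own assumptions: the class name Evaluator, its use of doubles for every value, and the grammar (numbers, variables, parentheses, + - * / and =) are invented here and say nothing about how CPrompt's parser is really written. Comparison with == is not supported.

```cpp
#include <cctype>
#include <cstdlib>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical evaluator: '=' is the lowest-priority, right-associative operator.
class Evaluator {
public:
    std::map<std::string, double> vars;

    double eval(const std::string& text) {
        src_ = text;
        pos_ = 0;
        double v = parse_assignment();
        skip_spaces();
        if (pos_ != src_.size()) throw std::runtime_error("unexpected input");
        return v;
    }

private:
    std::string src_;
    std::size_t pos_ = 0;

    bool at_alpha() const {
        return pos_ < src_.size() && std::isalpha(static_cast<unsigned char>(src_[pos_]));
    }
    void skip_spaces() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }
    bool accept(char c) {
        skip_spaces();
        if (pos_ < src_.size() && src_[pos_] == c) { ++pos_; return true; }
        return false;
    }
    std::string parse_name() {
        std::size_t start = pos_;
        while (pos_ < src_.size() &&
               (std::isalnum(static_cast<unsigned char>(src_[pos_])) || src_[pos_] == '_')) ++pos_;
        return src_.substr(start, pos_ - start);
    }

    // assignment := name '=' assignment | sum
    double parse_assignment() {
        skip_spaces();
        std::size_t start = pos_;
        if (at_alpha()) {
            std::string name = parse_name();
            if (accept('=')) {
                double v = parse_assignment();  // right-associative: a = b = 1
                vars[name] = v;
                return v;
            }
            pos_ = start;  // not an assignment, rewind and parse as a sum
        }
        return parse_sum();
    }
    // sum := product (('+' | '-') product)*
    double parse_sum() {
        double v = parse_product();
        for (;;) {
            if (accept('+')) v += parse_product();
            else if (accept('-')) v -= parse_product();
            else return v;
        }
    }
    // product := primary (('*' | '/') primary)*
    double parse_product() {
        double v = parse_primary();
        for (;;) {
            if (accept('*')) v *= parse_primary();
            else if (accept('/')) v /= parse_primary();
            else return v;
        }
    }
    // primary := '(' assignment ')' | name | number
    double parse_primary() {
        if (accept('(')) {
            double v = parse_assignment();
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return v;
        }
        if (at_alpha()) {
            std::string name = parse_name();
            auto it = vars.find(name);
            if (it == vars.end()) throw std::runtime_error("unknown variable: " + name);
            return it->second;
        }
        const char* begin = src_.c_str() + pos_;
        char* end = nullptr;
        double v = std::strtod(begin, &end);
        if (end == begin) throw std::runtime_error("expected a number");
        pos_ += static_cast<std::size_t>(end - begin);
        return v;
    }
};
```

Because assignment sits at the bottom of the priority ladder, "x = 2 + 3 * 4" first computes the whole right-hand side and only then stores it in x.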
Function arguments are stored in separate structures, so they do not appear in the visible part of the tree. Instead, each function branch holds pointers to them.
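A tree of this shape could be sketched in C++ roughly as follows. The type numbers are the ones from the legend; everything else (the names Node, ArgList, dump and build_and_dump_example, the enumerator names, the parameter type double for value, and the simplified output format without the trailing semicolons of the real log) is a hypothetical illustration, not the interpreter's actual data structure.

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Type numbers from the legend; the enumerator names are invented.
enum NodeType { kNone = 0, kExpression = 1, kFunction = 2, kReturn = 5, kExternal = 14 };

// Hypothetical side structure holding a function's parameters.
struct ArgList {
    std::vector<std::pair<std::string, std::string>> params;  // (type, name)
};

struct Node {
    int type;
    std::vector<std::string> fields;              // e.g. {"double", "round"}
    std::shared_ptr<ArgList> args;                // set for function nodes only
    std::vector<std::unique_ptr<Node>> children;

    Node(int t, std::vector<std::string> f) : type(t), fields(std::move(f)) {}

    // Appends a child and returns a pointer to it (the child stays owned by this node).
    Node* add(int t, std::vector<std::string> f) {
        children.push_back(std::make_unique<Node>(t, std::move(f)));
        return children.back().get();
    }
};

// Prints one node per line, indented with "| " per level.
// Arguments are reachable through Node::args but are not printed.
void dump(const Node& n, std::ostream& out, int depth = 0) {
    for (int i = 0; i < depth; ++i) out << "| ";
    out << 't' << n.type;
    for (const auto& f : n.fields) out << "; " << f;
    out << '\n';
    for (const auto& c : n.children) dump(*c, out, depth + 1);
}

// Builds a fragment of the tree shown in the log and returns its dump.
std::string build_and_dump_example() {
    Node app(kNone, {"APPLICATION"});
    Node* funcs = app.add(kNone, {"FUNCTIONS"});
    funcs->add(kExternal, {"double", "floor"});
    Node* round_fn = funcs->add(kFunction, {"double", "round"});
    round_fn->args = std::make_shared<ArgList>();
    round_fn->args->params.push_back({"double", "value"});  // assumed parameter type
    round_fn->add(kReturn, {"floor (value + 0.5)"});
    std::ostringstream out;
    dump(app, out);
    return out.str();
}
```

Keeping the arguments behind a pointer is what makes them invisible in the dump while still being available when the function is called.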
All program execution is logged in great detail: with the --dbg parameter, a lot of information is written to standard output.
If you look at the log, you will notice that execution is divided into 5 steps, not 3 as I said at the beginning. The two extra steps are parsing passes over the source text: one before preprocessing, on the raw text, and one after it, on the processed text.
The only departure from the standard that I allowed myself is the “outside” construct, which, since static linking is impossible, lets you declare functions from external libraries. For now you can declare only functions from the libraries the interpreter itself was built with, but in the future it will be possible to import from shared libraries.
For example,
outside cdecl: double cos (double x);
outside cdecl: double floor (double x);
Here cdecl is the calling convention. More on this in the article on function calls.
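One way to picture the current restriction (only functions from libraries the interpreter was built with) is a lookup table from names to function pointers compiled into the interpreter. The C++ sketch below is an assumption about how such a table might look, not a description of CPrompt's internals: builtin_table, is_known_external and call_external are hypothetical names, and the table covers only the two one-argument functions declared in the example above.

```cpp
#include <cmath>
#include <map>
#include <stdexcept>
#include <string>

// Signature shared by cos and floor: double f(double).
using UnaryMathFn = double (*)(double);

// Thin wrappers, so the table stores pointers to our own functions.
static double call_cos(double x) { return std::cos(x); }
static double call_floor(double x) { return std::floor(x); }

// Hypothetical table of functions linked into the interpreter itself.
const std::map<std::string, UnaryMathFn>& builtin_table() {
    static const std::map<std::string, UnaryMathFn> table = {
        {"cos", call_cos},
        {"floor", call_floor},
    };
    return table;
}

// True if a declared name can be resolved against the built-in table.
bool is_known_external(const std::string& name) {
    return builtin_table().count(name) != 0;
}

// Calls a built-in external function by name, or throws if it is unknown.
double call_external(const std::string& name, double arg) {
    auto it = builtin_table().find(name);
    if (it == builtin_table().end())
        throw std::runtime_error("unknown external function: " + name);
    return it->second(arg);
}
```

Importing from arbitrary shared libraries would replace the fixed table with a lookup at run time, and functions with other signatures would need a more general calling mechanism than a single pointer type.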
The product is written in C++.
View or download the source code (Google Code).
I hope a working, usable release will be available soon, as every day I get closer to the standard. The ultimate goal is full C99 support.
Example
#include
int main(int argc, char* argv[])
{
    int l = round(7.2);
}
For this code, the log looks like this.
There is a lot of superfluous output, but you can make sense of it if you wish.
What uses do you see for a C interpreter? Where could this product be applied?
UPD: Moved to the Programming Languages section.
UPD2: Added a makefile to the repository.
To compile:
svn checkout cprompt.googlecode.com/svn/trunk/ cprompt
cd cprompt
make
and run ./cprompt /path/to/file