OCaml and PHP - Esoterics for Your Convenience

    NB: it is best not to eat while reading this post; you might choke from surprise.
    NB: the less important pieces of code had to be moved to pastebin because the site truncates long posts. Follow the links in the text.

    Many people classify OCaml as marginal or even esoteric. Perhaps they are right, although plenty of others disagree. My acquaintance with it began six months ago, when I once again wanted to learn something new and decided that I should master at least one functional language. Out of the many candidates I chose Objective Caml. The language won me over with its humane syntax and its philosophy: all the functional joys of life are there, but if you want imperative style and OOP, go ahead, it has those too. Its developers clearly understood that different tasks call for different tools. After three days of reading a manual aimed at C++ and Perl programmers I could read code and write hello-world programs. That is where my acquaintance with the language ended, because studying a language without a real task is pointless.

    I returned to OCaml a couple of weeks ago, when an interesting task came up. I looked at it and realized that it could in principle be solved entirely in PHP, or, as I like to do, with a PHP extension written in C++, but that a functional language would suit it much better. So we have a main program in PHP from which we want to call OCaml functions, getting the obvious profit and the simple joy of an idiot and Michurin in one bottle. In this article I will give only simple code that demonstrates the principle, without the tricky optimizations that would make the code much faster but hurt its readability. In addition, I tried to keep the PHP and OCaml wrappers as separate as possible, so data is not converted directly PHP ↔ OCaml but along the path PHP ↔ C ↔ OCaml, which reduces the speed further. But the principle will be much clearer, but with optimization,
    We define the problem as follows: take an array of structures from PHP, filter it in some way, for example by the value of some field, and return the remaining array.

    OCaml


    Let's get started. To begin with, we will write our OCaml part, which will do all the work. We define in the ocamlpart.ml file the data types with which we will work. Let it be some chemical groups:
    type group_position = UndefPos | LeftPos | RightPos;;
    type cycle_type = UndefCycle | NoneCycle | AliphaticCycle | AromaticCycle | HeteroCycle;;
    type group_type = OrgGroup | InorgGroup | NeighbourGroup of int;;

    type group = {
            name : string;
            position : group_position;
            cycle : cycle_type;
            grouptype : group_type;
            link : int;
    };;


    Note that we have an analogue of a C structure (group), analogues of enum (group_position, cycle_type) and a hybrid type that can be either an enum value or an arbitrary number (group_type). Now add the filter function:
    let filter_org = List.filter (fun r -> r.grouptype = OrgGroup)

    This function filters the list of groups by the value of the grouptype field; it is essentially an analogue of PHP's array_filter.
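    For readers more at home in C++ than in OCaml, the same filtering can be sketched with std::copy_if. This is only an illustration of what filter_org computes, not part of the extension: the Group struct is a cut-down, hypothetical stand-in for the article's type, and filter_org_sketch is a name invented here.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Cut-down, hypothetical stand-in for the article's group type.
enum GroupKind { OrgGroup = -2, InorgGroup = -1 };

struct Group {
    std::string name;
    int group_type;  // OrgGroup, InorgGroup, or a neighbour index >= 0
};

// Keep only the organic groups, preserving their relative order,
// which is also what List.filter does on the OCaml side.
std::vector<Group> filter_org_sketch(const std::vector<Group>& groups) {
    std::vector<Group> kept;
    std::copy_if(groups.begin(), groups.end(), std::back_inserter(kept),
                 [](const Group& g) { return g.group_type == OrgGroup; });
    return kept;
}
```

    The OCaml one-liner does the same job without a temporary container or an explicit predicate type, which is the convenience the article is after.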
    Finally, add the registration of the filter function for use in C:
    let _ =
            Callback.register "filter_organic" filter_org
    ;;

    This block will execute when initializing the OCaml part from C, providing us with a wonderful named callback.
    Now we need to write a wrapper for this.

    C / C++


    First of all, we will prepare a common cpart.h file, in which there will be common parts for OCaml and PHP wrappers:
    #include <vector>

    typedef enum _group_position {
            UndefPos = 0,
            LeftPos,
            RightPos
    } group_position;

    typedef enum _cycle_type {
            UndefCycle = 0,
            NoneCycle,
            AliphaticCycle,
            AromaticCycle,
            HeteroCycle
    } cycle_type;

    typedef enum _group_pos {
            OrgGroup = -2,
            InorgGroup = -1
    } group_pos;

    typedef struct _group {
            char *name;
            group_position position;
            cycle_type cycle;
            int group_type;
            int link;
    } group;

    void init_ocaml ();

    std::vector<group> filter_org (std::vector<group> g);

    We defined C counterparts of the OCaml types and declared two functions: one that initializes OCaml and the main wrapper function that filters the array of data.
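    The header flattens the hybrid group_type into a plain int, with OrgGroup = -2 and InorgGroup = -1. One convention consistent with those values is sketched below. It is an assumption, since the actual conversion code is only linked and not shown here: non-negative values are taken to carry the payload of NeighbourGroup. The helper names are hypothetical.

```cpp
#include <stdexcept>

// Values copied from cpart.h.
enum group_pos { OrgGroup = -2, InorgGroup = -1 };

// Hypothetical helper: encode `NeighbourGroup n` as the non-negative int n.
// Assumes neighbour indices are never negative; otherwise they would
// collide with the two constant constructors.
int encode_neighbour(int n) {
    if (n < 0) throw std::invalid_argument("neighbour index must be >= 0");
    return n;
}

// Hypothetical helpers for reading the flat int back.
bool is_neighbour(int group_type) { return group_type >= 0; }
bool is_organic(int group_type)   { return group_type == OrgGroup; }
```

    Under this assumption a single int field is enough on the C and PHP sides, at the cost of reserving the negative range for the constant constructors.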
    In fact there will be more functions, but these are the only ones we plan to expose to PHP. Now let's write the ccamlpart.cc file with the wrapper for OCaml. Headers first:
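    #include "cpart.h"
    #ifdef __cplusplus
    extern "C"
    {
    #endif
    // OCaml runtime headers required by the code below
    #include <caml/mlvalues.h>   // value, Field, Val_emptylist
    #include <caml/memory.h>     // CAMLparam*, CAMLlocal*, CAMLreturn*, Store_field
    #include <caml/alloc.h>      // caml_alloc
    #include <caml/callback.h>   // caml_main, caml_named_value, caml_callback
    #ifdef __cplusplus
    }
    #endif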

    Since we use C++ with its vector, the code has to be compiled as C++, so we do not forget extern "C" around the standard OCaml headers. Now we define helper functions for converting types between OCaml and C (here).
    We needed a special function to convert the hybrid OCaml type group_type: its constant constructors are stored as plain integers, but NeighbourGroup is a block with a single field.
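    The difference between the two representations can be sketched without the OCaml headers. The functions below repeat the arithmetic of the runtime's Val_long, Long_val and Is_long macros under invented names (the _sketch suffix marks everything that is not the real API); in the real wrapper you would use the macros from caml/mlvalues.h.

```cpp
#include <cstdint>

// Stand-in for OCaml's `value`: one machine word.
typedef std::intptr_t value_sketch;

// Immediate integers are shifted left by one bit and tagged with a low 1 bit.
inline value_sketch val_long_sketch(std::intptr_t n) {
    return static_cast<value_sketch>((static_cast<std::uintptr_t>(n) << 1) | 1u);
}
// Undo the tagging (relies on arithmetic right shift, as the runtime does).
inline std::intptr_t long_val_sketch(value_sketch v) { return v >> 1; }
// Pointers to heap blocks are word-aligned, so their low bit is 0.
inline bool is_long_sketch(value_sketch v) { return (v & 1) != 0; }

// Constant constructors are numbered from 0 in declaration order.
const value_sketch org_group   = val_long_sketch(0);  // OrgGroup
const value_sketch inorg_group = val_long_sketch(1);  // InorgGroup
// NeighbourGroup n, by contrast, is a pointer to a heap block with tag 0
// whose single field holds the tagged integer n.
```

    A converter for group_type therefore has to test the low bit first and read a field only when the value turns out to be a block.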
    Three macros are also worth a look: CAMLparamX, CAMLlocalX and CAMLreturn*. The first is needed for the garbage collector to work correctly and takes all the OCaml values passed to the function as arguments. The second declares local OCaml variables. The third not only returns the value but is, again, needed for the garbage collector to work correctly.
    And finally, the actual functions for PHP:
    void init_ocaml ()
    {
            // caml_main expects a NULL-terminated argv; we have no real arguments.
            char *argv[1];
            argv[0] = NULL;
            caml_main(argv);
    }

    std::vector<group> filter_org (std::vector<group> g)
    {
            CAMLparam0();
            CAMLlocal4(cli, cons, elem, cb_res);
            // Look the registered OCaml closure up once and cache it.
            static const value *closure_f = NULL;
            if (closure_f == NULL)
                    closure_f = caml_named_value("filter_organic");
            cli = Val_emptylist;
            for (std::vector<group>::iterator i = g.begin(); i != g.end(); ++i)
            {
                    // Convert into a rooted local first: the conversion allocates,
                    // and allocating inside Store_field's argument is unsafe.
                    elem = camlgroup_of_group(&(*i));
                    cons = caml_alloc(2, 0);
                    Store_field(cons, 0, elem);
                    Store_field(cons, 1, cli);
                    cli = cons;
            }
            cb_res = caml_callback(*closure_f, cli);
            std::vector<group> result;
            while (cb_res != Val_emptylist)
            {
                    result.push_back(group_of_camlgroup(Field(cb_res, 0)));
                    cb_res = Field(cb_res, 1);
            }
            // CAMLreturnT unregisters the local roots before returning.
            CAMLreturnT(std::vector<group>, result);
    }

    We build an OCaml list from the vector in a simple way, literally by the book: each cell is a block whose first field is an element and whose second field points to the rest of the list. Then we feed this list to OCaml and turn the result back into a vector.
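    One consequence of building the list by prepending is worth spelling out: walking the vector from front to back yields a list in reverse order. The toy model below shows this with plain C++ cons cells instead of OCaml blocks; Cons, List and the three functions are illustrative names, not part of the wrapper.

```cpp
#include <memory>
#include <vector>

// Toy cons cell standing in for an OCaml list block: field 0 is the
// element, field 1 points to the rest of the list (null = empty list).
struct Cons {
    int head;
    std::shared_ptr<Cons> tail;
};
typedef std::shared_ptr<Cons> List;

// Prepend while walking the vector front to back, as the loop in
// filter_org does: the resulting list is in reverse order.
List list_forward_walk(const std::vector<int>& v) {
    List l;
    for (std::vector<int>::const_iterator i = v.begin(); i != v.end(); ++i)
        l = std::make_shared<Cons>(Cons{*i, l});
    return l;
}

// Prepend while walking back to front: the list keeps the vector's order.
List list_reverse_walk(const std::vector<int>& v) {
    List l;
    for (std::vector<int>::const_reverse_iterator i = v.rbegin(); i != v.rend(); ++i)
        l = std::make_shared<Cons>(Cons{*i, l});
    return l;
}

// Flatten a list back into a vector, like the while loop over cb_res.
std::vector<int> to_vector(List l) {
    std::vector<int> out;
    for (; l; l = l->tail) out.push_back(l->head);
    return out;
}
```

    Since List.filter preserves the order of its input, the vector returned by filter_org comes back reversed relative to the one passed in. If the order matters to the PHP caller, iterating with rbegin()/rend() when building the list avoids this.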

    Put it all together into a static library with a Makefile:
    ocamlpart.o: ocamlpart.ml
            ocamlopt -g -output-obj $^ -o $@

    ccamlpart.o: ccamlpart.cc
            g++ -g -c -o $@ -I"`ocamlc -where`" $^

    libchempart.a: ocamlpart.o ccamlpart.o
            ar rcs $@ $^

    all: libchempart.a

    clean:
            rm -f *.o *.a *.cmi *.cmx

    PHP


    Now, finally, let's write the PHP extension. Create a separate php folder inside the project folder: the extension build generates its own Makefile, which would overwrite the one we used above, so we keep the two apart. Create the cphppart.h file in it. The file declares the standard module functions, the Chemlib class, and the global function get_group_of_caml. We are again forced to use std::vector, so we do not forget extern "C" around the PHP includes. The extension itself is implemented in cphppart.cc. It starts with the standard information about the extension, the class and its functions (see here). Then comes the module initialization function; this is where the class is registered in PHP and receives its properties in addition to its methods:
    PHP_MINIT_FUNCTION(chemlib)
    {
            // Start the OCaml runtime before anything can call into it.
            init_ocaml();
            zend_class_entry chemlib_ce;
            INIT_CLASS_ENTRY(chemlib_ce, PHP_CHEMLIB_CLASS_NAME, chemlib_class_functions);
            chemlib_class_entry = zend_register_internal_class(&chemlib_ce TSRMLS_CC);
            // Public properties mirroring the fields of the C group structure.
            zend_declare_property_string(chemlib_class_entry, (char*)"name", 4, (char*)"", ZEND_ACC_PUBLIC TSRMLS_CC);
            zend_declare_property_long(chemlib_class_entry, (char*)"position", 8, 0, ZEND_ACC_PUBLIC TSRMLS_CC);
            zend_declare_property_long(chemlib_class_entry, (char*)"cycle", 5, 0, ZEND_ACC_PUBLIC TSRMLS_CC);
            zend_declare_property_long(chemlib_class_entry, (char*)"group_type", 10, 0, ZEND_ACC_PUBLIC TSRMLS_CC);
            zend_declare_property_long(chemlib_class_entry, (char*)"link", 4, 0, ZEND_ACC_PUBLIC TSRMLS_CC);

            // Class constants matching the enum values in cpart.h.
            zend_declare_class_constant_long(chemlib_class_entry, (char*)"UndefPos", 8, 0 TSRMLS_CC);
            zend_declare_class_constant_long(chemlib_class_entry, (char*)"LeftPos", 7, 1 TSRMLS_CC);
            zend_declare_class_constant_long(chemlib_class_entry, (char*)"RightPos", 8, 2 TSRMLS_CC);

            zend_declare_class_constant_long(chemlib_class_entry, (char*)"UndefCycle", 10, 0 TSRMLS_CC);
            zend_declare_class_constant_long(chemlib_class_entry, (char*)"NoneCycle", 9, 1 TSRMLS_CC);
            zend_declare_class_constant_long(chemlib_class_entry, (char*)"AliphaticCycle", 14, 2 TSRMLS_CC);
            zend_declare_class_constant_long(chemlib_class_entry, (char*)"AromaticCycle", 13, 3 TSRMLS_CC);
            zend_declare_class_constant_long(chemlib_class_entry, (char*)"HeteroCycle", 11, 4 TSRMLS_CC);

            zend_declare_class_constant_long(chemlib_class_entry, (char*)"OrgGroup", 8, -2 TSRMLS_CC);
            zend_declare_class_constant_long(chemlib_class_entry, (char*)"InorgGroup", 10, -1 TSRMLS_CC);
            // Module startup functions must report success to the engine.
            return SUCCESS;
    }

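    The number that follows each name in these calls is the length of that name, typed in by hand. A hand-counted length is easy to get wrong when a property is renamed, so it can be derived from the literal instead. The helper below is a hypothetical sketch in plain C++11, not something the Zend API provides; the static_assert lines check two of the lengths used above.

```cpp
#include <cstddef>

// Length of a string literal without the trailing '\0', computed at
// compile time from the array type. Hypothetical helper, not a Zend API.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) {
    return N - 1;
}

// The hand-written lengths in the listing agree with the computed ones.
static_assert(literal_length("name") == 4, "length passed for \"name\"");
static_assert(literal_length("group_type") == 10, "length passed for \"group_type\"");
```

    With such a helper a call could pass literal_length("name") in place of the literal 4, at the price of repeating the name.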

    As you can see, we gave the class the same properties as the C group structure and also declared convenient constants such as Chemlib::OrgGroup. Add the default functions, whose bodies everyone can fill in to taste (here). Season with internal helper functions that convert a group between C and PHP:
    zval *phpgroup_of_group (group *gr)
    {
            zval *res;
            ALLOC_INIT_ZVAL(res);
            // Create a Chemlib object and copy the C fields into its properties.
            object_init_ex(res, chemlib_class_entry);
            zend_update_property_string(Z_OBJCE_P(res), res, (char*)"name", 4, gr->name TSRMLS_CC);
            zend_update_property_long(Z_OBJCE_P(res), res, (char*)"position", 8, gr->position TSRMLS_CC);
            zend_update_property_long(Z_OBJCE_P(res), res, (char*)"cycle", 5, gr->cycle TSRMLS_CC);
            zend_update_property_long(Z_OBJCE_P(res), res, (char*)"group_type", 10, gr->group_type TSRMLS_CC);
            zend_update_property_long(Z_OBJCE_P(res), res, (char*)"link", 4, gr->link TSRMLS_CC);
            return res;

    }

    group group_of_phpgroup (zval *gr)
    {
            // Value-initialize both structs so that the fallback `def`
            // is never returned with indeterminate fields.
            group res = group(), def = group();
            zval *x = zend_read_property(chemlib_class_entry, gr, (char*)"name", 4, 1 TSRMLS_CC);
            if (Z_TYPE_P(x) != IS_STRING)
                    return def;
            res.name = estrdup(Z_STRVAL_P(x));
            x = zend_read_property(chemlib_class_entry, gr, (char*)"position", 8, 1 TSRMLS_CC);
            if (Z_TYPE_P(x) != IS_LONG)
                    return def;
            res.position = (group_position)Z_LVAL_P(x);
            x = zend_read_property(chemlib_class_entry, gr, (char*)"cycle", 5, 1 TSRMLS_CC);
            if (Z_TYPE_P(x) != IS_LONG)
                    return def;