Zend Array Traversal Macros (HashTable Iteration)

Published on June 20, 2016

Continuing my superficial study of the PHP source code (7.0.7) and of writing a simple extension for it, this time I would like to go a little deeper and describe how to traverse an array received as a function argument, something I ran into while implementing a simple PHP function, median() (ms_median in the extension). The task of this function is simple: return the arithmetic mean of the array's elements (so, despite its name, it computes a mean rather than a statistical median). Perhaps this publication will be useful to other PHP developers who, like me, have decided to spend some free time studying the architecture of the language they earn their living with.

In the previous publication, I sketched a quick way to create a PHP extension, using a factorial function as the example. That function is simple in that it takes a single integer parameter and then calls itself recursively. The implementation of median() is more involved because the parameter is an array: we have to walk through it to sum the values and count the elements.

For now, I have simplified the task by assuming that all elements of the array are numbers. The source code of PHP extensions is surprising in that seemingly everything is written with macros, or at least that is the first impression. It turns out that macros are also used to traverse the elements of an array. For clarity, I will give the function code first, followed by a short description.

The function is defined in the same file as before, mathstat.c, of the mathstat extension. The source is on GitHub.

The function table of the mathstat extension:

const zend_function_entry mathstat_functions[] = {
        PHP_FE(confirm_mathstat_compiled,       NULL)           /* For testing, remove later. */
        PHP_FE(ms_factorial,    arginfo_ms_factorial)
        PHP_FE(ms_median,       NULL)
        PHP_FE_END      /* Must be the last line in mathstat_functions[] */
};


The function body itself:

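PHP_FUNCTION(ms_median)
{
   int argc = ZEND_NUM_ARGS();
   double total = 0;   /* running sum of the elements */
   int count = 0;      /* number of elements visited */
   zval *array,
        *value;

   /* "a" means the single argument must be an array */
   if (zend_parse_parameters(argc, "a", &array) == FAILURE) {
        RETURN_DOUBLE(0);
   }

   /* visit every element; value points at the current zval */
   ZEND_HASH_FOREACH_VAL(Z_ARRVAL_P(array), value) {
        total = total + zval_get_double(value);
        count += 1;
   } ZEND_HASH_FOREACH_END();

   /* an empty array (or a zero sum) yields 0 and avoids dividing by zero */
   if (count == 0 || total == 0) {
        RETURN_DOUBLE(0);
   }
   RETURN_DOUBLE(total/count);
}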
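
The loop in ms_median passes every element through zval_get_double instead of reading a number directly, because an array element may be stored as an integer, a double, or a string. The sketch below is a toy model of that idea and does not use the Zend headers: toy_zval, toy_type, toy_get_double and toy_mean are hypothetical names invented for illustration, and the real zval has more types and a different layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a zval: a type tag plus a union holding the value.
 * Hypothetical and simplified; this is NOT the real Zend definition. */
typedef enum { TOY_LONG, TOY_DOUBLE, TOY_STRING } toy_type;

typedef struct {
    toy_type type;
    union {
        long        lval;
        double      dval;
        const char *str;
    } value;
} toy_zval;

/* Plays the role zval_get_double plays in the article:
 * whatever the stored type is, hand back a double. */
static double toy_get_double(const toy_zval *z)
{
    assert(z != NULL);
    switch (z->type) {
    case TOY_LONG:   return (double) z->value.lval;
    case TOY_DOUBLE: return z->value.dval;
    case TOY_STRING: return strtod(z->value.str, NULL);
    }
    return 0.0; /* unknown tag: treat as zero */
}

/* Arithmetic mean over n toy values; returns 0 for an empty input. */
static double toy_mean(const toy_zval *items, size_t n)
{
    double total = 0;
    for (size_t i = 0; i < n; i++) {
        total += toy_get_double(&items[i]);
    }
    return n == 0 ? 0 : total / (double) n;
}
```

Calling toy_mean on a mix of an integer, a double and a numeric string works the same way as on plain doubles, which is roughly why the extension can accept arrays such as [1, 2.5, "5.5"] without extra type checks.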


In the body of the function, as last time, the parameter-parsing function is called first; the type specifier "a" says that the argument must be an array:

   if (zend_parse_parameters(argc, "a", &array) == FAILURE) {
        RETURN_DOUBLE(0);
   }


Now for the most interesting part: the loop, implemented with the ZEND_HASH_FOREACH_VAL macro. In the reference I found seven macros for traversing an array, and everywhere the term HashTable is used instead of array. For our case, I chose the simplest one. Its first argument is the HashTable of the array passed to the function, and the second is a zval pointer that receives each element in turn (zval is the basic data structure that stores a value together with its type; Dmitry Stogov has a video on this topic). Inside the loop I simply call zval_get_double, which, roughly speaking, returns the current element's value as a double. Rewritten as regular PHP code, this becomes:

<?php
  $array = [1,2,3];

  $number = 0;
  $count = 0;

  foreach($array as $val) {
     $number += $val;
     $count += 1;
  }

  echo "cnt: ".$count." total: ".$number."\n";
?>


That is, there is really nothing complicated here: it is the same construct, only written with a macro. Take another, more advanced macro:

ZEND_HASH_FOREACH_STR_KEY_VAL(ht, key, val)


Even without seeing the code, it is clear that this is the analogue of the PHP loop:

foreach($array as $key => $value) {
} 
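
To make it less mysterious how a pair of macros can wrap an ordinary block of code, here is a self-contained toy version of the pattern. It is only a sketch: toy_table, toy_bucket and the TOY_FOREACH_* macros are hypothetical names, not part of PHP. As far as I can tell, the real macros in zend_hash.h follow a similar idea (an opening macro that starts a loop and skips unused slots, and a closing macro that ends it), but the details there differ.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the real Zend structures. */
typedef struct {
    const char *key;  /* NULL marks an unused slot */
    double      val;
} toy_bucket;

typedef struct {
    toy_bucket *buckets;
    size_t      used;  /* number of slots to scan */
} toy_table;

/* Opening part: starts the loop and skips unused slots.
 * The braces opened here are closed by TOY_FOREACH_END(). */
#define TOY_FOREACH(ht) do { \
        toy_table *ht_ = (ht); \
        for (size_t i_ = 0; i_ < ht_->used; i_++) { \
            toy_bucket *b_ = &ht_->buckets[i_]; \
            if (b_->key == NULL) continue;

/* Parameters are named out_key/out_val so that they cannot be
 * confused with the struct members key and val during expansion. */
#define TOY_FOREACH_VAL(ht, out_val) \
        TOY_FOREACH(ht) \
        out_val = &b_->val;

#define TOY_FOREACH_STR_KEY_VAL(ht, out_key, out_val) \
        TOY_FOREACH(ht) \
        out_key = b_->key; \
        out_val = &b_->val;

#define TOY_FOREACH_END() \
        } \
    } while (0)

/* Same logic as ms_median in the article: sum, count, divide. */
static double toy_mean(toy_table *t)
{
    double total = 0;
    int count = 0;
    double *value;

    assert(t != NULL);
    TOY_FOREACH_VAL(t, value) {
        total += *value;
        count += 1;
    } TOY_FOREACH_END();

    return count == 0 ? 0 : total / count;
}

/* Key/value variant: returns the key holding the largest value,
 * or NULL when the table has no used slots. */
static const char *toy_key_of_max(toy_table *t)
{
    const char *key, *best = NULL;
    double *value, max = 0;

    assert(t != NULL);
    TOY_FOREACH_STR_KEY_VAL(t, key, value) {
        if (best == NULL || *value > max) {
            best = key;
            max = *value;
        }
    } TOY_FOREACH_END();

    return best;
}
```

The block between the opening macro and TOY_FOREACH_END() is plain C that runs once per used slot, which is why the calling code reads almost like a PHP foreach.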


For completeness, here are all the macros I found in the reference:

ZEND_HASH_FOREACH_VAL(ht, val)
ZEND_HASH_FOREACH_KEY(ht, h, key)
ZEND_HASH_FOREACH_PTR(ht, ptr)
ZEND_HASH_FOREACH_NUM_KEY(ht, h)
ZEND_HASH_FOREACH_STR_KEY(ht, key)
ZEND_HASH_FOREACH_STR_KEY_VAL(ht, key, val)
ZEND_HASH_FOREACH_KEY_VAL(ht, h, key, val)


That's all. Thank you for your time and for the mobile data spent.