Generating a list of IPv4 addresses in TCL, and a little about number systems

    Not long ago I had to mass-update the configuration of a number of devices. This is a standard system administration task if you have more than one device performing the same function in service. There are universal products for this, for example, among the freely available ones, redmine.nocproject.org, as well as many scripts widely presented on thematic forums and portals. A script of my own should have been at hand for exactly this case, but it could not be found, so, given that there was time for maneuver, the script was written again, run, and put on a shelf where it will get lost again.
    The script was written in Expect (expect.sourceforge.net), an add-on for TCL that lets you process and answer the output of various interactive console utilities, in particular telnet. Since I had not written in TCL before, the code needed rethinking afterwards. The key part of the script is the generator of the list of IPv4 addresses to process. After a careful evaluation, I managed to optimize this piece significantly, in my opinion: at the very least the number of lines dropped by a third, and new functionality was added painlessly. Moreover, these reductions had little to do with the specifics of TCL and instead concerned fundamental approaches to building the algorithm as a whole.
    I have split this code out into a separate utility, which I will analyze in detail below: how it was “before”, what it became “after”, and why it did not work out to write it as “after” right away. I still do not like everything about it. Both algorithmic and TCL-specific questions remain, for example the use of lists instead of arrays (which is faster? safer? ideologically more correct?). All these doubts are present in the text as well, in the hope of constructive comments.
    The logic of the utility (cipl.tl) is as follows. On the command line we give two parameters: the IPv4 address from which the list starts, and either the IPv4 address at which the list ends or a number saying how many addresses should follow the starting one. The list is built from the lower address (the first parameter) to the higher one. If the second parameter is omitted, the list consists of the starting IPv4 address only:
    > cipl.tl 192.0.2.1 1
    192.0.2.1
    192.0.2.2
    > cipl.tl 192.0.2.1 192.0.2.2
    192.0.2.1
    192.0.2.2
    > cipl.tl 192.0.2.1
    192.0.2.1
    

    On Windows the script is launched through the tclsh interpreter; in fact, the same can be done on *nix:
    > tclsh cipl.tl 192.0.2.1
    192.0.2.1
    

    Below I quote the code with its version and line numbers, and then comment on it. The final version can be downloaded from the links at the end of the article.

    Ver. 1, No. 1-12
    #!/usr/bin/tclsh8.5
    set exip {^(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d{0,1})(\.(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d|\d)){3}$}
    set exdg {^(0?|([1-9]\d{0,5}))$}
    if {$argc == 0 || $argc > 2} then {
            puts "cipl.tl - prints a list of IP addresses starting from the given one"
            puts "Usage: cipl.tl  \[|\]"
    	puts "Arguments:   - IPv4 address less than or equal to "
    	puts "\t    - IPv4 address"
    	puts "\t        - number from 0 to 999999"
    } else {	
    

    The first line specifies which interpreter to use; this is the line for Linux. In general, you need to give the full path to tclsh. For FreeBSD, for example, the line looks like this:
    #!/usr/local/bin/tclsh8.5

    Next we set the exip and exdg variables, which hold regular expressions used later in the program. The first is needed to verify that an IPv4 address was entered correctly. It matches addresses written in decimal form from 1.0.0.0 to 255.255.255.255, so an address of the form 192.0.02.010 will not pass. The second describes a number without leading zeros; the valid list sizes are from 0 to 999999, and an empty string also matches. The upper limit of 999999 seemed reasonable to me, and besides, I did not want to spend time on a regular expression for numbers up to 2 to the power of 32. These regular expressions did not appear all at once but were added as the solution required, which explains why they look the way they do; this will become clear a little further on.
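    A minimal sketch of how these two patterns behave on a few inputs. The patterns are copied from the listing above; the helper names is_ipv4 and is_count are hypothetical and are not part of cipl.tl.

```tcl
# Patterns copied from version 1 of the script.
set exip {^(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d{0,1})(\.(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d|\d)){3}$}
set exdg {^(0?|([1-9]\d{0,5}))$}

# Hypothetical helpers (not in cipl.tl): return 1 on a match, 0 otherwise.
proc is_ipv4 {s} {
    global exip
    return [regexp -- $exip $s]
}
proc is_count {s} {
    global exdg
    return [regexp -- $exdg $s]
}

# Addresses with a leading zero in an octet or a zero first octet are rejected.
foreach s {192.0.2.1 1.0.0.0 255.255.255.255 192.0.02.010 0.0.0.1 256.1.1.1} {
    puts [format "%-16s address: %d" $s [is_ipv4 $s]]
}
# Counts longer than six digits, with leading zeros, or negative are rejected.
foreach s {0 999999 1000000 007 -1} {
    puts [format "%-16s count:   %d" $s [is_count $s]]
}
```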
    Next, the if condition checks the number of parameters passed on the command line: if there are none or more than 2, a short help text is printed. One could print nothing at this point and simply take the first two parameters as the ones wanted, thereby focusing more on batch operation of the utility without cluttering the output in case of an error.
    The last line opens an else block in which the main processing takes place.

    Ver. 1, No. 12-17
    } else {	
    	set startip [lindex $argv 0]
    	set countip [lindex $argv 1]
    	set getcountip $countip	
    	if {[regexp $exip $startip]} then {
    		set octsip [split $startip {.}]
    

    In this part we first save the input parameters: the first goes into startip and the second into countip. If there is no second parameter, lindex returns an empty string. We also keep the second parameter in an additional variable, getcountip.
    Next, using regexp and the regular expression from exip, we check that the first parameter is a correct IP address. It is important here that the IPv4 address fully matches what we expect, because on the next line split turns it into the octsip list, with a dot as the separator. The resulting list must contain only decimal numbers from 0 to 255 in the right places and without leading zeros, so that we can operate on them without further checks. Leading zeros matter because, for example, the number 011 would be interpreted in an expression as octal, that is, as 9 in decimal.
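    A small sketch of the leading-zero problem, assuming Tcl 8.5 or 8.6, where expr treats a leading zero as an octal prefix; other interpreter versions may behave differently, so the last line carries no expected value. The scan call is shown only as one way to force a decimal reading and is not used in cipl.tl.

```tcl
# Split a validated address into its octets.
set octets [split 192.0.2.1 .]
puts $octets                      ;# 192 0 2 1

# scan with %d always reads the digits as decimal, leading zeros or not.
puts [scan 011 %d]                ;# 11

# In Tcl 8.5/8.6 expr reads the same text as octal and yields 9;
# the result depends on the interpreter version.
puts [expr {011 + 0}]
```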
    It is worth noting that a search engine query quite often leads to regular expressions that do not check all these conditions; often it is just a check for 4 groups of 3 digits. For example, the expression from habrahabr.ru/blogs/webdev/123845- accepts 000.100.1.010, which is of course an IP address, but does not say unambiguously whether it is written in octal or decimal form. This introduces uncertainty and requires further verification.

    Ver. 1, No. 18-35
    if {[regexp $exip $countip]} then {
    	set octfip [split $countip {.}]			
    	set octsub {0}			
    	set countip {0}
    	for {set i 3} {$i>=0} {incr i -1} {
    		if {[set octsub [expr [lindex $octfip $i] - [lindex $octsip $i]]] < 0} then {
    		    if {$i > 0} then {
    			  set si [expr $i - 1]
    			  set octfip [lreplace $octfip $si $si [expr [lindex $octfip $si] - 1]]
    			  set octsub [expr 256 + $octsub]
    		    } else {						
    			  break
    			}	
    		}
    		set ni [expr 3 - $i]
    		set countip [expr $countip + ($octsub * (1 << ($ni * 8)))] 
    	}
    }
    

    Here we check whether the second parameter is a valid IPv4 address and, if so, try to calculate the difference between it and the address given in the first parameter. At the exit from this block the countip variable must hold the correct length of the list. This if is nested inside the earlier check (line 16), so if that check fails (the first parameter is not an IPv4 address), the program never reaches this section.
    This subtask (subtracting IPv4 addresses) is solved as if we were subtracting two numbers in a column:
      192.168.3.  1
    - 192.168.2. 10
    =   0.  0.0.247
    

    Of course, these numbers are not decimal but in base 256. That is, when we borrow from the higher digit, we must add not 10 but 256 (0x100). In the hexadecimal representation of the octets this is easier to see:
      0xC0.0xA8.0x03.0x01
    - 0xC0.0xA8.0x02.0x0A
    ---------------------- (we borrow: take one from the higher digit)
      0xC0.0xA8.0x03.0x01
    -           0x01
    ---------------------- (continue the borrow: add the base of the number system to the lower digit)
      0xC0.0xA8.0x02.0x001
    +                0x100
    ---------------------- (this operation gives)
      0xC0.0xA8.0x02.0x101
    - 0xC0.0xA8.0x02.0x00A
    = 0x00.0x00.0x00.0x0F7
    

    To implement this we create the octfip list, just as we did for the first parameter, then create the octsub variable, which will hold the difference between octets, and reset countip to zero: from now on it stores the size of the address list rather than the IPv4 address that was in it when we entered this block.
    We set up a for loop with the variable i running in reverse order from 3 to 0. In this loop we must go through all the octets of the IPv4 address starting from the lowest, that is, through all the elements of the octsip and octfip lists starting from the last one.
    We calculate the difference of the current octets, octfip minus octsip, and save it in octsub. I really wanted to use arrays here, because the construction with lindex is very cumbersome, but I did not see a simple way to build an array (in TCL arrays are only associative) from a list, so lists are used everywhere. The calculated difference is immediately compared with 0: do we need to borrow from the higher octet or not?
    If we need to borrow and this is not the most significant octet (i is greater than 0), we add 256 to the difference octsub and subtract 1 from the next higher octet of the minuend (octfip).
    If this is the most significant octet, then the minuend octfip is smaller than the subtrahend octsip, that is, the difference is negative, which the conditions of the problem do not allow. In this case we leave the loop with break.
    If no borrow is needed, the result suits us as it is. In any case, however, the result is expressed in base 256, which is inconvenient, since further calculations must be done in a form the interpreter understands. Therefore we convert the result in the standard way for a positional number system:
    $\dots + A_i \cdot B^i + \dots + A_3 \cdot B^3 + A_2 \cdot B^2 + A_1 \cdot B^1 + A_0 \cdot B^0 = N_B$, where $B$ is the base of the number system.
    In our case $countip = \sum_{ni=0}^{3} octsub_{ni} \cdot 256^{ni}$, where ni runs from 0 to 3, or ni = 3 - i, where i runs from 3 to 0, which lets us fold the conversion into the existing loop. Since the base of our number system is a power of 2, we compute the power with a shift by a multiple of 8 bits: 256 is 100000000 in binary, that is, a one shifted 8 bits to the left. Thus, by shifting first by 0, then by 8, then by 16 and by 24 (the last significant line in this section of the code), we multiply by 1 ($256^0$), 256, $256^2$ and $256^3$.
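    The column subtraction and the conversion by shifts can be put together in one procedure. This is only a sketch under assumptions: the name ipdiff_column is hypothetical, the octet lists are taken to be already validated, and, unlike the listing above, a negative difference is reported by returning an empty string instead of leaving the loop with break.

```tcl
# Hypothetical helper: subtract two IPv4 addresses given as octet lists,
# digit by digit in base 256, and return the difference as a plain integer.
# Returns an empty string when the first address is smaller than the second.
proc ipdiff_column {octfip octsip} {
    set count 0
    for {set i 3} {$i >= 0} {incr i -1} {
        set octsub [expr {[lindex $octfip $i] - [lindex $octsip $i]}]
        if {$octsub < 0} {
            if {$i == 0} {
                # Nothing left to borrow from: the difference is negative.
                return ""
            }
            # Borrow one unit from the next higher digit ...
            set si [expr {$i - 1}]
            lset octfip $si [expr {[lindex $octfip $si] - 1}]
            # ... which is worth 256 in the current position.
            incr octsub 256
        }
        # The weight of digit i is 256^(3-i), a left shift by 8*(3-i) bits.
        set count [expr {$count + ($octsub << (8 * (3 - $i)))}]
    }
    return $count
}

puts [ipdiff_column {192 168 3 1} {192 168 2 10}]   ;# 247
puts [ipdiff_column {192 0 3 0} {192 0 2 0}]        ;# 256
```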
    When I came back to this section a second time, it left me somewhat bewildered: what had seemed simple and clear during implementation now looked needlessly confusing and complicated. This can be judged even by the amount of text needed to describe the code.
    What is wrong? Why was it necessary to invent, yet again, a subtraction operation, albeit for numbers in base 256, instead of converting these values into a numeric form the programming language understands and subtracting by standard means, especially since we do the conversion anyway? In the end I concluded that subjective human perception had once again played a cruel joke. What could be easier than column arithmetic, which is taught in the first grade? Nothing, because it is the first thing everyone is taught after the numbers themselves. Conversions and shifts seem complicated compared with the simplest operation from the first grade. The understanding that this is not the decimal number system came a little later, but the understanding that the approach itself was wrong came only when I had to look at the written code a second time.

    Ver.2, No.18-25
    if {[regexp $exip $countip]} then {
    	set octfip [split $countip {.}]
    	set nfip [expr ([lindex $octfip 0] * 0x1000000) + ([lindex $octfip 1] * 0x10000)\
     			+ ([lindex $octfip 2] * 0x100) + ([lindex $octfip 3])]
    	set nsip [expr ([lindex $octsip 0] * 0x1000000) + ([lindex $octsip 1] * 0x10000)\
     			+ ([lindex $octsip 2] * 0x100) + ([lindex $octsip 3])]	
    	if {$nfip >= $nsip} then {set countip [expr $nfip - $nsip]}
    }
    

    After forming the octfip list (as in the first version), we build the numbers that the IPv4 addresses actually are: nfip for the address in the second argument and nsip for the address in the first. The conversion is the same as before, only without any loops, substituting the values in a single line: $nfip = listitem_0 \cdot 256^3 + listitem_1 \cdot 256^2 + listitem_2 \cdot 256 + listitem_3$, where $listitem_n$ is the corresponding list element, obtained directly in the expression with lindex. The powers of 256 are written in the code as round hexadecimal values of the form 0x100xxxx to make them easier to read. Then we check that the second argument is not less than the first and subtract the first from the second, saving the result in countip.
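    As one possible answer to the complaint about cumbersome lindex calls, here is a sketch that names the octets with lassign, which is available from Tcl 8.5. The procedure name ip2num is hypothetical, and the input is assumed to have already passed the exip check (no leading zeros).

```tcl
# Hypothetical helper: dotted decimal -> integer, using lassign (Tcl 8.5+)
# to name the four octets instead of repeating lindex.
proc ip2num {ip} {
    lassign [split $ip .] a b c d
    return [expr {$a * 0x1000000 + $b * 0x10000 + $c * 0x100 + $d}]
}

set nfip [ip2num 192.168.3.1]
set nsip [ip2num 192.168.2.10]
puts [expr {$nfip - $nsip}]       ;# 247
puts [ip2num 192.0.2.1]           ;# 3221225985
```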
    As a result it turned out a little simpler than before, in fact much simpler. The only thing that bothers me in this version is the hypothetical possibility of overflowing the nfip and nsip variables in the expr calculations, although with current C compilers this should not be a concern. The documentation on computations and overflows is at http://www.tcl.tk/man/tcl8.5/TclCmd/expr.htm#M23 . For version 8.4 (www.tcl.tk/man/tcl8.4/TclCmd/expr.htm#M5) it was stated explicitly that numeric constants are 32-bit signed numbers which, if necessary, are interpreted as 64-bit signed ones; for version 8.5 this is not mentioned. The previous version also had a hypothetical possibility of overflow, but there we were processing the resulting difference, which in real cases would be much smaller than even a 16-bit number.
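    A quick way to check the overflow concern on a given interpreter. The sketch assumes Tcl 8.5 or later, where, as far as I know, expr no longer limits integers to 32 or 64 bits; on older interpreters the results may differ.

```tcl
# The largest IPv4 address as a number: 255.255.255.255.
set maxip [expr {255 * 0x1000000 + 255 * 0x10000 + 255 * 0x100 + 255}]
puts $maxip                         ;# 4294967295
puts [expr {$maxip == 0xFFFFFFFF}]  ;# 1

# Going past 32 bits does not wrap around in Tcl 8.5 and later.
puts [expr {$maxip + 1}]            ;# 4294967296
puts [expr {1 << 32}]               ;# 4294967296
```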
    Next, the second part of the utility begins, in which an output list of IPv4 addresses is formed.

    Ver.2, No.26-27
    if {[regexp $exdg $countip]} then {
    	puts $startip
    

    We check that countip is a number from 0 to 999999. This value may have been passed in the second argument (in which case the previous check, whether it is an IPv4 address, failed), or it may be the already calculated difference between the addresses given in the arguments. If the value is too large or is not a number at all (which is possible after our calculations, for example when the difference of the IPv4 addresses is negative), no further processing is done. If everything is in order, we print the first element of the list (the IPv4 address given by the first argument). From here on I will call the resulting list of IPv4 addresses a sequence, to avoid confusion with the TCL concept of a list.

    Ver.2, No.28-29
    for {set i 0} {$i<$countip} {incr i 1} {
    	set octsip [lreplace $octsip {3} {3} [expr [lindex $octsip {3}] + 1]] 
    

    We build the remaining elements of the desired sequence. Again I would really like to use arrays, but converting a list to an array seems worse to me than using lists in this form (how can this be done correctly and simply?). Here is a for loop in which the variable i runs from 0 up to the calculated (or given) number of elements, countip. Inside the loop, the last element of the previously built octsip list (the lowest octet of our address) is increased by 1...

    Ver.2, No.30-36
    for {set j 3} {$j>=0} {incr j -1} {
    	if {[lindex $octsip $j] > 255 && $j > 0} then {
    		set sj [expr $j - 1]
    		set octsip [lreplace $octsip $j $j {0}]
    		set octsip [lreplace $octsip $sj $sj [expr [lindex $octsip $sj] + 1]]
    	}
    }
    

    ... and check whether the other digits need correcting. For this we again set up a for loop, with the variable j running from 3 down to 0. In the if condition we check that the current octet is greater than 255 (an overflow has occurred) and that it is not the most significant octet (j is greater than 0). If an overflow has occurred, we zero the current octet and add 1 to the next higher octet (the element of the octsip list closer to its beginning). If the overflow occurred in the most significant octet, we make no correction, so that we are left with an invalid IPv4 address.
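    The increment with carry can be sketched as a separate procedure. The name octets_incr is hypothetical; lset (available since Tcl 8.4) is used here in place of the lreplace construction from the listing, which is a substitution on my part and not what cipl.tl does.

```tcl
# Hypothetical helper: add 1 to an address stored as a list of four octets,
# propagating the carry from the lowest octet upwards.
proc octets_incr {octets} {
    lset octets 3 [expr {[lindex $octets 3] + 1}]
    for {set j 3} {$j > 0} {incr j -1} {
        if {[lindex $octets $j] > 255} {
            # Overflow in this digit: reset it and carry 1 into the next one.
            lset octets $j 0
            set sj [expr {$j - 1}]
            lset octets $sj [expr {[lindex $octets $sj] + 1}]
        }
    }
    # An overflow of the highest octet is left as is (for example 256.0.0.0),
    # so that a later validity check can detect it.
    return $octets
}

puts [join [octets_incr {192 0 2 255}] .]       ;# 192.0.3.0
puts [join [octets_incr {10 255 255 255}] .]    ;# 11.0.0.0
puts [join [octets_incr {255 255 255 255}] .]   ;# 256.0.0.0
```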

    Ver.2, No.37-44
    	set oip [join $octsip {.}]
    	if {[regexp $exip $oip]} then {
    		puts $oip
            } else {
    		puts "STOP: the maximum possible address has been reached"
    		exit 3
    	}
    }
    

    With join we merge the list of octets into the oip variable, using a period as the separator. Next we check that the result is an IPv4 address, using the regular expression defined at the very beginning. If everything is correct, we print it; if not, an overflow or some other error occurred while building the sequence, and we terminate abnormally with exit. This is also not entirely elegant, since we end up with several exit points, which can be inconvenient if, for example, we want to perform the same actions at the end in every case.
    The last closing brace ends the for loop that builds the output sequence and was opened on line 28.

    Ver.2, No. 45-51
    		} else {
    			puts "Invalid second argument \"$getcountip\""
    		}
    	} else {
    		puts "Invalid starting IP address \"$startip\""
    	}
    }
    

    The final lines print error messages in the else branches of the conditions on lines 26 and 16, where we check that the arguments given at startup meet expectations. This is the only place where the getcountip variable is used; it stores the second argument in unchanged form. That looks strange and seems like overkill, but I could not find an obvious (simple) alternative here.
    Looking at this part of the program (where the output sequence is built) for the second time, I thought at first that it would be nice to implement a full adder for 4-digit numbers in base 256, plus a converter of such numbers into two's complement, so that subtraction could be done on the same adder. I had not yet changed the first part and was still in the grip of ideas about the simplicity of column arithmetic. The desire to implement this (wild) idea has not gone away, since it is interesting in itself, but perhaps not in TCL. It was already clear that the second part should be changed in the same vein as the first, that is, by converting from the usual numeric representation to the one we need (which is now a conversion into base 256).
    The concept of enumeration has changed as well. If we can iterate over IPv4 addresses with the language's own for loop, why calculate the size of the sequence in advance? We simply move from one address to the next. With this approach it also turned out to be very easy to move not only forward, from the smaller address to the larger, but also backward. This takes no extra effort: we only need to set the increment of the loop variable correctly when the loop is set up (which gives additional functionality, a sequence in either direction).

    Ver.3, No.31-32
    if {$nfip > $minip && $nfip < $maxip} then {
    			if {[set d [expr $nfip >= $nsip]]} then {set di {1}} else {set di {-1}}
    

    We check that nfip, which, as a reminder, holds the IPv4 address from the second argument as a number, falls within the allowed range (minip and maxip are defined at the beginning of the program). If it does, we set the direction of iteration: if the second address nfip is not less than the first, nsip (both are already numbers), we iterate in forward order and di = 1; otherwise we iterate in reverse order and di = -1. The result of the comparison is also kept in d.

    Ver.3, No. 33-37
    for {set i $nsip} {($i<=$nfip && $d) || ($i>=$nfip && !$d)} {incr i $di} {
    	set octip [list [expr ($i & 0xFF000000) >> 24] [expr ($i & 0xFF0000) >> 16]\
     		[expr ($i & 0xFF00) >> 8] [expr ($i & 0xFF)]]				
    	puts [join $octip {.}]				
    }
    

    We set up a for loop over the variable i, whose initial value is nsip. The exit condition depends on the comparison nfip >= nsip, whose result is stored in d: i <= nfip if we approach nfip from below, otherwise i >= nfip. The increment of i has already been calculated and is stored in di.
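    The direction-aware loop can be tried in isolation on small numbers. In this sketch the procedure name walk is hypothetical; d and di have the same meaning as in the listing.

```tcl
# Hypothetical helper: walk from nsip to nfip inclusive, in either direction.
proc walk {nsip nfip} {
    set d  [expr {$nfip >= $nsip}]      ;# 1 = ascending, 0 = descending
    set di [expr {$d ? 1 : -1}]         ;# loop increment
    set result {}
    for {set i $nsip} {($i <= $nfip && $d) || ($i >= $nfip && !$d)} {incr i $di} {
        lappend result $i
    }
    return $result
}

puts [walk 3 6]    ;# 3 4 5 6
puts [walk 6 3]    ;# 6 5 4 3
puts [walk 5 5]    ;# 5
```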
    In the body of the loop we build the octip list of the octets of the IPv4 address. That is, we need to produce the address in dotted decimal notation from its numeric representation, in other words to convert it into base 256. In the general case, following the theory, we divide the number by the base of the target number system and form the number in the new system from the remainders:
      3 221 225 985 | 256
    - 3 221 225 984 |--------------
    ----------------|   12 582 914 | 256
                  1 | - 12 582 912 |----------
                    |--------------|   49 152 | 256
                                 2 | - 49 152 |-----
                                   |----------| 192
                                            0 |
    

    Starting from the final quotient 192 and then taking the remainders in reverse order (0, 2, 1), we get 192.0.2.1. Division is an expensive operation and brings no optimization, but in our very special case, an IPv4 address and division by 256, everything turns out very simple. We shift by multiples of 8 bits (that is, divide by powers of 256) and mask out the bits we do not need (the binary AND operation). In hexadecimal form:
       0xC0000201       |   0xC0000201       |   0xC0000201      |   0xC0000201
     & 0xFF000000       | & 0x00FF0000       | & 0x0000FF00      | & 0x000000FF
    ---------------------------------------------------------------------------
       0xC0000000 >> 24 |   0x00000000 >> 16 |   0x00000200 >> 8 |   0x00000001
    ---------------------------------------------------------------------------
     = 0xC0 (192)       | = 0x00 (0)         | = 0x02 (2)        | = 0x01 (1)
    

    All this is done in one statement; each octet is placed in its own element of the list. The second statement in the body of the loop prints the joined list.
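    Both ways of converting a number back to dotted notation, division with remainders and masks with shifts, can be compared side by side. The procedure names num2ip_div and num2ip_mask are hypothetical; the sketch assumes Tcl 8.5 or later and a non-negative number that fits in 32 bits.

```tcl
# Hypothetical helper: repeated division by 256, collecting the remainders.
proc num2ip_div {n} {
    set octets {}
    for {set k 0} {$k < 4} {incr k} {
        # Each remainder is the next octet, starting from the lowest one,
        # so it is inserted at the front of the list.
        set octets [linsert $octets 0 [expr {$n % 256}]]
        set n [expr {$n / 256}]
    }
    return [join $octets .]
}

# Hypothetical helper: the same result with masks and shifts.
proc num2ip_mask {n} {
    return [join [list \
        [expr {($n & 0xFF000000) >> 24}] \
        [expr {($n & 0x00FF0000) >> 16}] \
        [expr {($n & 0x0000FF00) >> 8}] \
        [expr {$n & 0x000000FF}]] .]
}

puts [num2ip_div  3221225985]    ;# 192.0.2.1
puts [num2ip_mask 3221225985]    ;# 192.0.2.1
puts [num2ip_mask 0xC0000201]    ;# 192.0.2.1
```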

    Ver.3, No. 38-45
    		} else {
    			puts "The last IP of the list is outside the allowed range"
    			exit 3
    		}
    	} else {
    		puts "Invalid starting IP address \"$startip\""
    	}
    }		
    

    The final lines are almost unchanged, except that the error message about the second argument has moved slightly higher in the program. The first part of the program has also changed a little compared with the second version.

    Ver.3, No. 1-7
    #!/usr/bin/tclsh8.5
    set exip {^(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d{0,1})(\.(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d|\d)){3}$}
    set exdg {^-?(0?|([1-9]\d*))$}
    set maxip {0xFFFFFFFF}
    set minip {0xFFFFFF}
    

    The regular expression for checking a numeric parameter now accepts integers of any length, without leading zeros but with an optional minus sign “-” in front. We have simplified the check and widened the bounds, because the resulting values are now checked in numeric form against the maxip and minip variables. These values do not duplicate the exip regular expression, since it now only checks the correctness of user input and not the results of calculations.

    Ver.3, No.15-20
    set startip [lindex $argv 0]
    if {![string length [set finiship [lindex $argv 1]]]} then {set finiship {0}}	
    if {[regexp $exip $startip]} then {
    	set octsip [split $startip {.}]
    	set nsip [expr ([lindex $octsip 0] * 0x1000000) + ([lindex $octsip 1] * 0x10000)\
     		+ ([lindex $octsip 2] * 0x100) + ([lindex $octsip 3])]
    

    Lines 8-14 almost completely repeat lines 6-12 of the first version, with the messages slightly adjusted to the new functionality. Then we do almost the same as in the second version. The only difference is that we force finiship to 0 if the second argument was not given, so that the variable is always defined. finiship plays the same role as countip in the second version and has been renamed to fit the new concept: ultimately it stands not for the size of the sequence of IPv4 addresses but for the last address of the sequence. We compute nsip right after splitting the first argument into its octets.

    Ver.3, No.21-30
    if {[regexp $exip $finiship]} then {
    	set octfip [split $finiship {.}]
    	set nfip [expr ([lindex $octfip 0] * 0x1000000) + ([lindex $octfip 1] * 0x10000)\
     		+ ([lindex $octfip 2] * 0x100) + ([lindex $octfip 3])]			
    } elseif {[regexp $exdg $finiship] && [expr abs($finiship)] < $maxip} then {
    	set nfip [expr $nsip + $finiship]
    } else {
    	puts "Invalid second argument \"$finiship\""
    	exit 5
    }		
    

    The first condition is the same as in the second version, apart from the change already mentioned: the calculation for the first argument has moved a little higher in the code.
    The second condition, elseif, checks whether the second argument is a number. Unlike the earlier versions, this check is not performed unconditionally but only if the second argument is not an IPv4 address, since further calculations work with addresses and not with the length of the sequence. If it is a number, nfip is calculated by adding this number to nsip. The correctness of the resulting IPv4 address is verified numerically further down the code (as described above).
    If the second argument is neither a number nor an IPv4 address, we terminate with exit. Again we get multiple exit points, three of them in this version. This could be overcome by wrapping the entire program in an endless loop and, where necessary, leaving it with break to arrive at the end of the program, but there was no need for that at all. As mentioned earlier, all error checks can be removed and replaced with default actions, which suits batch mode better. In this version there is no getcountip variable: the error message uses the second argument, finiship, directly, since it is not modified but only read along the way.
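    The idea of a loop that is left with break to reach a single exit point might look roughly like this. It is a skeleton only: the procedure name run, the simplified address check and the return codes 1 and 5 are assumptions made for the illustration and do not reproduce cipl.tl.

```tcl
# Hypothetical skeleton: every failure sets a code and leaves the loop,
# so the program has one place for its final actions.
proc run {argv} {
    set code 0
    while {1} {
        if {[llength $argv] == 0 || [llength $argv] > 2} {
            set code 1
            break
        }
        set startip [lindex $argv 0]
        # Simplified check, standing in for the exip pattern.
        if {![regexp {^\d+(\.\d+){3}$} $startip]} {
            set code 5
            break
        }
        # The actual work would go here.
        break
    }
    # Single exit point: common final actions go here.
    return $code
}

puts [run {}]             ;# 1
puts [run {not-an-ip}]    ;# 5
puts [run {192.0.2.1}]    ;# 0
```

    In a real script the last line would then be a single call of exit with the value returned by run.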
    The logic of the utility (cipl.tl) has expanded. The IPv4 address we start from no longer has to be less than the one we end with; the list is built from the first to the second in any case. The second parameter can also be a negative number, in which case the list goes in reverse order:
    > cipl.tl 192.0.2.1 -1
    192.0.2.1
    192.0.2.0
    > cipl.tl 192.0.2.1 192.0.2.0
    192.0.2.1
    192.0.2.0
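
    The behaviour shown in these examples can be condensed into a compact sketch. It is not the code from cipl.zip: the names ip2num, num2ip and ipseq are hypothetical, the arguments are assumed to be validated already, and the range check against minip and maxip is left out.

```tcl
# Hypothetical helpers; the names are not taken from cipl.tl.
proc ip2num {ip} {
    lassign [split $ip .] a b c d
    return [expr {$a * 0x1000000 + $b * 0x10000 + $c * 0x100 + $d}]
}
proc num2ip {n} {
    return [join [list [expr {($n >> 24) & 0xFF}] [expr {($n >> 16) & 0xFF}] \
        [expr {($n >> 8) & 0xFF}] [expr {$n & 0xFF}]] .]
}

# Build the sequence from start to finish; finish is either an address
# or a signed offset. No validation or range check is done here.
proc ipseq {start finish} {
    set nsip [ip2num $start]
    if {[string first . $finish] >= 0} {
        set nfip [ip2num $finish]
    } else {
        set nfip [expr {$nsip + $finish}]
    }
    set di [expr {$nfip >= $nsip ? 1 : -1}]
    set out {}
    # Stop one step past nfip in the chosen direction.
    for {set i $nsip} {$i != $nfip + $di} {incr i $di} {
        lappend out [num2ip $i]
    }
    return $out
}

puts [join [ipseq 192.0.2.1 -1] \n]
puts [join [ipseq 192.0.2.254 192.0.3.1] \n]
```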
    

    That is the whole story. The description turned out much longer than the code itself, and the situation with the original variant confirms the topic habrahabr.ru/blogs/complete_code/135340 . In the end, though, it came out not badly at all. As they say: "Measure seven times, cut once", plus there is now a working utility.

    All versions of the program: cipl.zip
    You can read about number systems on wikibooks - en.wikibooks.org/wiki/Numbering systems
    All you need to know about TCL is on www.tcl.tk/doc and on wiki.tcl.tk
