Type Control of Function Arguments in Lua

    The problem


    Lua is a dynamically typed language.

    This means that a type is associated not with a variable but with the value it holds:

    a = "the meaning of life" --> was a string,

    a = 42                    --> became a number

    This is convenient.
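    As a quick check, the built-in type function reports the type of the value currently stored, so the same variable can report different types over time. A minimal sketch, reusing the variable a from above:

```lua
-- type() inspects the value, not the variable that holds it.
a = "the meaning of life"
print(type(a))  --> string

a = 42
print(type(a))  --> number

a = print
print(type(a))  --> function
```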

    However, there are often cases when you want to strictly control the type of a value. The most common case is checking function arguments.

    Consider a naive example:

    function repeater(n, message)

      for i = 1, n do

        print(message)

      end

    end

     

    repeater(3, "foo") --> foo

                       --> foo

                       --> foo

    If we mix up the arguments of the repeater function, we get a runtime error:

    > repeater("foo", 3)
    stdin:2: 'for' limit must be a number
    stack traceback:
    	stdin:2: in function 'repeater'
    	stdin:1: in main chunk
    	[C]: ?
    

    "What 'for' is this about?!" the user of our function will say on seeing this error message.

    The function has suddenly stopped being a black box: its internals are now visible to the user.

    It will be even worse if we accidentally forget to pass the second argument:

    > repeater(3)
    nil
    nil
    nil
    

    No errors occurred, but the behavior is potentially incorrect.

    This happens because in Lua any parameter for which no argument was passed is set to nil inside the function.
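    The rule can be observed directly. In the sketch below, show_args and count_args are hypothetical helpers that are not part of the article's code: missing parameters become nil, extra arguments are silently dropped, and select("#", ...) reports how many arguments were actually passed.

```lua
-- Hypothetical helper: reports what a two-parameter function receives.
function show_args(a, b)
  return type(a), type(b)
end

-- Hypothetical helper: counts the arguments actually passed, nils included.
function count_args(...)
  return select("#", ...)
end

print(show_args(1, "x"))      --> number  string
print(show_args(1))           --> number  nil
print(show_args(1, "x", {}))  --> number  string  (the extra table is dropped)

print(count_args())           --> 0
print(count_args(nil))        --> 1
print(count_args(1, nil))     --> 2
```

    Note that an explicitly passed nil and a missing argument look the same to a fixed parameter list; only a vararg function can tell them apart.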

    Another typical error occurs when calling methods of objects:

    foo = {}

    function foo:setn(n)

      self.n_ = n

    end

    function foo:repeat_message(message)

      for i = 1, self.n_ do

        print(message)

      end

    end

     

    foo:setn(3)

    foo:repeat_message("bar") --> bar

                              --> bar

                              --> bar

    The colon is syntactic sugar, implicitly passing the object itself as the first argument, self. If we remove all the sugar from the example, we get the following:

    foo = {}

    foo.setn = function(self, n)

      self.n_ = n

    end

    foo.repeat_message = function(self, message)

      for i = 1, self.n_ do

        print(message)

      end

    end

     

    foo.setn(foo, 3)

    foo.repeat_message(foo, "bar") --> bar

                                   --> bar

                                   --> bar

    If you write a period instead of a colon when calling the method, self is not passed, and the first explicit argument takes its place:

    > foo.setn(3)
    stdin:2: attempt to index local 'self' (a number value)
    stack traceback:
    	stdin:2: in function 'setn'
    	stdin:1: in main chunk
    	[C]: ?
    > foo.repeat_message("bar")
    stdin:2: 'for' limit must be a number
    stack traceback:
    	stdin:2: in function 'repeat_message'
    	stdin:1: in main chunk
    	[C]: ?
    

    A short digression


    While the error message for setn is clear enough, the error from repeat_message looks mysterious at first glance.

    What happened? Let's take a closer look in the console.

    In the first case, we tried to write a value into a number at the index "n_":

    > (3).n_ = nil
    

    To which Lua quite rightly replied:

    stdin:1: attempt to index a number value
    stack traceback:
    	stdin:1: in main chunk
    	[C]: ?
    

    In the second case, we tried to read a value from a string at the same index "n_":

    > return ("bar").n_
    nil
    

    The explanation is simple: in Lua, the string type has a metatable attached that redirects index reads to the string table.

    > return getmetatable("a").__index == string
    true
    

    This enables the method-call shorthand on strings. The following three forms are equivalent:

    a = "A"

    print(string.rep(a, 3)) --> AAA

    print(a:rep(3))         --> AAA

    print(("A"):rep(3))     --> AAA

    Thus, any read of an index on a string is redirected to the string table.
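    The lookup rule can be verified with a few comparisons. This sketch assumes the default string metatable has not been modified by other code:

```lua
local s = "bar"

-- Known keys resolve to functions from the string table.
print(s.len == string.len)  --> true
print(s:upper())            --> BAR

-- Unknown keys are looked up in the string table as well and yield nil.
print(s.n_)                 --> nil
print(string.n_)            --> nil
```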

    Fortunately, writing is disabled:

    > return getmetatable("a").__newindex          
    nil
    > ("a").n_ = 3
    stdin:1: attempt to index a string value
    stack traceback:
    	stdin:1: in main chunk
    	[C]: ?
    

    The string table has no "n_" key, so the for statement complains that it was given nil instead of an upper bound:

    > for i = 1, string["n_"] do
    >>  print("bar")
    >> end
    stdin:1: 'for' limit must be a number
    stack traceback:
    	stdin:1: in main chunk
    	[C]: ?
    

    But we digress.

    Solution


    So, we want to control the argument types of our functions.

    That's simple: let's check them.

    function repeater(n, message)

      assert(type(n) == "number")

      assert(type(message) == "string")

      for i = 1, n do

        print(message)

      end

    end

     

    Let's see the result:

    > repeater(3, "foo")
    foo
    foo
    foo
    > repeater("foo", 3)
    stdin:2: assertion failed!
    stack traceback:
    	[C]: in function 'assert'
    	stdin:2: in function 'repeater'
    	stdin:1: in main chunk
    	[C]: ?
    > repeater(3)
    stdin:3: assertion failed!
    stack traceback:
    	[C]: in function 'assert'
    	stdin:3: in function 'repeater'
    	stdin:1: in main chunk
    	[C]: ?
    

    This is closer to the goal, but still not very clear.
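    One small step toward clearer output is the optional second argument of assert, which replaces the default "assertion failed!" text. A sketch of that variant follows; the message wording is only an example, and whether position information is prepended to the message may depend on the Lua version.

```lua
function repeater(n, message)
  assert(type(n) == "number", "bad n type: expected number")
  assert(type(message) == "string", "bad message type: expected string")
  for i = 1, n do
    print(message)
  end
end

-- pcall catches the error so that the message can be inspected.
local ok, err = pcall(repeater, "foo", 3)
print(ok)   --> false
print(err)  -- the text includes "bad n type: expected number"
```

    This names the offending argument, but the message still does not say which type was actually received, and the error is still attributed to repeater itself.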

    Fighting for clarity


    Let's try to improve error messages:

    function repeater(n, message)

      if type(n) ~= "number" then

        error(

            "bad n type: expected `number', got `" .. type(n) .. "'",

            2

          )

      end

      if type(message) ~= "string" then

        error(

            "bad message type: expected `string', got `"

            .. type(message) .. "'",

            2

          )

      end

     

      for i = 1, n do

        print(message)

      end

    end

    The second parameter of error is the level on the call stack that the reported error position refers to: level 1 (the default) is the function that called error, and level 2 is the function that called that one. Now the "blame" falls not on our function but on its caller.
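    The effect of the level can be observed with pcall. In this sketch all function names are made up for illustration; level 0 is included to show that it suppresses the position prefix entirely.

```lua
function check_blames_itself(v)
  if type(v) ~= "number" then
    error("expected number", 1)  -- position of this error() call
  end
end

function check_blames_caller(v)
  if type(v) ~= "number" then
    error("expected number", 2)  -- position of the call to this function
  end
end

function check_no_position(v)
  if type(v) ~= "number" then
    error("expected number", 0)  -- no position prefix at all
  end
end

function user_code(check)
  check("oops")  -- with level 2, this is the line that gets reported
  return true
end

-- The first two messages differ only in the reported line number.
print(select(2, pcall(user_code, check_blames_itself)))
print(select(2, pcall(user_code, check_blames_caller)))
print(select(2, pcall(user_code, check_no_position)))  --> expected number
```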

    The error messages are much better:

    > repeater(3, "foo")
    foo
    foo
    foo
    > repeater("foo", 3)
    stdin:1: bad n type: expected `number', got `string'
    stack traceback:
    	[C]: in function 'error'
    	stdin:3: in function 'repeater'
    	stdin:1: in main chunk
    	[C]: ?
    > repeater(3)
    stdin:1: bad message type: expected `string', got `nil'
    stack traceback:
    	[C]: in function 'error'
    	stdin:6: in function 'repeater'
    	stdin:1: in main chunk
    	[C]: ?
    

    But now error handling takes up five times as much code as the useful part of the function.

    Fighting for brevity


    Let's move the error handling into separate functions:

    function assert_is_number(v, msg)

      if type(v) == "number" then

        return v

      end

      error(

          (msg or "assertion failed") 

          .. ": expected `number', got `" 

          .. type(v) .. "'",

          3

        )

    end

     

    function assert_is_string(v, msg)

      if type(v) == "string" then

        return v

      end

      error(

          (msg or "assertion failed") 

          .. ": expected `string', got `" 

          .. type(v) .. "'",

          3

        )

    end

     

    function repeater(n, message)

      assert_is_number(n, "bad n type")

      assert_is_string(message, "bad message type")

     

      for i = 1, n do

        print(message)

      end

    end

    This is already usable.

    A more complete implementation of the assert_is_* family is available in typeassert.lua.
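    Since the two helpers above differ only in the type name, they could also be generated. The following is a sketch of that idea and not the contents of typeassert.lua; make_type_assert is a hypothetical name.

```lua
-- Hypothetical factory: builds an assert_is_<type> function for a type name.
function make_type_assert(expected_type)
  return function(v, msg)
    if type(v) == expected_type then
      return v
    end
    error(
        (msg or "assertion failed")
        .. ": expected `" .. expected_type
        .. "', got `" .. type(v) .. "'",
        3  -- blame the caller of the function that runs the check
      )
  end
end

assert_is_number = make_type_assert("number")
assert_is_string = make_type_assert("string")
assert_is_table = make_type_assert("table")

function repeater(n, message)
  assert_is_number(n, "bad n type")
  assert_is_string(message, "bad message type")
  for i = 1, n do
    print(message)
  end
end
```

    The generated functions behave like the handwritten ones, so existing call sites do not need to change.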

    Working with methods


    Let's rework the method implementation (assert_is_table is defined by analogy with assert_is_number):

    foo = {}

    function foo:setn(n)

      assert_is_table(self, "bad self type")

      assert_is_number(n, "bad n type")

      self.n_ = n

    end

    The error message looks a little confusing:

    > foo.setn(3)
    stdin:1: bad self type: expected `table', got `number'
    stack traceback:
    	[C]: in function 'error'
    	stdin:5: in function 'assert_is_table'
    	stdin:2: in function 'setn'
    	stdin:1: in main chunk
    	[C]: ?
    

    Using a period instead of a colon when calling a method is a very common mistake, especially among inexperienced users. Practice shows that the message for a failed self check should point at this mistake directly:

    function assert_is_self(v, msg)

      if type(v) == "table" then

        return v

      end

      error(

          (msg or "assertion failed")

          .. ": bad self (got `" .. type(v) .. "'); use `:'",

          3

        )

    end

     

    foo = {}

    function foo:setn(n)

      assert_is_self(self)

      assert_is_number(n, "bad n type")

      self.n_ = n

    end

    Now the error message is as clear as possible:

    > foo.setn(3)
    stdin:1: assertion failed: bad self (got `number'); use `:'
    stack traceback:
    	[C]: in function 'error'
    	stdin:5: in function 'assert_is_self'
    	stdin:2: in function 'setn'
    	stdin:1: in main chunk
    	[C]: ?
    

    We have achieved the desired functionality, but can usability be improved further?

    Improving usability


    I want to see clearly in the code what type each argument should have. Right now the type is baked into the function name assert_is_* and does not stand out.

    It would be better to be able to write this:

    function repeater(n, message)

      arguments(

          "number", n,

          "string", message

        )

     

      for i = 1, n do

        print(message)

      end

    end

    The type of each argument stands out clearly, and less code is needed than with assert_is_*. The declaration even resembles old-style C function definitions (also known as K&R style):

    void repeater(n, message)

      int n;

      char * message;

    {

      /* ... */

    }

    But back to Lua. Now that we know what we want, we can implement it:

    function arguments(...)

      local nargs = select("#", ...)

      for i = 1, nargs, 2 do

        local expected_type, value = select(i, ...)

        if type(value) ~= expected_type then

          error(

              -- math.floor avoids printing "#1.0" on Lua 5.3 and later

              "bad argument #" .. math.floor((i + 1) / 2)

              .. " type: expected `" .. expected_type

              .. "', got `" .. type(value) .. "'",

              3

            )

        end

      end

    end

    Let's see what we got:

    > repeater("bar", 3)
    stdin:1: bad argument #1 type: expected `number', got `string'
    stack traceback:
    	[C]: in function 'error'
    	stdin:6: in function 'arguments'
    	stdin:2: in function 'repeater'
    	stdin:1: in main chunk
    	[C]: ?
    > repeater(3)
    stdin:1: bad argument #2 type: expected `string', got `nil'
    stack traceback:
    	[C]: in function 'error'
    	stdin:6: in function 'arguments'
    	stdin:2: in function 'repeater'
    	stdin:1: in main chunk
    	[C]: ?
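    The implementation of arguments leans on two forms of select: select("#", ...) returns the number of arguments, counting nils, and select(i, ...) returns every argument from position i onward, so the first two can be captured as a pair. A small sketch of just that behavior, with a hypothetical helper collect_pairs:

```lua
-- Hypothetical helper: collects (type name, value) pairs the same way
-- arguments() walks its parameter list.
function collect_pairs(...)
  local result = {}
  local nargs = select("#", ...)  -- trailing nils are counted as well
  for i = 1, nargs, 2 do
    -- select(i, ...) returns arguments i, i + 1, ...; only two are kept.
    local expected_type, value = select(i, ...)
    result[#result + 1] = { expected_type = expected_type, value = value }
  end
  return result
end

local seen = collect_pairs("number", 3, "string", nil)
print(#seen)                  --> 2
print(seen[2].expected_type)  --> string
print(seen[2].value)          --> nil
```

    This is why the missing message in repeater(3) is still checked: the nil occupies a slot in the argument list, which the length operator on a table built from the arguments would not count reliably.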
    

    Disadvantages


    We have lost the custom error message, but that is not a big loss: the argument number is enough to tell which argument is meant.

    Our function lacks checks on the correctness of the call itself: that an even number of arguments was passed, and that all type names are valid. The reader is invited to add these checks.
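    One possible way to add those checks is sketched below. The name checked_arguments and the wording of its messages are assumptions made for this example; misuse of the checker itself is reported at level 2, so that it points at the function whose author wrote the faulty call.

```lua
-- Type names that type() can return.
local known_types = {
  ["nil"] = true, boolean = true, number = true, string = true,
  table = true, ["function"] = true, thread = true, userdata = true,
}

-- Hypothetical variant of arguments() that also validates its own call.
function checked_arguments(...)
  local nargs = select("#", ...)
  if nargs % 2 ~= 0 then
    error("arguments: expected an even number of parameters, got " .. nargs, 2)
  end
  for i = 1, nargs, 2 do
    local expected_type, value = select(i, ...)
    if not known_types[expected_type] then
      error(
          "arguments: bad type name at position " .. i
          .. ": `" .. tostring(expected_type) .. "'",
          2
        )
    end
    if type(value) ~= expected_type then
      error(
          "bad argument #" .. math.floor((i + 1) / 2)
          .. " type: expected `" .. expected_type
          .. "', got `" .. type(value) .. "'",
          3
        )
    end
  end
end

function repeater(n, message)
  checked_arguments("number", n, "string", message)
  for i = 1, n do
    print(message)
  end
end
```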

    Working with methods


    The variant for methods differs only in that we must additionally check self:

    function method_arguments(self, ...)

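      if type(self) ~= "table" then

        error(

            "bad self (got `" .. type(self) .. "'); use `:'",

            3

          )

      end

      -- Note: arguments() raises with level 3 counted from itself, so from

      -- here a type error is attributed to the method, not to its caller.

      arguments(...)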

    end

     

    foo = {}

    function foo:setn(n)

      method_arguments(

          self,

          "number", n

        )

      self.n_ = n

    end

    The full implementation of the *arguments() family of functions is available in args.lua.

    Conclusion


    We have created a convenient mechanism for checking function arguments in Lua. It lets you declare the expected argument types clearly and check the passed values against them efficiently.

    The time spent on assert_is_* was not wasted either. Function arguments are not the only place in Lua where types need to be controlled, and the assert_is_* family makes such checks more explicit.
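    Because each assert_is_* function returns the value it checked, it can also be used inline, for example when reading fields from a table. A minimal sketch with an illustrative settings table follows; load_settings and the field names are hypothetical, and assert_is_number is repeated so that the example stands alone.

```lua
function assert_is_number(v, msg)
  if type(v) == "number" then
    return v
  end
  error(
      (msg or "assertion failed")
      .. ": expected `number', got `" .. type(v) .. "'",
      3
    )
end

-- Hypothetical function: validates fields while copying them.
function load_settings(config)
  return {
    retries = assert_is_number(config.retries, "bad retries type"),
    timeout = assert_is_number(config.timeout, "bad timeout type"),
  }
end

local settings = load_settings({ retries = 3, timeout = 1.5 })
print(settings.retries, settings.timeout)  --> 3  1.5

-- A wrong field type is reported with the custom message.
print(pcall(load_settings, { retries = "three", timeout = 1.5 }))
```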

    Alternatives


    There are other solutions. See the Lua Type Checking page on the lua-users wiki. The most interesting one is the solution with decorators:

    random =

      docstring[[Compute random number.]] ..

      typecheck("number", '->', "number") ..

      function(n)

        return math.random(n)

      end
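    The chaining works because Lua lets a table define the `..` operator through the __concat metamethod, so each decorator can consume the function on its right. The following is a much simplified sketch of that mechanism, not the code from the wiki: this typecheck handles only argument types before "->" and a single return type after it, and docstring stores its text in a hypothetical docs table.

```lua
-- Hypothetical registry mapping decorated functions to their docstrings.
docs = {}

-- Builds a decorator object: `decorator .. fn` evaluates to apply(fn).
local function make_decorator(apply)
  return setmetatable({}, {
    __concat = function(_, fn)
      return apply(fn)
    end,
  })
end

function docstring(text)
  return make_decorator(function(fn)
    docs[fn] = text
    return fn
  end)
end

function typecheck(...)
  local spec = { ... }
  local arrow = #spec + 1
  for i, name in ipairs(spec) do
    if name == "->" then
      arrow = i
      break
    end
  end
  local return_type = spec[arrow + 1]  -- nil when no "->" is given

  return make_decorator(function(fn)
    return function(...)
      for i = 1, arrow - 1 do
        local value = select(i, ...)
        if type(value) ~= spec[i] then
          error("bad argument #" .. i .. " type: expected `" .. spec[i]
            .. "', got `" .. type(value) .. "'", 2)
        end
      end
      local result = fn(...)  -- simplification: one return value only
      if return_type and type(result) ~= return_type then
        error("bad return type: expected `" .. return_type
          .. "', got `" .. type(result) .. "'", 2)
      end
      return result
    end
  end)
end

random =
  docstring[[Compute random number.]] ..
  typecheck("number", "->", "number") ..
  function(n)
    return math.random(n)
  end

print(docs[random])  --> Compute random number.
```

    Since `..` is right-associative, typecheck wraps the function first and docstring then registers the wrapped function.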

    Metalua includes a types extension for describing variable types.

    With this extension you can write:

    -{ extension "types" }

     

    function sum (x :: list(number)) :: number

      local acc :: number = 0

      for i=1, #x do acc=acc+x[i] end

      return acc

    end

    But this is not quite Lua. :-)
