One simple task. Fast, beautiful or clean?

    I believe that 99% of Python developers have solved this problem in one way or another, since it is part of the standard set of tasks that one well-known company gives to applicants for a Python Developer position.

    # There are two lists of different lengths. The first contains keys, the second contains values.
    # Write a function that builds a dictionary from these keys and values.
    # If a key has no matching value, the dictionary must hold None for it.
    # Values that have no matching key must be ignored.

    Below, out of curiosity, I list the solutions I analyzed.

    To evaluate the correctness of the code, I sketched a simple unit test.

    Option 1

    import unittest

    def dict_from_keys(keys, values):
        res = dict()
        for num, key in enumerate(keys):
            try:
                res[key] = values[num]
            except IndexError:
                res[key] = None
        return res

    class DictFromKeysTestCase(unittest.TestCase):
        def test_dict_from_keys_more_keys(self):
            keys = range(1000)
            values = range(900)
            for _ in range(10 ** 5):
                result = dict_from_keys(keys, values)
            self.assertEqual(keys, result.keys())

        def test_dict_from_keys_more_values(self):
            keys = range(900)
            values = range(1000)
            for _ in range(10 ** 5):
                result = dict_from_keys(keys, values)
            self.assertEqual(keys, result.keys())
    

    This is the first solution I came across. Running the unit test:

    random1st@MacBook-Pro ~/P/untitled> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 26.118s
    OK

    What did I notice right away? dict() is a function call, while {} is a literal, that is, a syntax construct. Replacing the dictionary initialization gives:

    random1st@MacBook-Pro ~/P/untitled> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 25.828s
    OK
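
One way to see the difference between `dict()` and `{}` is to compare the bytecode the interpreter generates for each. The following is only a sketch, assuming CPython 3 (the constructs used in the article suggest Python 2, where the same distinction holds); `opnames` is a hypothetical helper introduced for illustration, and the exact instruction names vary between interpreter versions.

```python
import dis


def opnames(source):
    """Return the bytecode instruction names for a single expression."""
    code = compile(source, "<sketch>", "eval")
    return [instr.opname for instr in dis.get_instructions(code)]


# The literal is built by a dedicated instruction.
literal_ops = opnames("{}")

# The call has to look up the name `dict` and then perform a call.
call_ops = opnames("dict()")

print(literal_ops)
print(call_ops)
```

The literal form should show a `BUILD_MAP` instruction, while the call form loads the name `dict` and executes a call instruction, which is where the small overhead comes from.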

    The gain is a trifle, but nice, although it could just as well be attributed to measurement error. The next option made my eyes bleed, but I include it here anyway:

    Option 2

    def dict_from_keys(keys, values):
        res = {}
        it = iter(values)
        nullValue = False
        for key in keys:
            try:
                res[key] = it.next() if not nullValue else None
            except StopIteration:
                nullValue = True
                res[key] = None
        return res
    

    Test result:

    random1st@MacBook-Pro ~/P/untitled> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 33.312s
    OK
    

    No comment.
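
For comparison, the iterator idea behind Option 2 can be written much more compactly. This is a sketch assuming Python 3, where `it.next()` no longer exists and the built-in `next()` accepts a default value; it is not one of the solutions measured in the article, so no timing is claimed for it.

```python
def dict_from_keys(keys, values):
    it = iter(values)
    # next(it, None) returns None once the values are exhausted,
    # so no flag and no try/except are needed.
    return {key: next(it, None) for key in keys}


more_keys = dict_from_keys(["a", "b", "c"], [1, 2])
more_values = dict_from_keys(["a"], [1, 2])
```

Extra values are ignored simply because the comprehension stops once the keys run out.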

    Option 3

    The following solution:

    def dict_from_keys(keys, values):
        return {key: None if idx >= len(values) else values[idx] for idx, key in enumerate(keys)}
    

    Test result:

    random1st@MacBook-Pro ~/P/untitled [1]> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 26.797s
    OK
    

    As you can see, there is no significant speedup. Another variation on the theme:

    Option 4

    def dict_from_keys(keys, values):
        return dict((len(keys) > len(values)) and map(None, keys, values) or zip(keys, values))
    

    Result:

    random1st@MacBook-Pro ~/P/untitled> python -m unittest dict_from_keys 
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 20.600s
    OK
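
Option 4 relies on two Python 2 idioms: `map(None, keys, values)`, which pairs the inputs up and pads the shorter one with `None`, and the `cond and a or b` trick, which only works because `a` is a non-empty list whenever `cond` is true. Neither carries over to Python 3, where `map(None, ...)` fails. A rough equivalent, assuming Python 3 and using `itertools.zip_longest` as the closest counterpart, might look like this:

```python
from itertools import zip_longest


def dict_from_keys(keys, values):
    # A conditional expression replaces the fragile `cond and a or b` idiom.
    if len(keys) > len(values):
        pairs = zip_longest(keys, values)  # pads missing values with None
    else:
        pairs = zip(keys, values)  # stops at the last key
    return dict(pairs)


more_keys = dict_from_keys(range(3), range(2))
more_values = dict_from_keys(range(2), range(3))
```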

    Option 5

    def dict_from_keys(keys, values):
        result = dict.fromkeys(keys, None)
        result.update(zip(keys, values))
        return result
    

    Result:

    random1st@MacBook-Pro ~/P/untitled [1]> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 17.584s
    OK
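
Option 5 works in two passes, and it may help to see the intermediate state. The walkthrough below is a small illustration with made-up sample data (the names `keys`, `values`, `defaults` and `result` are chosen here, not taken from the article):

```python
keys = ["a", "b", "c"]
values = [1, 2]

# Pass 1: every key gets the default value None.
defaults = dict.fromkeys(keys, None)

# Pass 2: zip() stops at the shorter input, so only keys that actually
# have a value are overwritten, and surplus values are never seen.
result = dict(defaults)
result.update(zip(keys, values))

print(defaults)
print(result)
```

Both passes run inside built-in functions, with no Python-level loop over the keys.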

    As expected, using built-in functions gives a significant speedup. Is it possible to achieve an even more impressive result?

    Option 6

    def dict_from_keys(keys, values):
        return dict(zip(keys, values + [None] * (len(keys) - len(values))))
    

    Result:

    random1st@MacBook-Pro ~/P/untitled> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 14.212s
    OK
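
Option 6 depends on two details: multiplying a list by a negative number yields an empty list, and `values` must be a real list for `+` to work. The latter holds in Python 2, where `range()` returns a list, but not in Python 3, where adding a `range` to a list raises `TypeError`. A sketch that makes both points explicit, assuming Python 3 and converting `values` up front (a choice made here, not part of the original solution):

```python
def dict_from_keys(keys, values):
    values = list(values)  # assumption: accept any iterable, not only lists
    # When there are more values than keys the count is negative,
    # and [None] * negative_number is simply [].
    padding = [None] * (len(keys) - len(values))
    return dict(zip(keys, values + padding))


more_keys = dict_from_keys(range(3), range(2))
more_values = dict_from_keys(range(2), range(3))
```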

    Faster:

    Option 7

    import itertools
    def dict_from_keys(keys, values):
        return dict(itertools.izip_longest(keys, values[:len(keys)]))
    

    Result:

    random1st@MacBook-Pro ~/P/untitled> python -m unittest dict_from_keys
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 10.190s
    OK
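
The slice `values[:len(keys)]` in Option 7 is not decoration: without it, the padding would be applied to the keys, and surplus values would end up under a `None` key. The sketch below illustrates this, assuming Python 3 (where `izip_longest` is named `zip_longest`) and using `itertools.islice` in place of the slice so that iterators are accepted too; that substitution is made here and is not the author's code.

```python
from itertools import islice, zip_longest


def dict_from_keys(keys, values):
    # islice plays the role of values[:len(keys)].
    return dict(zip_longest(keys, islice(values, len(keys))))


# Without truncation the surplus value is stored under the key None.
naive = dict(zip_longest(["a"], [1, 2]))

more_keys = dict_from_keys(["a", "b", "c"], [1, 2])
more_values = dict_from_keys(["a"], iter([1, 2]))
```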

    A reasonable question arises: is it possible to get anything faster than this solution? Obviously, if the computation cannot be made faster, it should be made lazy. This option is clearly questionable, but it all depends on the context. In particular, the following code passes the tests, but the result is no longer quite a regular Python dictionary:

    Option 8

    class DataStructure(dict):
        def __init__(self, *args, **kwargs):
            super(DataStructure, self).__init__(*args, **kwargs)
            self._values = None
            self._keys = None

        @classmethod
        def dict_from_keys_values(cls, keys, values):
            obj = cls()
            obj._values = values[:len(keys)]
            obj._keys = keys
            return obj

        def __getitem__(self, key):
            try:
                return super(DataStructure, self).__getitem__(key)
            except KeyError:
                try:
                    idx = self._keys.index(key)
                    self._keys.pop(idx)
                    super(DataStructure, self).__setitem__(
                        key, self._values.pop(idx)
                    )
                except ValueError:
                    raise KeyError
                except IndexError:
                    super(DataStructure, self).__setitem__(key, None)
                return super(DataStructure, self).__getitem__(key)

        def keys(self):
            for k in self._keys:
                yield k
            for k in super(DataStructure, self).keys():
                yield k
    

    random1st@MacBook-Pro ~/P/untitled [1]> python -m unittest dict_from_keys 
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 1.219s
    OK
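
The lazy idea can also be expressed without subclassing `dict`. The following is a minimal sketch, assuming Python 3 and unique keys; `LazyPairs` is a hypothetical name, and this is a different construction from Option 8, not a port of it. Construction costs almost nothing, but each lookup does a linear search, so the work is deferred rather than removed.

```python
from collections.abc import Mapping


class LazyPairs(Mapping):
    """Read-only mapping view over two sequences; nothing is copied up front."""

    def __init__(self, keys, values):
        self._keys = keys      # assumed to be a sequence of unique keys
        self._values = values  # assumed to be a sequence

    def __getitem__(self, key):
        try:
            idx = self._keys.index(key)  # linear search on every lookup
        except ValueError:
            raise KeyError(key) from None
        # Keys without a matching value map to None.
        return self._values[idx] if idx < len(self._values) else None

    def __iter__(self):
        return iter(self._keys)

    def __len__(self):
        return len(self._keys)


lazy = LazyPairs([1, 2, 3], [10, 20])
materialized = dict(lazy)
```

Whether such a trade-off is acceptable depends on how often the result is actually read.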

    I will add that personally I like the 6th option best, in terms of both readability and speed.
    P.S. Once again I am struck by the number of commenters on an absolutely useless article.
