Python import automation

    Before:

    import math
    import os.path
    import requests
    # 100500 other imports
    print(math.pi)
    print(os.path.join('my', 'path'))
    print(requests.get)

    After:

    import smart_imports
    smart_imports.all()
    print(math.pi)
    print(os_path.join('my', 'path'))
    print(requests.get)
    
    It so happens that I have been developing an open source browser since 2012 as its only programmer. In Python, naturally. A browser is not the simplest thing: the main part of the project now has more than 1000 modules and more than 120,000 lines of Python code. With the satellite projects, the total is about one and a half times larger.

    At some point I got tired of fiddling with the stacks of imports at the top of every file, and I decided to deal with the problem once and for all. That is how the smart_imports library was born (GitHub, PyPI).

    The idea is quite simple. Any complex project eventually develops its own naming conventions. If those conventions are turned into more formal rules, any entity can be imported automatically based on the name of the variable associated with it.

    For example, you don't need to write import math to use math.pi: we can already tell that math here is the standard library module.

    Smart imports supports Python >= 3.5. The library is fully covered by tests (coverage > 95%). I have been using it myself for a year now.

    The details follow below.

    How does it work in general?


    So, the code from the example at the top works as follows:

    1. When smart_imports.all() is called, the library builds the AST of the module from which the call is made.
    2. It finds the uninitialized variables.
    3. It runs the name of each variable through a sequence of rules that try to find, by name, the module (or module attribute) to import. Once a rule has found the required entity, the remaining rules are not checked.
    4. The modules found are loaded, initialized and placed in the global namespace (or the required attributes of those modules are placed there).
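Step 2 can be illustrated with the standard ast module. The sketch below is not the library's implementation: find_unresolved_names is a hypothetical helper, and it is deliberately simplified (it ignores scopes and a few binding forms such as except ... as name). It only shows the general idea of collecting names that are read but never bound.

```python
import ast
import builtins


def find_unresolved_names(source):
    """Return names that are read somewhere in `source` but never bound in it."""
    tree = ast.parse(source)
    bound = set(dir(builtins))  # print, len and friends are always available
    used = set()

    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                used.add(node.id)
            else:  # Store or Del context
                bound.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import os.path" binds the name "os"
                bound.add((alias.asname or alias.name).split('.')[0])

    return sorted(used - bound)


source = (
    "import smart_imports\n"
    "smart_imports.all()\n"
    "print(math.pi)\n"
    "print(os_path.join('my', 'path'))\n"
)
print(find_unresolved_names(source))  # ['math', 'os_path']
```

The names reported here are the ones that would then be passed to the import rules.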

    Uninitialized variables are looked for throughout the code, including code that uses new syntax.

    Auto-import is enabled only for those project components that explicitly call smart_imports.all(). In addition, using smart imports does not rule out conventional imports. This lets you adopt the library gradually and resolve complex cyclic dependencies.

    The meticulous reader will notice that the module's AST is built twice:

    • the first time by CPython, when the module is imported;
    • the second time by smart_imports, during the call to smart_imports.all().

    The AST really can be built only once (by integrating into the module import process with the import hooks introduced in PEP 302), but this solution slows down the import.

    Why do you think so?
    Comparing the performance of the two implementations (with and without hooks), I concluded that when importing a module CPython builds the AST in its internal (C-level) data structures. Converting them to Python data structures is more expensive than building the tree from the source with the ast module.

    Of course, the AST of each module is built and analyzed only once per launch.

    Default Import Rules


    The library can be used without additional configuration. By default, it imports modules according to the following rules:

    1. Looks for a module with exactly this name next to the current one (in the same directory).
    2. Checks the modules of the standard library:
      • by exact name match for top-level packages;
      • for nested packages and modules, by compound names in which dots are replaced with underscores. For example, os.path will be imported if the code uses the variable os_path.
    3. Looks for an installed third-party package with exactly this name. For example, the well-known requests package.
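The name-to-module convention can be sketched with importlib. This is an illustration only: import_by_variable_name is a hypothetical helper, it simply tries to import the candidates, and it does not reproduce how the library separates local, standard and third-party modules.

```python
import importlib


def import_by_variable_name(name):
    """Try to import a module for `name`; return the module or None."""
    # Exact match first ('math'), then the dotted form ('os_path' -> 'os.path').
    candidates = [name]
    if '_' in name:
        candidates.append(name.replace('_', '.'))

    for candidate in candidates:
        try:
            return importlib.import_module(candidate)
        except (ImportError, TypeError, ValueError):
            # No such module, or the candidate is not a valid absolute module name.
            continue
    return None


os_path = import_by_variable_name('os_path')
print(os_path.join('my', 'path'))
```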

    Performance


    Smart imports does not affect the performance of the running program, but it increases its start-up time.

    Because the AST is built a second time, the time of the first run increases by about 1.5–2 times. For small projects this is insignificant. In large projects, start-up time suffers more from the dependency structure between modules than from the import time of any particular module.

    When smart imports becomes popular, I will rewrite the AST processing in C, which should significantly reduce the start-up cost.

    To speed up loading, the results of processing a module's AST can be cached on the file system. Caching is enabled in the config. Of course, the cache is invalidated when the source changes.
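The article does not describe the cache format, so the following is only one possible way to get a cache that stops matching when the source changes: the file name is derived from a hash of the source. The function load_or_compute and the JSON layout are assumptions made for this sketch, not the library's actual behavior.

```python
import hashlib
import json
import pathlib


def load_or_compute(source, cache_dir, compute):
    """Return compute(source), reusing a cached result while the source is unchanged."""
    digest = hashlib.sha256(source.encode('utf-8')).hexdigest()
    cache_file = pathlib.Path(cache_dir) / (digest + '.json')

    if cache_file.exists():
        # Same source text as before: skip the expensive analysis.
        return json.loads(cache_file.read_text(encoding='utf-8'))

    result = compute(source)
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result), encoding='utf-8')
    return result
```

Here compute stands for the AST analysis step; any change to the source produces a different hash, so the stale entry is never read again.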

    Start-up time is affected both by the list of module search rules and by their order, since some rules use standard Python functionality to search for modules. You can avoid this cost by explicitly mapping names to modules with the “Custom Names” rule (see below).

    Configuration


    The default configuration has been described previously. It should be enough to work with the standard library in small projects.

    Default config
    {
        "cache_dir": null,
        "rules": [{"type": "rule_local_modules"},
                  {"type": "rule_stdlib"},
                  {"type": "rule_predefined_names"},
                  {"type": "rule_global_modules"}]
    }


    If necessary, a more complex config can be put on the file system.

    An example of a complex config (from a browser).

    When smart_imports.all() is called, the library determines the location of the calling module on the file system and searches for a smart_imports.json file, moving from the module's directory up to the root. If such a file is found, it is used as the configuration for the current module.
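The upward search can be sketched with pathlib. The function find_config is a hypothetical helper written for this illustration; only the file name smart_imports.json comes from the article.

```python
import json
import pathlib

CONFIG_NAME = 'smart_imports.json'


def find_config(module_path):
    """Return the parsed config nearest to `module_path`, or None if there is none."""
    start = pathlib.Path(module_path).resolve().parent
    # Check the module's own directory first, then every parent up to the root.
    for directory in [start, *start.parents]:
        candidate = directory / CONFIG_NAME
        if candidate.is_file():
            return json.loads(candidate.read_text(encoding='utf-8'))
    return None
```

Because the nearest file wins, a config placed in a subdirectory overrides one placed higher up for the modules below it.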

    You can use several different configs (placing them in different directories).

    There are not many configuration options so far:

    {
        // Directory for storing the AST cache.
        // If omitted or null, the cache is not used.
        "cache_dir": null|"string",
        // List of rule configs, in the order in which they are applied.
        "rules": []
    }

    Import Rules


    The order in which rules are listed in the config determines the order in which they are applied. The first rule that matches stops the search.
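The "first match wins" behavior amounts to a simple loop. In this sketch the rules are plain functions and the two toy rules are invented for the example; the library's real rule classes are not shown in the article.

```python
def apply_rules(rules, variable):
    """Return the result of the first rule that recognizes `variable`, else None."""
    for rule in rules:
        found = rule(variable)
        if found is not None:
            return found  # later rules are not consulted
    return None


# Two toy rules; real rules would return something that describes the import.
def toy_stdlib_rule(variable):
    return 'stdlib:' + variable if variable in {'math', 'json'} else None


def toy_global_rule(variable):
    return 'global:' + variable


print(apply_rules([toy_stdlib_rule, toy_global_rule], 'math'))      # stdlib:math
print(apply_rules([toy_stdlib_rule, toy_global_rule], 'requests'))  # global:requests
```

Swapping the two rules changes the outcome for math, which is why the order in the config matters.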

    The rule rule_predefined_names appears in many of the config examples below; it is needed so that built-in functions (for example, print) are recognized correctly.

    Rule 1: Predefined Names


    The rule allows you to ignore predefined names such as __file__ and built-in functions such as print.

    Example
    # config:
    # {
    #    "rules": [{"type": "rule_predefined_names"}]
    # }
    import smart_imports
    smart_imports.all()
    # we will not look for a module named __file__,
    # even though this variable is not initialized in the code
    print(__file__)

    Rule 2: Local Modules


    Checks whether there is a module with the given name next to the current module (in the same directory). If there is, the rule imports it.

    Example
    # config:
    # {
    #    "rules": [{"type": "rule_predefined_names"},
    #              {"type": "rule_local_modules"}]
    # }
    #
    # code on the file system:
    #
    # my_package
    # |-- __init__.py
    # |-- a.py
    # |-- b.py
    # b.py
    import smart_imports
    smart_imports.all()
    # The module a.py will be imported
    print(a)

    Rule 3: Global Modules


    Tries to import a module directly by name. For example, the requests module.

    Example
    # config:
    # {
    #    "rules": [{"type": "rule_predefined_names"},
    #              {"type": "rule_global_modules"}]
    # }
    #
    # install an additional package
    #
    # pip install requests
    import smart_imports
    smart_imports.all()
    # The requests module will be imported
    print(requests.get('http://example.com'))

    Rule 4: Custom Names


    Maps a name to a specific module or to one of its attributes. The mapping is specified in the rule config.

    Example
    # config:
    # {
    #    "rules": [{"type": "rule_predefined_names"},
    #              {"type": "rule_custom",
    #               "variables": {"my_import_module": {"module": "os.path"},
    #                             "my_import_attribute": {"module": "random", "attribute": "seed"}}}]
    # }
    import smart_imports
    smart_imports.all()
    # The example uses standard library modules,
    # but any other module can be imported the same way
    print(my_import_module)
    print(my_import_attribute)
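Assuming the rule simply binds each configured name to the module or attribute it points at, the effect of the "variables" mapping can be approximated with importlib. The function resolve_custom is hypothetical and exists only for this sketch.

```python
import importlib


def resolve_custom(variable, variables):
    """Return the module or attribute configured for `variable`, or None."""
    entry = variables.get(variable)
    if entry is None:
        return None
    module = importlib.import_module(entry['module'])
    if 'attribute' in entry:
        return getattr(module, entry['attribute'])
    return module


# The "variables" mapping from the config example above.
variables = {
    'my_import_module': {'module': 'os.path'},
    'my_import_attribute': {'module': 'random', 'attribute': 'seed'},
}

my_import_module = resolve_custom('my_import_module', variables)        # the os.path module
my_import_attribute = resolve_custom('my_import_attribute', variables)  # the random.seed function
```

In plain Python the same bindings would be written as import os.path as my_import_module and from random import seed as my_import_attribute.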

    Rule 5: Standard Modules


    Checks whether the name is a standard library module. For example, math, or os.path, which is referred to as os_path.

    It works faster than the rule for importing global modules because it checks for the module in a cached list. The lists for each version of Python come from here: github.com/jackmaney/python-stdlib-list

    Example
    # config:
    # {
    #    "rules": [{"type": "rule_predefined_names"},
    #              {"type": "rule_stdlib"}]
    # }
    import smart_imports
    smart_imports.all()
    print(math.pi)

    Rule 6: Import by Prefix


    Imports a module by name from the package associated with its prefix. This is convenient when you have several packages that are used throughout the code. For example, the modules of the utils package can be accessed with the prefix utils_.

    Example
    # config:
    # {
    #    "rules": [{"type": "rule_predefined_names"},
    #              {"type": "rule_prefix",
    #               "prefixes": [{"prefix": "utils_", "module": "my_package.utils"}]}]
    # }
    #
    # code on the file system:
    #
    # my_package
    # |-- __init__.py
    # |-- utils
    # |-- |-- __init__.py
    # |-- |-- a.py
    # |-- |-- b.py
    # |-- subpackage
    # |-- |-- __init__.py
    # |-- |-- c.py
    # c.py
    import smart_imports
    smart_imports.all()
    print(utils_a)
    print(utils_b)
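Judging by the example, the rule turns utils_a into the module my_package.utils.a. A minimal sketch of that name mapping, with resolve_prefix as a hypothetical helper and the matching logic assumed rather than taken from the library:

```python
def resolve_prefix(variable, prefixes):
    """Map a prefixed variable name to a full module name, or return None."""
    for entry in prefixes:
        prefix = entry['prefix']
        # Require something after the prefix: 'utils_' alone names no module.
        if variable.startswith(prefix) and len(variable) > len(prefix):
            return entry['module'] + '.' + variable[len(prefix):]
    return None


prefixes = [{'prefix': 'utils_', 'module': 'my_package.utils'}]
print(resolve_prefix('utils_a', prefixes))  # my_package.utils.a
print(resolve_prefix('utils_b', prefixes))  # my_package.utils.b
```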

    Rule 7: The module from the parent package


    If you have subpackages with the same name in different parts of the project (for example, tests or migrations), you can let them look for modules to import by name in their parent packages.

    Example
    # config:
    # {
    #    "rules": [{"type": "rule_predefined_names"},
    #              {"type": "rule_local_modules_from_parent",
    #               "suffixes": [".tests"]}]
    # }
    #
    # code on the file system:
    #
    # my_package
    # |-- __init__.py
    # |-- a.py
    # |-- tests
    # |-- |-- __init__.py
    # |-- |-- b.py
    # b.py
    import smart_imports
    smart_imports.all()
    print(a)

    Rule 8: Binding to another package


    For modules from a specific package, this rule allows imports to be looked up by name in other packages (listed in the config). In my case, it was useful when I did not want to extend the previous rule (the module from the parent package) to the entire project.

    Example
    # config:
    # {
    #    "rules": [{"type": "rule_predefined_names"},
    #              {"type": "rule_local_modules_from_namespace",
    #               "map": {"my_package.subpackage_1": ["my_package.subpackage_2"]}}]
    # }
    #
    # code on the file system:
    #
    # my_package
    # |-- __init__.py
    # |-- subpackage_1
    # |-- |-- __init__.py
    # |-- |-- a.py
    # |-- subpackage_2
    # |-- |-- __init__.py
    # |-- |-- b.py
    # a.py
    import smart_imports
    smart_imports.all()
    print(b)

    Adding Your Own Rules


    Adding your own rule is pretty simple:

    1. Inherit from the class smart_imports.rules.BaseRule.
    2. Implement the necessary logic.
    3. Register the rule using the smart_imports.rules.register method.
    4. Add the rule to the config.
    5. ???
    6. Profit

    An example can be found in the implementation of the existing rules.

    Profit


    Multiline lists of imports at the top of each source file have disappeared.

    The number of lines has decreased. Before the browser switched to smart imports, it had 6688 lines responsible for importing. After the transition, 2084 remained (two smart_imports lines per file + 130 imports made explicitly inside functions and similar places).

    A nice bonus was the standardization of names in the project. The code has become easier to read and to write. There is no need to think about the names of imported entities: there are a few clear rules that are easy to follow.

    Development plans


    I like the idea of defining code properties by variable names, so I will try to develop it both within smart imports and in other projects.

    Regarding smart imports, I plan:

    1. Add support for new versions of Python.
    2. Explore the possibility of relying on current community practices for type annotations.
    3. Explore the possibility of making imports lazy.
    4. Implement utilities for automatically generating a config from source code and for refactoring sources to use smart_imports.
    5. Rewrite part of the code in C to speed up work with the AST.
    6. Develop integrations with linters and IDEs if they have trouble analyzing code without explicit imports.

    In addition, I am interested in your opinion about the default behavior of the library and import rules.

    Thank you for making it through this wall of text :-D


    Will you try smart_imports in your project?

    • Already trying it: 5.6% (9 votes)
    • Too much magic for me: 15.6% (25 votes)
    • I'm happy with things as they are: 20.6% (33 votes)
    • My IDE solves all my import problems: 15.6% (25 votes)
    • I don't understand what is written here: 5% (8 votes)
    • Heresy! Burn it in the name of Guido!: 37.5% (60 votes)
