Pdef: a compiler and interface description language for the web

    At the beginning of last year, I had the idea of writing my own interface description language (IDL), similar to Protobuf or Thrift but intended for the web. I hoped to finish it in about three months. In the end, it took a little over a year to reach the first stable version.

    Pdef (protocol definition language) is a statically typed interface description language that supports JSON and HTTP RPC. It lets you describe interfaces and data structures once and then generate code for specific programming languages. Pdef is suitable for public APIs, internal services, distributed systems, and configuration files, and as a format for stored data, caches, and message queues.

    The main functionality:

    • A well-developed system of packages, modules and namespaces.
    • Support for circular imports and type dependencies (with some limitations).
    • A simple type system based on a clear separation of interfaces and data structures.
    • Inheritance of messages (analogue of structs) and interfaces.
    • Support for call chains, for example github.user(1).repos().all().
    • JSON as a data format and HTTP RPC for data transfer.
    • Ability to use other formats and RPC.
    • Plug-in code generators (Java, Python, and Objective-C are officially supported).
    • Optional code generation, i.e. Pdef allows you to serialize data and send requests by hand.

    Why do we need Pdef? First of all, to increase productivity and to simplify the development and maintenance of client-server, service-oriented, and distributed code. It also combines the documentation and the description of an API, and it lets you build vertically integrated systems in which the overhead of interaction between individual components is reduced.

    Message description example:
    message Human {
        id          int64;
        name        string;
        birthday    datetime;
        sex         Sex;
        continent   ContinentName;
    }
    

    Usage examples (generated code examples):
    JSON
    {
        "id": 1,
        "name": "Ivan Korobkov",
        "birthday": "1987-08-07T00:00Z",
        "sex": "male",
        "continent": "europe"
    }
    

    Java
    Human human = new Human()
        .setId(1)
        .setName("John")
        .setSex(Sex.MALE)
        .setContinent(ContinentName.ASIA);
    String json = human.toJson();
    Human another = Human.fromJson(json);
    

    Python
    human = Human(id=1, name="John")
    human.birthday = datetime.datetime(1900, 1, 2)
    s = human.to_json()
    another = Human.from_json(s)
    

    Objective-C
    Human *human = [[Human alloc]init];
    human.id = 1;
    human.name = @"John";
    human.sex = Sex_MALE;
    human.continent = ContinentName_EUROPE;
    NSError *error = nil;
    NSData *data = [human toJsonError:&error];
    Human *another = [Human messageWithData:data error:&error];
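
    Because code generation is optional, the same JSON can also be produced by hand. The following is only a sketch using the Python standard library, not generated Pdef code: the function names are hypothetical, and the datetime format and lowercase enum values are assumptions read off the JSON sample above.

```python
import json
from datetime import datetime

# Assumption: datetimes are written in UTC in the same shape as the JSON sample.
DATETIME_FORMAT = "%Y-%m-%dT%H:%MZ"


def human_to_json(id, name, birthday, sex, continent):
    """Serialize a Human by hand (hypothetical helper, not the Pdef API)."""
    payload = {
        "id": id,
        "name": name,
        "birthday": birthday.strftime(DATETIME_FORMAT),
        # Assumption: enum values are serialized as lowercase strings.
        "sex": sex.lower(),
        "continent": continent.lower(),
    }
    return json.dumps(payload)


def human_from_json(text):
    """Parse the JSON back into a dict with a real datetime object."""
    data = json.loads(text)
    data["birthday"] = datetime.strptime(data["birthday"], DATETIME_FORMAT)
    return data


text = human_to_json(1, "Ivan Korobkov", datetime(1987, 8, 7), "MALE", "EUROPE")
restored = human_from_json(text)
```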
    


    Installation

    Pdef consists of a compiler, plug-in code generators, and language-specific bindings. The compiler and the code generators are written in Python.

    Install the compiler as a Python package from PyPI:

    pip install pdef-compiler
    

    Alternatively, download the archive of a specific version from the project's release page, unzip it, and run:

    python setup.py install
    

    Installation of code generators (download links are on the pages of specific languages):

    pip install pdef-java
    pip install pdef-python
    pip install pdef-objc
    

    That's it, the compiler is ready to use. To verify that everything is installed correctly, run the following command; it downloads the sample package and checks it.

    pdefc -v check https://raw.github.com/pdef/pdef/master/example/world.yaml
    

    During installation, each code generator adds its own commands to the compiler; you can see them in the help:

    pdefc -h
        ...
        generate-python     Python code generator.
        generate-objc       Objective-C code generator.
        generate-java       Java code generator.
    pdefc generate-python -h
    


    Usage

    Create a package file myproject.yaml:

    package:
        name: myproject
        modules:
            - posts
            - photos
    

    Create the module files:

    // File posts.pdef
    namespace myproject;
    import myproject.photos;
    interface Posts {
        get(id int64) Post;
        @post
        create(title string @post, text string @post) Post;
    }
    message Post {
        id      int64;
        title   string;
        text    string;
        photos  list<Photo>;
    }
    

    // File photos.pdef
    namespace myproject;
    message Photo {
        id  int64;
        url string;
    }
    

    Run the code generation:

    pdefc generate-java myproject.yaml --out generated-java/
    pdefc generate-python myproject.yaml --out generated-python/
    pdefc generate-objc myproject.yaml --out generated-objc/
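
    In a build script, the three commands above can be driven from one place. This is a sketch only: it reuses exactly the command names and the --out flag shown above, while the helper names and the output directory prefix are hypothetical.

```python
import subprocess

# The generator commands listed in the article.
GENERATORS = ("java", "python", "objc")


def generate_commands(package_file, out_prefix="generated-"):
    """Build one pdefc argument list per installed code generator."""
    return [
        ["pdefc", f"generate-{lang}", package_file, "--out", f"{out_prefix}{lang}/"]
        for lang in GENERATORS
    ]


def run_all(package_file):
    """Run every generator and stop at the first failure."""
    for argv in generate_commands(package_file):
        subprocess.run(argv, check=True)


commands = generate_commands("myproject.yaml")
```

    Calling run_all("myproject.yaml") requires pdefc and the generators to be installed; generate_commands alone only builds the argument lists.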
    

    Code generators support mapping Pdef modules and namespaces to the modules and namespaces of specific programming languages. You can learn more from the help of each command.


    Pdef Language Guide 1.1


    Syntax


    Pdef's syntax is similar to Java / C++, but with the order of types and fields / arguments inverted. All identifiers must begin with a Latin letter and contain only Latin letters, digits, and underscores. See the grammar description (BNF).

    Example:
    namespace example;
    interface MyInterface {
        method(
            arg0    int32,
            arg1    string
        ) list<string>;
    }
    message MyMessage  {
        field0      int32;
        field1      string;
    }
    enum Number {
        ONE, TWO;
    }
    


    Comments


    There are two types of comments: single-line comments and multi-line documentation comments. Documentation comments can be placed at the very beginning of a module, before a type definition (message, interface, or enumeration), or before a method. Single-line comments are stripped during parsing; multi-line comments are preserved and used by code generators for documentation.

    /**
     * This is a multi-line module docstring.
     * It is available to the code generators.
     *
     * Start each line with a star because it is used
     * as line whitespace/text delimiter when
     * the docstring is indented (as method docstrings).
     */
    namespace example;
    // This is a one line comment, it is stripped from the source code.
    /** Interface docstring. */
    interface ExampleInterface {
        /**
         * Method docstring.
         */
        hello(name string) string;
    }
    


    Packages and Modules


    Packages


    Pdef files must be organized into packages. Each package is described by a single yaml file that contains the package name and lists its modules and dependencies. Circular dependencies between packages are prohibited. Module names are automatically mapped to files: dots are replaced with the system directory separator and the .pdef extension is appended. For example, users.events corresponds to the file users/events.pdef. Each dependency specifies a package name and, optionally, the path to its yaml file, separated by a space. Dependency paths can be set and overridden when running console commands.

    Example package file:
    package:
      # Package name
      name: example
      # Additional information
      version: 1.1
      url: https://github.com/pdef/pdef/
      author: Ivan Korobkov 
      description: Example application
      # Module files.
      modules:
        - example
        - photos
        - users
        - users.profile
      # Other packages this package depends on.
      dependencies:
        - common
        - pdef_test https://raw.github.com/pdef/pdef/1.1/test/test.yaml
    

    And its file structure (the api directory is optional):
    api/
        example.yaml
        example.pdef
        photos.pdef
        users.pdef
        users/profile.pdef
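
    The mapping from module names to files described above can be sketched in a few lines of Python. The function name and the base_dir parameter are hypothetical; only the rule itself (dots become directory separators, .pdef is appended) comes from the text.

```python
import os


def module_to_path(module_name, base_dir=""):
    """Map a module name such as "users.profile" to its .pdef file path."""
    relative = module_name.replace(".", os.sep) + ".pdef"
    return os.path.join(base_dir, relative) if base_dir else relative


# The modules from the example package file above.
paths = [module_to_path(m, "api") for m in ("example", "photos", "users", "users.profile")]
```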
    


    Modules and Namespaces


    A module is a separate *.pdef file that describes messages, interfaces, and enumerations. Each module must declare its namespace immediately after the optional documentation comment. All types in the same namespace must have unique names. Different packages may use the same namespace.

    Namespaces in Pdef are broader than in Java / C# / C++ and do not have to correspond to the structure of files and directories; module names serve that purpose. Typically, one or more packages share the same namespace. Possible examples: twitter, github, etc.

    /** Module with a namespace. */
    namespace myproject;
    message Hello {
        text    string;
    }
    


    Imports


    Imports are similar to include in other languages: they give one module access to the types of another. Imports are placed immediately after the module's namespace declaration. Modules are imported using the package name and the file path without the .pdef extension, with a period instead of the directory separator. When a module's name matches the package name, the module can be imported by the package name alone.

    Individual imports:
    namespace example;
    import package;           // Equivalent to "import package.package" when package/module names match.
    import package.module;
    

    Batch imports:
    namespace example;
    from package.module import submodule0, submodule1;
    


    Circular imports and dependencies


    Circular imports are allowed as long as the types of one module do not inherit from the types of another module and vice versa. If they do, try splitting the modules into smaller ones or merging them into a single file. Circular dependencies between types are allowed.

    These restrictions are sufficient to support most programming languages. Interpreted languages like Ruby or Python are also supported: the Pdef compiler ensures that, where inheritance is involved, the modules have a clear tree-like order of execution; in all other cases the modules can be executed in any order. For more information on how circular dependencies are implemented in specific languages, see Pdef Generated and Language-Specific Code.

    Example of circular imports and dependencies:
    // users.pdef
    namespace example;
    from example import photos;     // Circular import.
    message User {
        bestFriend  User;           // References a declaring type.
        photo       Photo;          // References a type from another module.
    }
    

    // photos.pdef
    namespace example;
    from example import users;      // Circular import.
    message Photo {
        user    User;               // References a user from another module.
    }
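
    The ordering guarantee for inheritance can be illustrated with a topological sort. This is not the compiler's actual implementation, only a sketch using the standard graphlib module (Python 3.9+); the module names and the function are hypothetical. Only inheritance edges go into the graph: plain field references such as User and Photo pointing at each other above do not constrain the order.

```python
from graphlib import CycleError, TopologicalSorter


def module_execution_order(inherits_from):
    """Order modules so that modules with parent types come before their heirs.

    inherits_from maps a module name to the set of modules whose types
    it inherits from. Raises CycleError when two modules inherit from
    each other's types, which the rules above prohibit.
    """
    return list(TopologicalSorter(inherits_from).static_order())


# Hypothetical modules: users and photos both extend types from base.
order = module_execution_order({"users": {"base"}, "photos": {"base"}, "base": set()})

try:
    module_execution_order({"a": {"b"}, "b": {"a"}})
    cycle_rejected = False
except CycleError:
    cycle_rejected = True
```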
    


    Name resolution


    Within a single namespace, the local type name is used, for example MyMessage; across namespaces, the fully qualified name is used: namespace.MyMessage.
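
    As a rough illustration of this rule, a resolver might look like the sketch below. The function, the set of known types, and the convention that a dotted name is already fully qualified are all assumptions made for the example.

```python
def resolve_type(name, current_namespace, known_types):
    """Resolve a type reference to its fully qualified name.

    Assumption: a name containing a dot is already fully qualified;
    anything else is looked up in the current namespace.
    """
    qualified = name if "." in name else f"{current_namespace}.{name}"
    if qualified not in known_types:
        raise LookupError(f"unknown type: {qualified}")
    return qualified


known = {"example.MyMessage", "other.Photo"}
local_name = resolve_type("MyMessage", "example", known)
foreign_name = resolve_type("other.Photo", "example", known)
```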


    Type system


    Pdef has a simple static type system built on the separation of interfaces and data structures.

    Void


    void is a special type that indicates that a method does not return a result.

    Data types


    Primitive types


    • bool: boolean value (true / false)
    • int16: signed 16-bit integer
    • int32: signed 32-bit integer
    • int64: signed 64-bit integer
    • float: 32-bit floating-point number
    • double: 64-bit floating-point number
    • string: Unicode string
    • datetime: date and time without a time zone

    Containers


    • list: An ordered list whose items can be any data type.
    • set: An unordered set of unique values whose elements can be any data type.
    • map: A key-value container; keys must be primitive types, values can be any data type.

    message Containers {
        numbers     list<int32>;
        tweets      list<Tweet>;
        ids         set<int64>;
        colors      set<Color>;
        userNames   map<int64, string>;
        photos      map<string, list<Photo>>;
    }
    

    Enumerations


    An enumeration is a collection of unique string values. Enumerations are also used to indicate discriminators in inheritance.

    enum Sex {
        MALE, FEMALE;
    }
    enum EventType {
        USER_REGISTERED,
        USER_BANNED,
        PHOTO_UPLOADED,
        PHOTO_DELETED,
        MESSAGE_RECEIVED;
    }
    

    Messages and exceptions


    A message (an analogue of a struct) is a collection of statically typed named fields. Messages support simple and polymorphic inheritance. Messages defined as exceptions (exception) can additionally be used to declare the exceptions of interfaces.

    • All message fields must have unique names.
    • The field type must be a data type.
    • A field may reference the message in which it is defined (self-referencing).


    /** Example message. */
    message User {
        id          int64;
        name        string;
        age         int32;
        profile     Profile;
        friends     set<User>;  // Self-referencing.
    }
    /** Example exception. */
    exception UserNotFound {
        userId      int64;
    }
    

    Inheritance


    Inheritance allows one message to inherit the fields of another message or exception. With simple inheritance, descendants cannot be unpacked from the parent; polymorphic inheritance exists for that.

    • Cyclical inheritance is prohibited.
    • A message can have only one parent.
    • Overriding fields in descendants is prohibited.
    • The descendant and parent must be either messages or exceptions, i.e. they cannot be mixed.
    • A parent must be defined before its descendants and cannot be imported from dependent modules (see the section on circular imports for details).

    Inheritance example:
    message EditableUser {
        name        string;
        sex         Sex;
        birthday    datetime;
    }
    message User : EditableUser {
        id              int32;
        lastSeen        datetime;
        friendsCount    int32;
        likesCount      int32;
        photosCount     int32;
    }
    message UserWithDetails : User {
        photos      list<Photo>;
        friends     list<User>;
    }
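
    Two of the rules above, no cyclical inheritance and no field overriding, are easy to check mechanically. The sketch below is a hypothetical validator, not the Pdef compiler's code; the single-parent rule is modeled by a dict that maps each message to at most one parent.

```python
def collect_fields(message, parents, fields):
    """Return all field names of a message, parent fields first.

    parents maps a message name to its single parent (if any);
    fields maps a message name to the fields it declares itself.
    Raises ValueError on cyclical inheritance or an overridden field.
    """
    chain = []
    current = message
    while current is not None:
        if current in chain:
            raise ValueError(f"cyclical inheritance at {current}")
        chain.append(current)
        current = parents.get(current)

    declared_in = {}
    for name in reversed(chain):  # walk from the root parent down
        for field in fields[name]:
            if field in declared_in:
                raise ValueError(f"{name}.{field} overrides {declared_in[field]}.{field}")
            declared_in[field] = name
    return list(declared_in)


# The messages from the inheritance example above.
parents = {"User": "EditableUser", "UserWithDetails": "User"}
fields = {
    "EditableUser": ["name", "sex", "birthday"],
    "User": ["id", "lastSeen", "friendsCount", "likesCount", "photosCount"],
    "UserWithDetails": ["photos", "friends"],
}
all_fields = collect_fields("UserWithDetails", parents, fields)
```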
    

    Polymorphic inheritance


    Polymorphic inheritance allows descendants to be unpacked based on the value of a discriminator field. A parent together with all of its descendants forms an inheritance tree. A descendant can inherit from another descendant (not only from the root parent), but only within the same tree.

    For polymorphic inheritance you need:

    • Create an enumeration that will serve as the set of values for the discriminator.
    • Add a field with the type of this enumeration to the parent message and mark it as @discriminator.
    • Indicate the discriminator value of each of the descendants as message Subtype : Base(DiscriminatorEnum.VALUE).

    Limitations:

    • The parent and all descendants must be defined in one package.
    • The discriminator type must be defined before the parent and cannot be imported from a dependent module.
    • In one inheritance tree there can be only one discriminator field.
    • You cannot inherit a polymorphic message without specifying a discriminator value.

    An example of polymorphic inheritance:
    /** Discriminator enum. */
    enum EventType {
        USER_EVENT,
        USER_REGISTERED,
        USER_BANNED,
        PHOTO_UPLOADED,
    }
    /** Base event with a discriminator field. */
    message Event {
        type   EventType @discriminator;    // The type field marked as @discriminator
        time   datetime;
    }
    /** Base user event. */
    message UserEvent : Event(EventType.USER_EVENT) {
        user    User;
    }
    message UserRegistered : UserEvent(EventType.USER_REGISTERED) {
        ip      string;
        browser string;
        device  string;
    }
    message UserBanned : UserEvent(EventType.USER_BANNED) {
        moderatorId int64;
        reason      string;
    }
    message PhotoUploaded : Event(EventType.PHOTO_UPLOADED) {
        photo   Photo;
        userId  int64;
    }
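
    On the client side, unpacking by discriminator amounts to a lookup from the discriminator value to a subtype. The sketch below shows the idea in plain Python; it is not the code that the Pdef generators emit, the registry and decorator are hypothetical, and the lowercase enum values are an assumption based on the JSON sample earlier in the article.

```python
# Hypothetical registry: discriminator value -> message class.
SUBTYPES = {}


def subtype(discriminator):
    """Class decorator that registers a message under its discriminator value."""
    def register(cls):
        SUBTYPES[discriminator] = cls
        return cls
    return register


class Event:
    def __init__(self, **fields):
        self.__dict__.update(fields)

    @staticmethod
    def from_dict(data):
        # Pick the subtype by the "type" field; fall back to the base message.
        cls = SUBTYPES.get(data.get("type"), Event)
        return cls(**data)


@subtype("user_event")
class UserEvent(Event):
    pass


@subtype("user_registered")
class UserRegistered(UserEvent):
    pass


@subtype("photo_uploaded")
class PhotoUploaded(Event):
    pass


event = Event.from_dict({"type": "user_registered", "ip": "127.0.0.1"})
```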
    


    Interfaces


    An interface is a collection of statically typed methods. Each method has a unique name, named arguments, and result. The result can be any type of data, including other interfaces.

    A method is called terminal when it returns a data type or void. A method is called an interface method when it returns an interface. A chain of method calls must end with a terminal method, for example app.users().register("John Doe").

    Terminal methods can be marked as @post to set apart the methods that modify data. Their arguments may also be marked as @post. HTTP RPC sends these methods as POST requests and puts the @post arguments into the request body.

    Terminal methods not marked as @post may have @query arguments, which are sent as the HTTP query string.

    • Interface methods must have unique names.
    • Arguments must have unique names.
    • Arguments must be data types.
    • Only terminal methods can be marked as @post.
    • Only terminal methods not marked as @post can have @query arguments.
    • The last method in the call chain should be terminal.
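
    The @post, @query, and call-chain rules from this list can be expressed as a small check. This is a sketch with a hypothetical Method record, not the compiler's own validation code.

```python
from dataclasses import dataclass


@dataclass
class Method:
    name: str
    terminal: bool            # True when it returns a data type or void
    post: bool = False        # marked as @post
    query_args: tuple = ()    # names of the arguments marked as @query


def validate_chain(chain):
    """Raise ValueError if a call chain breaks the rules listed above."""
    if not chain or not chain[-1].terminal:
        raise ValueError("the last method in a call chain must be terminal")
    for method in chain:
        if method.post and not method.terminal:
            raise ValueError(f"{method.name}: only terminal methods can be @post")
        if method.query_args and (method.post or not method.terminal):
            raise ValueError(f"{method.name}: @query needs a terminal non-@post method")
    return True


# app.users().register("John Doe"), with register marked as @post.
ok = validate_chain([
    Method("users", terminal=False),
    Method("register", terminal=True, post=True),
])
```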

    Interface Example:
    interface Application {
        /** Void method. */
        void0() void;
        /** Interface method. */
        service(arg int32) Service;
        /** Method with 3 args. */
        method(arg0 int32, arg1 string, arg2 list<string>) string;
    }
    interface Service {
        /** Terminal method with @query args. */
        query(limit int32 @query, offset int32 @query) list<string>;
        /** Terminal post method with one of args marked as @post. */
        @post
        mutator(arg0 int32, postArg string @post) string;
    }
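
    To make the @post and @query behaviour concrete, here is one possible way a call chain could be turned into an HTTP request. The article only states that @post methods are sent as POST requests with @post arguments in the body and that @query arguments go into the query string; the URL layout used below (method names and remaining arguments as path segments) and the form encoding of the body are assumptions for illustration, so consult the RPC specification for the actual format.

```python
from urllib.parse import quote, urlencode


def build_request(calls):
    """Turn a call chain into (http_method, url_path, body).

    Each call is a dict with "name" and optional "args" (plain arguments),
    "query" (@query arguments), "post" (True for @post methods) and
    "post_args" (@post arguments). The path layout is an assumption.
    """
    segments, query, body, http_method = [], {}, {}, "GET"
    for call in calls:
        segments.append(call["name"])
        segments.extend(quote(str(arg), safe="") for arg in call.get("args", ()))
        query.update(call.get("query", {}))
        if call.get("post"):
            http_method = "POST"
            body.update(call.get("post_args", {}))
    path = "/" + "/".join(segments)
    if query:
        path += "?" + urlencode(query)
    return http_method, path, urlencode(body)


# application.service(1).query(limit=10, offset=0)
get_request = build_request([
    {"name": "service", "args": [1]},
    {"name": "query", "query": {"limit": 10, "offset": 0}},
])

# application.service(1).mutator(5, postArg="hello world")
post_request = build_request([
    {"name": "service", "args": [1]},
    {"name": "mutator", "args": [5], "post": True, "post_args": {"postArg": "hello world"}},
])
```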
    

    Interface Inheritance


    Interfaces can inherit other interfaces.

    • Overriding methods is prohibited.
    • A descendant can have only one parent.
    • If the parent declares an exception, its descendants must either declare no exception or declare the parent's exception.

    Interface inheritance example:
    interface BaseInterface {
        method() void;
    }
    interface SubInterface : BaseInterface {
        anotherMethod() void;
    }
    

    Exceptions


    Exceptions are declared on root interfaces with @throws(Exception). The root interface is the interface from which all calls begin. Exceptions declared on other interfaces in the call chain are ignored. To support multiple exceptions, use polymorphic inheritance or composition. Usually there is one root interface, for example Github or Twitter, and one exception.

    Example of polymorphic exceptions:
    @throws(AppException)
    interface Application {
        users() Users;
        photos() Photos;
        search() Search;
    }
    enum AppExceptionCode {
        AUTH_EXC,
        VALIDATION_EXC,
        FORBIDDEN_EXC
    }
    exception AppException {
        type AppExceptionCode @discriminator;
    }
    exception AuthExc : AppException(AppExceptionCode.AUTH_EXC) {}
    exception ValidationExc : AppException(AppExceptionCode.VALIDATION_EXC) {}
    exception ForbiddenExc : AppException(AppExceptionCode.FORBIDDEN_EXC) {}
    


    Conclusion

    Writing a draft version of the compiler was fairly easy; I think it was ready after about a month of work in my spare time. The rest of the year went into making Pdef relatively simple, unambiguous, and easy to use. Generics, polymorphic inheritance with multiple discriminators, overriding exceptions in call chains, an open type system (which let you use your own native types, like native mytype), weak typing (where a field or method result had the type object and clients were supposed to unpack it themselves), and much more did not make it into the stable version of the language. As a result, I hope we have a simple, easy-to-read, and easy-to-use language.

    Why is there no full support for REST? Initially, it was planned, but the specification and functionality were already quite voluminous, so REST was replaced with a simpler implementation of HTTP RPC. In future versions, it may appear. You can read more about RPC in the specification, and see examples on the pages of specific language bindings. Links are at the end of the article.

    I would like to share my impressions of using the language as a user rather than as its author. Over the past year I have used it in several projects, in some of them even as an alpha version. I like Pdef. It promotes loose coupling between components, unifies types, interfaces, and documentation, and frees programmers from routinely duplicating code in different languages.

    I think, as I wrote at the beginning of the article, that it greatly reduces the overhead of organizing the interaction between different systems, including mobile clients, websites, API servers, internal services, distributed systems, push notification servers, queues, and data storage systems. In the end, all of them can share the same data types and interfaces. At the same time, there is no technological lock-in, because under the hood it is, by default, plain JSON and HTTP.

    References

