NSNJSON. 道 (Final Article)

    道 is the way. In this final article about the NSNJSON format, I want to talk about the path that led me to invent it.

    In the comments on my previous articles (“Complicated Simplified JSON” and “JSON for Brackets Lovers”), readers repeatedly asked about the purpose, complexity, usability and applicability of this format. So I am glad to tell everyone who asked: your wait is over!




    Contents


        Introduction
        Task number 1. Representing documents in hierarchical data structures
        Task number 2. Driver implementation
        About the NSNJSON format
        Conclusion

    Introduction


    It all started when I came across an interesting NoSQL DBMS, InterSystems Caché. InterSystems maintains a blog on Habrahabr where you can read about the internal structure of the database. I will just mention that the ESA uses InterSystems Caché in the GAIA project (thanks to tsafin and morrison for the correction).

    To follow the rest of the article, you need to know that data in Caché is stored in globals (you can think of a global as a hierarchical data structure). More details about globals can be found in the following articles:

    1. GlobalsDB is a universal NoSQL database. Part 1,
    2. GlobalsDB is a universal NoSQL database. Part 2.

    As part of my master's program, a project was launched to implement a document-oriented NoSQL DBMS on top of Caché. The research question behind the project is how the speed, performance and reliability of such an implementation compare with an existing solution. My favorite, MongoDB, was chosen as the reference document-oriented NoSQL DBMS.

    So what did I have at the start?

    1. Document-oriented NoSQL MongoDB.
    2. Hierarchical NoSQL Caché.


    Task number 1. Representing documents in hierarchical data structures


    It would seem that JSON documents are themselves hierarchical, so the task should pose no difficulty. However, the devil is in the details: as it turned out, a global can store a value that is either binary, a string or a number. Lists are also supported. In other words, a leaf of the tree (a global's value) can contain a number, a string, binary data or a list. I was at the very beginning of my journey, so I decided against storing JSON in binary form and chose strings and numbers instead. This decision, in turn, meant I had to figure out how to store JSON data with only strings and numbers at my disposal.

    Let me remind you that the JSON format defines seven kinds of values: null, number, string, true, false, array and object. Thus, I had to come up with a representation scheme for each of them, having only the ability to build hierarchies and to use strings and numbers as leaf values of the tree. One more requirement was imposed on the scheme: uniqueness. This condition guarantees that JSON data can be unambiguously recovered from globals.

    Let me explain this point a little.

    Consider the JSON type null. A simple scheme could be proposed for it: to save a JSON null value to a global, create a leaf and set the empty string "" as its value. However, a counterexample comes up very quickly: the scheme stops being unambiguous as soon as we need to save a JSON string equal to the empty string. For this reason, I decided to switch to another, still fairly simple, scheme.
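
    The counterexample can be reproduced in a few lines. This is only a minimal sketch: naive_leaf is a hypothetical helper that mimics the rejected scheme, and a plain Python value stands in for a leaf of a global.

```python
def naive_leaf(value):
    """Return the leaf value the rejected naive scheme would store.

    Hypothetical helper: a plain Python value stands in for a leaf of a global.
    """
    if value is None:
        return ""      # JSON null is stored as the empty string
    return value       # strings and numbers are stored as they are


# JSON null and the JSON string "" produce the same leaf value,
# so the original value cannot be recovered unambiguously.
collision = naive_leaf(None) == naive_leaf("")
print(collision)
```

    Any scheme that maps two different JSON values to the same leaf has this problem, which is why the uniqueness requirement matters.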

    Scheme description
    JSON null


    JSON number (value)


    For example, for the value
    2015
    the representation would be:


    JSON string (value)


    For example, for the value
    "R&D"
    the representation would be:


    JSON true


    JSON false


    JSON array


    For example, for an array
    [ 2015, "R&D" ]
    the representation would be:


    JSON object


    For example, for an object

    { "year": 2015, "section": "R&D" }

    the representation would be:



    This scheme can be considered the ancestor of the NSNJSON format.

    Now that the JSON representation scheme was ready, the next step on my path awaited me: I needed to develop a driver for this document-oriented NoSQL DBMS.


    Task number 2. Driver implementation


    The implementation of the driver consists of two stages:

    1. developing the stored code (in Caché, in the Caché ObjectScript language),
    2. developing the code that accesses the Caché driver and passes data to the stored code.

    To transfer data between my driver and the Caché driver, I chose a simple format: a string. My driver received JSON, converted it to a string and passed it to the Caché driver, which in turn passed the string to the stored code. The stored code parsed the string and then applied the rules for representing JSON data in globals.

    However, a surprise awaited me!

    During development and debugging, I discovered some interesting facts about the JSON parser I used in the Caché stored code:

    • JSON null is translated to "",
    • JSON true is translated to 1,
    • JSON false is translated to 0,
    • fields starting with _ are ignored.

    Thus, I needed to solve the following problems:

    • preserving type information,
    • preserving fields starting with _.
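
    To make the loss of information concrete, here is a small simulation. It is not the Caché parser: lossy_parse is a hypothetical stand-in that only mimics the four behaviors listed above, and the recursion into arrays and nested objects is my assumption.

```python
def lossy_parse(value):
    """Hypothetical stand-in that mimics the parser behavior described above."""
    if value is None:
        return ""                      # null becomes the empty string
    if value is True:
        return 1                       # true becomes 1
    if value is False:
        return 0                       # false becomes 0
    if isinstance(value, dict):
        # Fields whose names start with "_" are silently dropped.
        return {k: lossy_parse(v) for k, v in value.items()
                if not k.startswith("_")}
    if isinstance(value, list):
        return [lossy_parse(v) for v in value]
    return value                       # numbers and strings pass through


doc = {"_id": 213, "active": True, "count": 1, "note": None, "title": ""}
parsed = lossy_parse(doc)
print(parsed)
```

    After parsing, "active" and "count" are indistinguishable, as are "note" and "title", and the _id field is gone entirely.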

    The solution to these problems was the NSNJSON format I developed.

    So, for null values, the following representation was proposed:

    
    { "t": "null" }
    

    For true values, the following representation was proposed:

    
    { "t": "boolean", "v": 1 }
    

    For false values, the following representation was proposed:

    
    { "t": "boolean", "v": 0 }
    

    The problem with _ was solved by introducing the “n” field, which holds the field name.

    So, for the _id field with a value of 213, the representation would be:

    
    { "n": "_id", "t": "number", "v": 213 }
    

    Thus, this format solved all the previously mentioned problems.
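
    The representations above can be sketched as a pair of functions. This is an illustrative sketch and not the code of the published drivers: encode and decode are hypothetical names, and the handling of strings, arrays and objects is an assumption extrapolated from the pattern shown for null, booleans and numbers.

```python
def encode(value, name=None):
    """Wrap a JSON-like Python value in a typed, NSNJSON-style presentation."""
    if value is None:
        node = {"t": "null"}
    elif isinstance(value, bool):      # checked before numbers: bool is an int subclass
        node = {"t": "boolean", "v": 1 if value else 0}
    elif isinstance(value, (int, float)):
        node = {"t": "number", "v": value}
    elif isinstance(value, str):
        node = {"t": "string", "v": value}                      # assumption
    elif isinstance(value, list):
        node = {"t": "array", "v": [encode(v) for v in value]}  # assumption
    elif isinstance(value, dict):
        # Assumption: an object is a list of fields, each carrying its name in "n".
        node = {"t": "object", "v": [encode(v, k) for k, v in value.items()]}
    else:
        raise TypeError(f"not a JSON value: {value!r}")
    if name is not None:
        node = {"n": name, **node}     # the field name travels as data, not as a key
    return node


def decode(node):
    """Restore the original value from its typed presentation."""
    kind = node["t"]
    if kind == "null":
        return None
    if kind == "boolean":
        return node["v"] == 1
    if kind in ("number", "string"):
        return node["v"]
    if kind == "array":
        return [decode(item) for item in node["v"]]
    if kind == "object":
        return {field["n"]: decode(field) for field in node["v"]}
    raise ValueError(f"unknown type: {kind!r}")


doc = {"_id": 213, "active": True, "note": None, "title": ""}
restored = decode(encode(doc))
print(restored == doc)
```

    The encoded structure contains only objects, arrays, strings and numbers, and its only keys are "n", "t" and "v", so nothing in it is affected by the parser quirks described earlier.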


    About the NSNJSON format


    I decided to split the format off into a separate project and named it NSNJSON (Not So Normal JSON). I then shared this format with the respected Habrahabr community in my article “Complicated Simplified JSON”, and, in the article “JSON for Brackets Lovers”, a modification of the format that I find interesting, in which JSON data is represented using only numbers, strings and arrays.

    The NSNJSON project is published on GitHub.

    Two drivers are implemented for it:

    1. NSNJSON Node.js Driver
    2. NSNJSON Java Driver


    Conclusion


    So the final article about NSNJSON has come to an end. I have talked about the difficulties I encountered and how I overcame them.
    Finally, I want to note once again that this was my 道 (path), and I walked it exactly this way. At each step one could have gone differently, choosing a different solution to the problems that arose, but that would not have been my way...
