NSNJSON. 道 (Final Article)
道 is the way. In this final article about the NSNJSON format, I want to talk about the path that led me to invent it.
In the comments on my previous articles (“Complicated Simplified JSON” and “JSON for Brackets Lovers”), readers repeatedly asked about the purpose, complexity, usability, and applicability of this format. So I hasten to congratulate everyone who asked: the wait is over!
Contents
Introduction
Task 1. Representing documents on hierarchical data structures
Task 2. Driver implementation
About the NSNJSON format
Conclusion
Introduction
It all started when I got acquainted with an interesting NoSQL DBMS, InterSystems Caché. InterSystems maintains a blog on Habrahabr where you can read about the internals of the database. I will only mention that the ESA uses InterSystems Caché in the GAIA project (thanks to tsafin and morrison for the correction).
To follow the rest of the story, you need to know that data in Caché is stored in globals (you can think of a global as a hierarchical data structure). More details about globals can be found in the following articles:
As part of my master's program, a project was launched to implement a document-oriented NoSQL DBMS on top of Caché. The research question behind the project is how the speed, performance, and reliability of such an implementation compare with an existing solution. My favorite, MongoDB, was chosen as the reference document-oriented NoSQL DBMS.
So what did I have at the start?
1. MongoDB, a document-oriented NoSQL DBMS.
2. Caché, a hierarchical NoSQL DBMS.
Task 1. Representing documents on hierarchical data structures
It would seem that JSON documents are themselves hierarchical, so the task should pose no difficulty. However, the devil is in the details: as it turned out, a global can store a value that is binary, a string, or a number, and lists are also supported. In other words, a leaf of the tree (a global value) can contain a number, a string, binary data, or a list. I was at the very beginning of my journey and therefore decided not to store JSON in binary form, preferring strings and numbers. That decision, in turn, required figuring out how to store JSON data with only strings and numbers at hand.
Let me remind you that the JSON format defines seven kinds of values: null, number, string, true, false, array, and object. Thus, I had to come up with a representation scheme for each JSON type, having at my disposal only the ability to build hierarchies and to use strings and numbers as the values of tree leaves. There was one more requirement for the scheme: unambiguity. This condition guarantees that the JSON data can be recovered from the globals without ambiguity.
Let me explain this point.
Consider the JSON type null. A simple scheme suggests itself: when a JSON null is saved to a global, create a leaf and set the empty string "" as its value. However, the first counterexample arrives very quickly: the scheme stops being unambiguous the moment we need to save a JSON string value equal to the empty string. So I decided to switch to another, still rather simple, scheme.
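The collision described above can be reproduced in a few lines. This is only an illustrative sketch: the function names `naive_save` and `naive_load` are hypothetical, and a plain Python value stands in for a leaf of a global.

```python
def naive_save(value):
    """Naive scheme: JSON null is stored as the empty string, other leaves as-is."""
    return "" if value is None else value


def naive_load(leaf):
    """Inverse of the naive scheme: the empty string is read back as JSON null."""
    return None if leaf == "" else leaf


# null and the empty string end up as the same leaf value...
print(naive_save(None) == naive_save(""))  # True

# ...so an empty string cannot be restored: it comes back as null.
print(naive_load(naive_save("")))  # None
```

Because two different JSON values map to the same stored leaf, no loading function can tell them apart, which is exactly why the scheme has to carry type information.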
Scheme description
JSON null
[image: representation of JSON null]
JSON number (value)
[image: representation of a JSON number]
For example, for the value
2015
the representation would be: [image: representation of the number 2015]
JSON string (value)
[image: representation of a JSON string]
For example, for the value
"R&D"
the representation would be: [image: representation of the string "R&D"]
JSON true
[image: representation of JSON true]
JSON false
[image: representation of JSON false]
JSON array
[image: representation of a JSON array]
For example, for the array
[ 2015, "R&D" ]
the representation would be: [image: representation of the array]
JSON object
[image: representation of a JSON object]
For example, for the object
{ "year": 2015, "section": "R&D" }
the representation would be:
[image: representation of the object]
You could say that this scheme became the progenitor of the NSNJSON format.
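To make the idea of an unambiguous scheme concrete, here is a minimal sketch of one possible layout. It is an assumption for illustration only and does not reproduce the layout from the article's figures: a global is simulated by a Python dict that maps subscript tuples to leaf values, and the subscripts "t" (type), "v" (value), and "n" (array length) are hypothetical names. Leaves hold only strings and numbers.

```python
def save(glob, path, value):
    """Store a JSON value under the subscript tuple `path` (hypothetical layout)."""
    if value is None:
        glob[path + ("t",)] = "null"
    elif value is True or value is False:
        glob[path + ("t",)] = "true" if value else "false"
    elif isinstance(value, (int, float)):
        glob[path + ("t",)] = "number"
        glob[path + ("v",)] = value
    elif isinstance(value, str):
        glob[path + ("t",)] = "string"
        glob[path + ("v",)] = value
    elif isinstance(value, list):
        glob[path + ("t",)] = "array"
        glob[path + ("n",)] = len(value)  # length lets us restore empty arrays
        for index, item in enumerate(value):
            save(glob, path + ("v", index), item)
    elif isinstance(value, dict):
        glob[path + ("t",)] = "object"
        for key, item in value.items():
            save(glob, path + ("v", key), item)
    else:
        raise TypeError(f"not a JSON value: {value!r}")


def load(glob, path=()):
    """Restore the JSON value stored under `path`."""
    kind = glob[path + ("t",)]
    if kind == "null":
        return None
    if kind in ("true", "false"):
        return kind == "true"
    if kind in ("number", "string"):
        return glob[path + ("v",)]
    if kind == "array":
        return [load(glob, path + ("v", i)) for i in range(glob[path + ("n",)])]
    # Object: collect the distinct child subscripts directly below `path + ("v",)`.
    prefix = path + ("v",)
    keys = []
    for subscripts in glob:
        if len(subscripts) > len(prefix) and subscripts[:len(prefix)] == prefix:
            if subscripts[len(prefix)] not in keys:
                keys.append(subscripts[len(prefix)])
    return {key: load(glob, prefix + (key,)) for key in keys}


document = {"year": 2015, "section": "R&D", "tags": [None, "", True]}
simulated_global = {}
save(simulated_global, (), document)
restored = load(simulated_global)
```

Since every node records its type explicitly, null, the empty string, and true are all restored as distinct values, which is the unambiguity requirement stated earlier.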
With the JSON representation scheme ready, the next step on my path awaited me: I needed to develop a driver for this document-oriented NoSQL DBMS.
Task 2. Driver implementation
The implementation of the driver consisted of two stages:
- developing the stored code (inside Caché, in the Caché ObjectScript language),
- developing the code that calls the Caché driver and passes data to the stored code.
To transfer data between my driver and the Caché driver, I chose a simple format: a string. My driver received JSON, converted it to a string, and passed it to the Caché driver, which in turn passed this string to the stored code. The stored code parsed the string and then applied the rules for representing JSON data on globals.
However, a surprise awaited me!
During development and debugging, I discovered some interesting facts about the JSON parser I used in the Caché stored code:
- JSON null is translated to "",
- JSON true is translated to 1,
- JSON false is translated to 0,
- fields whose names start with _ are ignored.
Thus, I needed to solve the following problems:
- preserving type information,
- preserving fields whose names start with _.
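The loss of information caused by such a parser can be simulated in Python. This is only a simulation of the behaviour listed above, not the Caché parser itself, and the function name `lossy_parse` is hypothetical.

```python
import json


def lossy_parse(text):
    """Parse JSON text, then mimic the reported parser behaviour (a simulation)."""

    def convert(value):
        if value is None:
            return ""          # null becomes the empty string
        if value is True:
            return 1           # true becomes 1
        if value is False:
            return 0           # false becomes 0
        if isinstance(value, list):
            return [convert(item) for item in value]
        if isinstance(value, dict):
            # fields whose names start with "_" are dropped
            return {key: convert(item) for key, item in value.items()
                    if not key.startswith("_")}
        return value

    return convert(json.loads(text))


text = '{"_id": 213, "a": null, "b": "", "flag": true, "count": 1}'
print(lossy_parse(text))
# {'a': '', 'b': '', 'flag': 1, 'count': 1}
```

After parsing, "a" and "b" are indistinguishable, as are "flag" and "count", and "_id" is gone entirely. These are the two problems named above: lost type information and lost fields.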
The solution to these problems was the NSNJSON format that I developed.
So, for null values, the following representation was proposed:
{ "t": "null" }
For true values, the following representation was proposed:
{ "t": "boolean", "v": 1 }
For false values, the following representation was proposed:
{ "t": "boolean", "v": 0 }
The problem with field names starting with _ was solved by introducing the “n” field, which carries the field name as a value.
So, for the _id field with a value of 213, the representation would be:
{ "n": "_id", "t": "number", "v": 213 }
Thus, this format solved all the previously mentioned problems.
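The representations above can be turned into a small converter. The sketch below is not the author's driver: the function names `encode` and `decode` are hypothetical, and while the null, boolean, and named number forms follow the examples in this article, the forms used here for strings, arrays, and objects ("string" with "v", "array" and "object" with a list in "v") are assumptions made for illustration.

```python
def encode(value, name=None):
    """Convert a JSON value to an NSNJSON-style node (sketch, see assumptions)."""
    node = {} if name is None else {"n": name}
    if value is None:
        node["t"] = "null"
    elif isinstance(value, bool):          # must be checked before int
        node["t"] = "boolean"
        node["v"] = 1 if value else 0
    elif isinstance(value, (int, float)):
        node["t"] = "number"
        node["v"] = value
    elif isinstance(value, str):           # assumed form
        node["t"] = "string"
        node["v"] = value
    elif isinstance(value, list):          # assumed form
        node["t"] = "array"
        node["v"] = [encode(item) for item in value]
    elif isinstance(value, dict):          # assumed form: fields carry "n"
        node["t"] = "object"
        node["v"] = [encode(item, key) for key, item in value.items()]
    else:
        raise TypeError(f"not a JSON value: {value!r}")
    return node


def decode(node):
    """Restore the original JSON value from an NSNJSON-style node."""
    kind = node["t"]
    if kind == "null":
        return None
    if kind == "boolean":
        return node["v"] == 1
    if kind in ("number", "string"):
        return node["v"]
    if kind == "array":
        return [decode(item) for item in node["v"]]
    if kind == "object":
        return {field["n"]: decode(field) for field in node["v"]}
    raise ValueError(f"unknown type tag: {kind!r}")


document = {"_id": 213, "active": True, "note": None, "section": "R&D"}
restored = decode(encode(document))
```

Note that the encoded form contains no JSON null, true, or false literals and no field names starting with _, because names travel as values of "n". That is why it survives the parser behaviour described in this section.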
About the NSNJSON format
I decided to split the format out into a separate project and name it NSNJSON (Not So Normal JSON). Then I decided to share this format with the respected Habrahabr community in my article “Complicated Simplified JSON”, and to share what is, in my opinion, an interesting modification of the format, in which JSON data is represented using only numbers, strings, and arrays, in the article “JSON for Brackets Lovers”.
The NSNJSON project is published on GitHub.
Two drivers are implemented for it:
Conclusion
So the final article about NSNJSON has come to an end. I have talked about the difficulties I encountered and how I overcame them.
Lastly, I want to note once again that this was my 道 (path), and I walked it in exactly this way. At every step I could have gone differently, choosing a different solution to the problems that arose, but that would not have been my way...