edn: extensible data notation

In this article I want to talk about edn, a data format derived from Clojure. It is similar to JSON but provides some features that JSON lacks. These features are described below. Here is an example to start with:

{:name "edn"
 :implementations #{"clojure" "java" "ruby" "python" "c" "javascript" "haskell" "erlang"}
 :related "clojure"
 :encoding :UTF-8}


Origins

The history of edn resembles that of JSON: first a programming language appeared, and then a subset of it was extracted and put to use as a data format. For JSON the progenitor language is JavaScript; for edn it is Clojure.

As I said earlier, edn and JSON are very similar, and since JSON is currently the best-known, simplest and most popular data format, I will describe edn through its differences from JSON.

edn supports all the simple types present in JSON: strings, numbers and boolean values. There are also new ones:

nil

edn uses nil as its null value. JSON uses null.

characters

edn has a character type for individual characters. Character literals start with a backslash: \e, \d, \n. Unicode escapes can be used: \u2603. Special characters are written out in full: \newline, \return, \space, \tab.
In JSON, an individual character is usually represented as a string of length 1.
I call these characters rather than symbols because edn has a separate symbol type, which is described below.
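
To make the literal rules concrete, here is a minimal Python sketch of how a reader might decode a single character token. The function name parse_edn_char and the NAMED_CHARS table are hypothetical and not taken from any edn library. Note that \n decodes to the letter n, not to a line break, which is why \newline exists.

```python
# Hypothetical helper, not part of any real edn library.
NAMED_CHARS = {"newline": "\n", "return": "\r", "space": " ", "tab": "\t"}


def parse_edn_char(token: str) -> str:
    # Decode one edn character literal (a backslash followed by a body).
    if not token.startswith("\\") or len(token) < 2:
        raise ValueError(f"not a character literal: {token!r}")
    body = token[1:]
    if len(body) == 1:
        # A single character after the backslash stands for itself.
        return body
    if body in NAMED_CHARS:
        return NAMED_CHARS[body]
    if body[0] == "u" and len(body) == 5:
        # Unicode escape: "u" followed by four hexadecimal digits.
        return chr(int(body[1:], 16))
    raise ValueError(f"unknown character literal: {token!r}")


letter = parse_edn_char("\\n")        # the letter "n"
newline = parse_edn_char("\\newline")  # a real line break
snowman = parse_edn_char("\\u2603")    # the character U+2603
```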

keywords

It's hard for me to define precisely what keywords are. I would say they are a cross between enumerations and strings. They are convenient when there is a fixed, finite set of possible values; those values can be expressed as keywords. It is also customary to use keywords as map keys. Keywords start with a colon: :name, :lastname, :female, :green. Those who have worked with Ruby will recognize its symbols in them; similar types exist in other languages, for example Common Lisp.

An example of using keywords in a map, compared with the JSON version:

edn:
{:name "Jack" :lastname "Brown" :gender :male}

JSON:
{"name": "Jack",
 "lastname": "Brown",
 "gender": "male"}



numbers

edn distinguishes two kinds of numbers: integers and floating-point numbers. It also supports arbitrary-precision numbers, using the suffix N for integers and M for floating-point numbers:

[12345678891231231231232133N, 123.123123123123123213213M]
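
As a rough illustration of how the suffixes might map onto a host language, the Python sketch below reads a number token into int, float or decimal.Decimal. The function name parse_edn_number is hypothetical, and the sketch ignores details of the real grammar such as exponents combined with suffixes.

```python
from decimal import Decimal


def parse_edn_number(token: str):
    # Hypothetical helper: map an edn number token onto a Python number.
    if token.endswith("N"):
        # Arbitrary-precision integer; Python ints are already unbounded.
        return int(token[:-1])
    if token.endswith("M"):
        # Exact decimal: keep every digit instead of rounding to a float.
        return Decimal(token[:-1])
    try:
        return int(token)
    except ValueError:
        return float(token)


big = parse_edn_number("12345678891231231231232133N")
exact = parse_edn_number("123.123123123123123213213M")
```

A plain float could not hold all the digits of the second value, which is the reason for the M suffix.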


vectors

The edn counterpart of the JSON array is called a vector: a sequence of values that supports random access. Unlike in JSON, elements do not have to be separated by commas; the commas can be omitted:

[1 2 3 "Hello" "World"]


maps

In JSON, these are called objects. In edn, a key is not separated from its value by a colon (a colon there is an error). Commas can also be omitted. Keys can be of any type, for example numbers or keywords:

{:name "Jack" :lastname "Brown" :gender :male 42 54}


This defines a map with the keys :name, :lastname, :gender and 42, whose values are "Jack", "Brown", :male and 54 respectively.
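
To see why non-string keys matter, the Python sketch below models the same map as a dict and round-trips it through the standard json module. Representing keywords as strings with a leading colon is only an assumption made for illustration; a real edn library would likely use a dedicated keyword type.

```python
import json

# The map from the example; keywords are modeled as ":"-prefixed strings.
person = {":name": "Jack", ":lastname": "Brown", ":gender": ":male", 42: 54}

# JSON object keys must be strings, so the integer key 42 comes back as "42".
round_tripped = json.loads(json.dumps(person))

key_types_before = sorted(type(k).__name__ for k in person)
key_types_after = sorted(type(k).__name__ for k in round_tripped)
```

After the round trip every key is a string, so the distinction between the number 42 and the string "42" is lost. edn keeps it.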

sets

edn supports a set data type, written as #{val1 val2 val3}. The order of elements in a set is not significant, and parsers do not guarantee any particular order. In general, a parser should convert a set to the standard set type of the host language, for example HashSet for Java or PersistentHashSet for Clojure, and likewise for other languages; none of these types guarantees an order. As an example, here is a very useful map containing the seasons and three colors:

{:seasons #{:winter :spring :summer :autumn}
 :colors #{[255 0 0] [0 255 0] [0 0 255]}}
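
As a sketch of what a parser might hand back for this map in Python, the snippet below uses frozenset for the sets. It assumes keywords are modeled as strings with a leading colon and vectors as tuples (Python lists are not hashable, so they cannot be set members); both choices are illustrative, not the behavior of any particular library.

```python
# Hand-built equivalent of the parsed map, under the assumptions above.
seasons = frozenset({":winter", ":spring", ":summer", ":autumn"})
colors = frozenset({(255, 0, 0), (0, 255, 0), (0, 0, 255)})
data = {":seasons": seasons, ":colors": colors}

# The same elements written in a different order produce an equal set.
same_seasons = frozenset([":autumn", ":summer", ":spring", ":winter"])
order_is_irrelevant = seasons == same_seasons

# Membership is the natural operation on a set, not positional indexing.
has_green = (0, 255, 0) in data[":colors"]
```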


lists

edn supports lists in addition to vectors. A list differs from a vector in that it has no random access, although which concrete type a list is converted to is ultimately up to the parser. It is generally hard to think of a case where a list would be more convenient than a vector, so vectors are used in the vast majority of cases. List example:

(1 2 3 4 5)


symbols

In Clojure, symbols are used to name variables, i.e. they are similar to identifiers in mainstream programming languages: a, b, i, hello, persons. A multi-word symbol is usually hyphenated: prime-numbers, visited-nodes. Besides letters and digits, symbols may contain the following characters: . * + ! - _ ? $ % & =. It is hard for me to think of a use for symbols in edn when strings and keywords are available; that is up to your imagination. In Datomic they are used in queries, for example:

[:find ?x :where [?x :foo]]
Here ?x is a symbol.

tagged elements

edn can be extended with tags. A tag is an identifier that begins with # followed by a symbol: #inst, #uuid, #myapp/Person. What makes tagged elements special is that when the parser encounters a tag, it reads the tag and the element that follows it and passes that element to a dedicated handler, which must convert it to the desired type and return the result. Examples:

#myapp/Person {:first "Fred" :last "Mertz"}

Here a handler for the tag #myapp/Person must be registered with the parser; it accepts the map and converts it into an object of the class myapp.Person (if the language has classes) or something similar.

#inst "1985-04-12T23:20:50.52Z"

The handler receives the string in RFC-3339 format and converts it to the corresponding date.

#uuid "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"

The handler converts the UUID string to the corresponding object.

The last two tags are built in and should work out of the box. There is also a restriction: user-defined tags must always start with a namespace, as with #myapp/Person, where myapp is the namespace. Tags without a namespace (for example #inst and #uuid) are reserved for standard handlers.
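
The handler mechanism can be sketched in Python as a simple lookup table from tag name to function. Everything here is hypothetical: the names TAG_HANDLERS, read_tagged, read_inst and the Person class are invented for illustration, keywords are modeled as strings with a leading colon, and a real parser would call the handler itself after reading the tagged element. Raising an error for an unregistered tag is one possible policy, chosen here only to keep the sketch short.

```python
from dataclasses import dataclass
from datetime import datetime
from uuid import UUID


@dataclass
class Person:
    first: str
    last: str


def read_inst(value: str) -> datetime:
    # Assumes the timestamp has fractional seconds and a "Z" or numeric offset.
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")


# Hypothetical registry: tag name (without "#") -> handler function.
TAG_HANDLERS = {
    "inst": read_inst,
    "uuid": UUID,
    "myapp/Person": lambda m: Person(first=m[":first"], last=m[":last"]),
}


def read_tagged(tag: str, value):
    # Apply the handler registered for `tag` to the already-parsed element.
    try:
        handler = TAG_HANDLERS[tag]
    except KeyError:
        raise ValueError(f"no handler registered for tag #{tag}") from None
    return handler(value)


person = read_tagged("myapp/Person", {":first": "Fred", ":last": "Mertz"})
moment = read_tagged("inst", "1985-04-12T23:20:50.52Z")
ident = read_tagged("uuid", "f81d4fae-7dec-11d0-a765-00a0c91e6bf6")
```

Adding support for a new tag then amounts to registering one more entry in the table, which is the extensibility the format's name refers to.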

comments

Comments start with ;. Everything from the ; to the end of the line is ignored:

{
 :red [255 0 0] ; Red 255, green 0, blue 0
 :orange [255 127 0] ; Red 255, green 127, blue 0
}

A more complete example of edn

Here is an example: a list of all users who have visited a resource over the past couple of days. The example is contrived; its purpose is to demonstrate edn once more:

[{:name "Jack"
  :lastname "Brown"
  :roles #{:admin :operator}
  :last-visited #inst "2013-04-12T23:20:50.52Z"
  :id #uuid "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"}
 {:name "John"
  :lastname "Black"
  :roles #{:user}
  :last-visited #inst "2013-04-12T20:20:50.52Z"
  :id #uuid "b371b600-b175-11e2-9e96-0800200c9a66"}]


When to use

This is a subjective question. All of the features above can be implemented in JSON by introducing your own conventions, but that requires more complex logic for converting to and from JSON. In edn they come out of the box, which is very convenient. If you work with Clojure, edn is the natural choice. Or maybe you are tired of boring JSON and want a more flexible, customizable format, which is where tags help. Having a "standard" type for dates is also a nice feature. You could say that edn is JSON on steroids.

References

Official format description: github.com/edn-format/edn
Implementations for java, ruby, python, javascript, haskell and other languages: github.com/edn-format/edn/wiki/Implementations
Discussion on Hacker News: news.ycombinator.com/item?id=4487462 (mostly people arguing about "Why edn if there is JSON?")
