JSON pipes in shell

    The more one-liners I write in the shell, the more firmly I arrive at two conclusions:
    1. It is a very powerful tool for "direct programming", that is, for telling the computer what to do.
    2. Most of each one-liner is taken up by grep / awk / cut / tr, which pick fields out of the output of the preceding utilities and massage it into a readable form.

    Although the pipe model is wonderful, the dirty hacks from point 2 for fishing the needed fields out of the output (“and here we pick the one we need by its characteristic comma, using awk -F, '{print $2}'...”) make the process a dubious pleasure, and certainly an unreadable one.

    Another serious problem: although the shell offers quite a few idioms from functional programming, it has no idiom for filtering a list by the result of running an external program. That is, we can grep a list. But we cannot keep only those elements for which some program returned “success”.
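    To make the missing idiom concrete, here is a minimal Python sketch of "keep only the elements for which a program returned success". The function name filter_by_command is hypothetical and not part of any existing tool; it assumes that "success" means exit status 0 and that the element is passed as the last argument of the command.

```python
import subprocess


def filter_by_command(items, command):
    """Keep only the items for which `command + [item]` exits with status 0.

    `command` is a list of program arguments (hypothetical interface);
    the item is appended as the final argument.
    """
    kept = []
    for item in items:
        result = subprocess.run(
            command + [item],
            stdout=subprocess.DEVNULL,  # the program's output is irrelevant,
            stderr=subprocess.DEVNULL,  # only its exit status matters
        )
        if result.returncode == 0:
            kept.append(item)
    return kept
```

    Passing the command as an argument list rather than a shell string avoids quoting problems with file names that contain spaces.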

    Meanwhile, there is a hostile and not very well written environment: PowerShell (Windows). It took a good idea (pipes carry objects rather than text) and spoiled it in three ways:
    1. The unergonomic Windows console (where is Shift-PgUp, eh? They say it is Ctrl-PgUp in newer versions).
    2. The suggestion to go and learn .NET in order to work properly with methods.
    3. Its absence on most operating systems.


    I would like to have objects in a pipe in a warm, cozy Linux shell, with hand-candy (little typing), eye-candy (pleasant to look at) and overall ergonomics. I also want to be able to combine the “new approach” with the old one, that is, with ordinary text pipes.

    Idea


    We need to write a set of tools that let us operate on structured data in pipe style. The obvious choice of format is JSON rather than XML.
    We need:
    1. Utilities that accept standard input formats and convert them to json.
    2. Utilities that allow you to manipulate json in a pipe.
    3. Utilities that will convert json to the “normal” format.

    In this case, the person will not see json on the screen, but will be able to work with it.

    For starters


    (For clarity I use long utility names here; in real life they would be short abbreviations: not json-get-object but something like jgo or jg.)

    Prints only the files whose type the file utility was able to determine:
    ls -la | ls2json | json-filter 'filename' --exec 'file {} >/dev/null' | json-print

    Fetches an authorization token from some site, extracts it from the JSON and puts it into an environment variable, then downloads the list, filters it by matching the author field against a regexp, and downloads all the URLs (json-get-to-env is assumed here to print a shell assignment to X_AUTH_TOKEN, since shell variable names cannot contain hyphens):
    eval "$(curl mysite/api.json | json-get-to-env X-AUTH-TOKEN)"; curl -H "X-AUTH-TOKEN: $X_AUTH_TOKEN" mysite/api/list.json | json-filter --field 'author' --rmatch 'R.{1,2}dal\d*' | json-get --field 'url' | xargs wget

    Parses the find -ls output, sorts by the size field, slices out array elements 10 to 20, and prints them as CSV:
    find . -ls | ls2json | json-sort --field 'size' | json-slice '[10:20]' | json2csv

    Terminology


    inputs


    The main task is to make JSON candy out of messy output. Important: there must be an option for how to handle malformed input: a) ignore it, or b) stop the pipe with an error.

    Examples:
    Generic:
    • line2json - converts ordinary output into an array of strings, one string per input line.
    • words2json - the same, but split into "words".
    • csv2json - converts CSV into an object, letting you designate a given column as the key.
    • lineparse2json - converts each line into an object by splitting it on the specified characters. Reminiscent of awk -F: '{print $1, $2}'.
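    As an illustration of the generic inputs, here is a minimal Python sketch of line2json and lineparse2json, including the "ignore" / "stop" handling of malformed input described above. The parameter names (separator, field_names, on_error) are assumptions made for this sketch; the article does not define an interface.

```python
import json


def line2json(text):
    """Return an array of strings, one per input line."""
    return text.splitlines()


def lineparse2json(text, separator, field_names, on_error="stop"):
    """Split each line on `separator` and turn it into an object.

    on_error="ignore" silently skips lines with the wrong number of
    fields; on_error="stop" raises ValueError, which would abort the pipe.
    """
    records = []
    for number, line in enumerate(text.splitlines(), start=1):
        parts = line.split(separator)
        if len(parts) != len(field_names):
            if on_error == "ignore":
                continue
            raise ValueError(
                f"line {number}: expected {len(field_names)} fields, "
                f"got {len(parts)}"
            )
        records.append(dict(zip(field_names, parts)))
    return records


def to_json(value):
    """Serialize the result for the next utility in the pipe."""
    return json.dumps(value)
```

    All field values stay strings in this sketch; converting numbers would need a per-field type hint, which is left out here.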


    app-specific:
    • ls2json - either runs ls itself or takes ls output, and structures it as an array of objects, where each object is a file with a bunch of fields. Possibly even more fields than ls can show (regular and extended attributes from lsattr, all inode information, creation dates, etc.).
    • ps2json - the same for process lists.
    • lsof2json - a list of objects describing the applications that use a file.
    • openfiles2json - a list of the file descriptors an application has open (/proc/PID/fd), with built-in filtering such as "files only" or "ignore /dev/null". Objects for network sockets come with all the relevant information attached right away: ports and IP addresses.
    • iptables2json - prints the current iptables settings as JSON.
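    A rough Python sketch of the "run ls itself" variant of ls2json follows. It reads the directory through os.scandir instead of parsing ls output; the field names (filename, size, mode, and so on) are assumptions chosen for this example, not a fixed schema.

```python
import json
import os
import stat


def ls2json(directory="."):
    """Describe each directory entry as an object with a few stat fields."""
    entries = []
    for entry in sorted(os.scandir(directory), key=lambda e: e.name):
        info = entry.stat(follow_symlinks=False)
        entries.append({
            "filename": entry.name,
            "size": info.st_size,
            "mode": stat.filemode(info.st_mode),  # e.g. "-rw-r--r--"
            "uid": info.st_uid,
            "gid": info.st_gid,
            "inode": info.st_ino,
            "mtime": info.st_mtime,  # seconds since the epoch
        })
    return entries


def ls2json_text(directory="."):
    """Serialized form, ready to be written to the pipe."""
    return json.dumps(ls2json(directory))
```

    Extended attributes from lsattr and similar extras mentioned above are not covered by this sketch.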


    As someone suggested in a private message, mysql-json fits this idea perfectly. Run binaries on the output of an SQL query? Easily.

    File-specific:
    Read a file and output it as JSON.
    • syslog2json
    • ini2json
    • conf.d2json
    • sysv2json, upstart2json


    native json conversions


    The tastiest part is the native JSON manipulations. Like the inputs, they should offer options for handling input that is not JSON: “ignore” or “stop”.
    • json-filter - filters objects / arrays according to specified criteria.
    • json-join - merges two JSON documents into one by the specified method.
    • json-sort - sorts an array of objects by the specified field.
    • json-slice - cuts a slice out of an array.
    • json-arr-get - returns an element from an array.
    • json-obj-get - returns the given field or fields of the specified object.
    • json-obj-add - adds an object.
    • json-obj-del - deletes an object.
    • json-obj-to-arr - prints the keys, or a given field of the objects, as an array.
    • json-arr-to-obj - turns an array into an object, forming the key from a given attribute.
    • json-uniq - removes duplicate elements from an array (or prints only the duplicates).

    (add to taste and needs)
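    To show how small these manipulations can be, here is a minimal Python sketch of three of them: filtering by a regexp on a field, sorting by a field, and slicing. The function signatures are assumptions for illustration; in particular, --rmatch is interpreted here as "the regexp matches anywhere in the field", which the article does not specify.

```python
import re


def json_filter(records, field, pattern):
    """Keep the objects whose `field` matches the regexp `pattern`.

    Objects without the field are dropped (an assumption of this sketch).
    """
    regex = re.compile(pattern)
    return [
        record for record in records
        if field in record and regex.search(str(record[field]))
    ]


def json_sort(records, field):
    """Sort an array of objects by the specified field."""
    return sorted(records, key=lambda record: record[field])


def json_slice(records, start, stop):
    """Cut elements start..stop-1 out of an array, like Python's [start:stop]."""
    return records[start:stop]
```

    Because each function takes a list and returns a list, they chain in the same order as the stages of a pipe.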

    outputs


    Bring JSON into human-readable form:
    • json2fullpath - turns JSON into string notation of the form key1.key2[3].key4="foobar"
    • json2csv
    • json2lines - prints an array one element per line; if the elements are objects, their values are separated by spaces on that line.
    • json2keys - prints only the object keys
    • json2values - prints only the object values
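    The json2fullpath notation can be produced by a short recursive walk. The following Python sketch is one possible reading of the format shown in the list above; how empty containers and top-level scalars should be printed is not specified in the article, so the choices here are assumptions.

```python
import json


def json2fullpath(value, prefix=""):
    """Flatten nested JSON into lines such as key1.key2[3].key4="foobar".

    Empty objects and arrays produce no lines (an assumption of this sketch).
    """
    lines = []
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            lines.extend(json2fullpath(child, path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            lines.extend(json2fullpath(child, f"{prefix}[{index}]"))
    else:
        # Scalars are printed in JSON syntax, so strings keep their quotes.
        lines.append(f"{prefix}={json.dumps(value)}")
    return lines
```

    Each resulting line is self-describing, which makes the output convenient for the ordinary text tools further down the pipe.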


    iterators


    Essentially, xargs extended to JSON:
    • json-keys-iterate - runs the specified commands for each key
    • json-values-iterate - runs the specified commands for each value
    • json-iterate - run specified commands for each element
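    A minimal Python sketch of json-iterate as "xargs for JSON" might look like this. The '{}' placeholder is borrowed from the json-filter --exec example earlier in the article; the function name and the choice to return the exit codes are assumptions of this sketch.

```python
import json
import subprocess


def json_iterate(elements, command_template):
    """Run the command once per array element.

    Every argument equal to '{}' is replaced by the element; elements
    that are not strings are passed in their JSON form.
    Returns the list of exit codes, one per element.
    """
    exit_codes = []
    for element in elements:
        text = element if isinstance(element, str) else json.dumps(element)
        argv = [text if arg == "{}" else arg for arg in command_template]
        exit_codes.append(subprocess.run(argv).returncode)
    return exit_codes
```

    The same loop applied to the keys or to the values of an object would give json-keys-iterate and json-values-iterate.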


    Difficulties


    Of course, arbitrary JSON cannot be handled by such methods: it may turn out to be too "unstructured". But first, the input utilities make the JSON predictable, and second, processing JSON is still more predictable than processing "the elements are sort of separated by spaces" text in the existing shell.

    Implementation


    I would write it myself, but part of what is needed I do not know, and for part of it I lack the time. I am not a programmer. The secret hope behind this article is that “someone will write it for me”; if nobody turns up, it will at least serve as a manifesto that motivates me to finish learning and do it myself.

    If someone is ready to take this on, I will be extremely grateful. If not, I will dust off my mediocre Python, and ideas and suggestions are welcome.

    UPDATE: It seems people have started moving. Your commits are welcome here: github.com/amarao/json4shell . When it will be usable, I do not know yet. Whether I will have the stamina for it, I do not know either.
