Interconverting JSON, YAML, and XML

    JSON and YAML are popular today, while XML technologies are often considered a relic of the past.

    Let's try using this "retro technology" to work with data in JSON and YAML, and discuss the reasons for still using it today.

    The task is to move data transformation logic into the application configuration, preferably in a declarative style that is uniform across formats. The data can arrive in various text serialization formats: JSON, YAML, XML, Java properties, INI files. A Data Lake is too heavy an artillery for this. Loading the data into a document-oriented or object-relational database and then querying it there is also over-engineering for the first stage of an ETL transformation.

    JSONPath reproduces a subset of XPath, but only for JSON. Writing a declarative query without programming will not work with it either, because it has no analogue of XQuery. Alternatively, you could use an embedded database on the JVM with its own declarative query language, but that is a topic for a separate publication, and the original data model in JSON and YAML is not relational anyway.

    An approach to querying JSON / YAML data

    XQuery can be run against data in the Document Object Model. So how do you convert JSON / YAML data into a DOM object? You can use camel-xmljson or json2xml, but in these libraries the data source is JSON only. So I reinvented the wheel with my own library, dom-transformation. It takes a Map<String, Object> as input and turns it into an org.w3c.dom.Node, and the reverse conversion is available as well.

    It remains to turn JSON and YAML into a Map<String, Object>. This can be done, for example, with the com.fasterxml.jackson.databind.ObjectMapper class from Jackson.

    Turning JSON into a Map:

    // json holds the text of the JSON document
    ObjectMapper mapper = new ObjectMapper();
    Map<String, Object> objectTree = mapper.readValue(json, new TypeReference<Map<String, Object>>() {});

    Turning YAML into a Map:

    ObjectMapper mapper = new ObjectMapper(new YAMLFactory());
    Map<String, Object> objectTree = mapper.readValue(yaml, new TypeReference<Map<String, Object>>() {});
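
    The task statement above also lists Java properties among the input formats. As a rough sketch, and not something the original setup includes, flat dotted keys could be folded into the same Map<String, Object> shape with the JDK alone. The class name PropertiesToMapSketch and the rule of splitting keys on dots are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper: folds keys such as "db.host" into nested maps.
class PropertiesToMapSketch {

    @SuppressWarnings("unchecked")
    static Map<String, Object> toTree(Properties props) {
        Map<String, Object> root = new LinkedHashMap<>();
        // Properties is unordered, so iterate over sorted keys for a stable result.
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            String[] parts = key.split("\\.");
            Map<String, Object> node = root;
            // Walk (or create) one nested map per key segment except the last.
            for (int i = 0; i < parts.length - 1; i++) {
                Object child = node.computeIfAbsent(parts[i], k -> new LinkedHashMap<String, Object>());
                if (!(child instanceof Map)) {
                    throw new IllegalArgumentException(
                            "Key '" + key + "' clashes with a scalar value at '" + parts[i] + "'");
                }
                node = (Map<String, Object>) child;
            }
            // The last segment holds the scalar value; properties are always strings.
            node.put(parts[parts.length - 1], props.getProperty(key));
        }
        return root;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("db.host", "localhost");
        props.setProperty("db.port", "5432");
        props.setProperty("name", "app");
        System.out.println(toTree(props)); // {db={host=localhost, port=5432}, name=app}
    }
}
```

    The resulting map has the same shape as the one Jackson produces, so it could be passed to the same Map-to-DOM step. All values stay strings, because the properties format carries no type information.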

    Turn the Map into a Document Object Model by adding the library to the project:

    DomTransformer toDom = new DomTransformer(new TypeAutoDetect());
    // a DOM document has a single root, so a map with several keys is wrapped in "root"
    Node document = toDom.transform(objectTree.size() == 1 ? objectTree : Collections.singletonMap("root", objectTree));
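
    To make the idea of this step concrete, here is a minimal sketch of a Map-to-DOM conversion written against the JDK only. It is not the code of the library used above: the class MapToDomSketch and its mapping rules (keys become elements, list items become repeated elements, scalars become text) are assumptions chosen for illustration, and there is no type detection.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper for illustration; keys must be valid XML element names.
class MapToDomSketch {

    static Document toDocument(Map<String, Object> tree) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        // A DOM document needs exactly one root element, so anything other than
        // a single non-list entry is wrapped in a synthetic "root" element.
        boolean singleRoot = tree.size() == 1
                && !(tree.values().iterator().next() instanceof List);
        Map<String, Object> rooted = singleRoot
                ? tree
                : Collections.<String, Object>singletonMap("root", tree);
        Map.Entry<String, Object> root = rooted.entrySet().iterator().next();
        append(doc, doc, root.getKey(), root.getValue());
        return doc;
    }

    private static void append(Document doc, Node parent, String name, Object value) {
        if (value instanceof List) {
            // A list becomes repeated sibling elements with the same name.
            for (Object item : (List<?>) value) {
                append(doc, parent, name, item);
            }
            return;
        }
        Element element = doc.createElement(name);
        parent.appendChild(element);
        if (value instanceof Map) {
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                append(doc, element, String.valueOf(entry.getKey()), entry.getValue());
            }
        } else if (value != null) {
            // Scalars (strings, numbers, booleans) become text content.
            element.setTextContent(String.valueOf(value));
        }
    }

    public static void main(String[] args) {
        Map<String, Object> job = new LinkedHashMap<>();
        job.put("name", "build");
        job.put("tasks", Arrays.asList("compile", "test"));
        Document doc = toDocument(Collections.<String, Object>singletonMap("job", job));
        System.out.println(doc.getDocumentElement().getNodeName());        // job
        System.out.println(doc.getElementsByTagName("tasks").getLength()); // 2
    }
}
```

    Representing a list as repeated elements keeps XPath and XQuery expressions short, at the cost of losing the distinction between a one-item list and a scalar.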

    You can use any XQuery implementation to execute queries. I like BaseX because it is an open source project that is still actively developed. Add the org.basex:basex:jar:9.0 dependency to the project and run the declarative query:

    String yaml = IOUtils.toString(TranslateTest.class.getResource("/pipeline.yml").toURI(), StandardCharsets.UTF_8);
    ObjectMapper mapper = new ObjectMapper(new YAMLFactory());
    Map<String, Object> objectGraph = mapper.readValue(yaml, new TypeReference<Map<String, Object>>() {});
    Node document = new DomTransformer(new TypeAutoDetect()).transform(
                            objectGraph.size() == 1 ? objectGraph : Collections.singletonMap("root", objectGraph));
    try (QueryProcessor proc = new QueryProcessor("declare variable $extDataset external; " +
            " $extDataset//*[text()='git-repo']", new Context())) {
        proc.bind("extDataset", document); // pass the DOM in as the external variable
        Value queryResult = proc.value(); // execute the query
    }

    The results for the data from pipeline.yml
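
    For readers who want to try the same kind of lookup without adding BaseX, the path expression used in the query is also valid XPath 1.0, so the JDK's built-in javax.xml.xpath engine can evaluate it. The sketch below is a swapped-in technique (plain XPath, not XQuery). The class XPathQuerySketch, the variable name needle and the sample XML are made up for illustration and are not the contents of pipeline.yml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: finds elements by their text with the JDK XPath engine.
class XPathQuerySketch {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the names of all elements that have a text node equal to `text`.
    static List<String> elementsWithText(Document doc, String text) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind $needle through a resolver instead of concatenating the
            // value into the expression string.
            xpath.setXPathVariableResolver(
                    name -> "needle".equals(name.getLocalPart()) ? text : null);
            NodeList hits = (NodeList) xpath.evaluate(
                    "//*[text()=$needle]", doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                names.add(hits.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for a converted YAML document.
        Document doc = parse("<resources>"
                + "<resource><name>source</name><type>git-repo</type></resource>"
                + "<resource><name>image</name><type>docker-image</type></resource>"
                + "</resources>");
        System.out.println(elementsWithText(doc, "git-repo")); // [type]
    }
}
```

    XPath alone only selects nodes. Building new structures from the selected data in a declarative way is what XQuery adds, which is the reason for using a processor such as BaseX above.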

    If you need to convert DOM / XML back to JSON / YAML with Jackson, the transform(Node currentNode) method can help with this.
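
    As an illustration of what such a reverse conversion involves, here is a JDK-only sketch that turns a DOM tree back into nested maps and lists. It is not the library's transform(Node) method. The class DomToMapSketch is hypothetical, and the rules it applies (repeated sibling names become a list, attributes and mixed content are ignored, every scalar stays a string) are simplifying assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not the dom-transformation implementation.
class DomToMapSketch {

    static Map<String, Object> toMap(Document doc) {
        Element root = doc.getDocumentElement();
        Map<String, Object> result = new LinkedHashMap<>();
        result.put(root.getNodeName(), toValue(root));
        return result;
    }

    // An element with child elements becomes a map, otherwise its text is used.
    private static Object toValue(Element element) {
        Map<String, Object> children = new LinkedHashMap<>();
        boolean hasChildElements = false;
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace, comments and mixed-content text
            }
            hasChildElements = true;
            children.merge(n.getNodeName(), toValue((Element) n), DomToMapSketch::collect);
        }
        return hasChildElements ? children : element.getTextContent();
    }

    // Repeated sibling names are collected into one list under that name.
    @SuppressWarnings("unchecked")
    private static Object collect(Object existing, Object next) {
        if (existing instanceof List) {
            ((List<Object>) existing).add(next);
            return existing;
        }
        List<Object> list = new ArrayList<>();
        list.add(existing);
        list.add(next);
        return list;
    }

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<job><name>build</name><task>a</task><task>b</task></job>");
        System.out.println(toMap(doc)); // {job={name=build, task=[a, b]}}
    }
}
```

    A map produced this way can be handed to a Jackson ObjectMapper for serialization to JSON or YAML. A real converter also has to decide how to restore numbers and booleans, which is presumably the job of a type detection step like the TypeAutoDetect seen earlier.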


    XQuery can query more than just XML data. The language still handles this job well, and this "old man" will live on in Java data transformation projects, even for the JSON and YAML formats.

    Of course, semi-structured data is not limited to JSON, YAML and XML, so it is too early to consider the processing of everything in the world a closed topic ...

    I hope the approach described here helps you run declarative queries on heterogeneous data in your application. If you have faced a similar problem on the JVM and have better ideas, share them in the comments!

    On April 26, my colleague and I are holding an open Java meetup in the Moscow office. We will be happy to welcome guests!
    You can relax after a working day, learn something new, debate, grab some pizza and chat with the developers.

    The meeting will be held on April 26, 2018, 19:00-21:00, in the Moscow office of Aligh Technology at 9, Varshavskoye Sh. 4. Registration is via the link: Amazon Web Services and JVM for server side projects.

    The program includes three talks.

    Amazon Web Services emulation in a JVM process: speeding up development and testing, and saving money.

    Speaker: Igor Sukhorukov

    How to develop Big Data applications efficiently on Amazon Web Services infrastructure. We will try to make local development of a solution simple, and to make integration tests run fast and as cheaply as possible for us. We keep savings in mind while the head of Amazon, Jeff Bezos, launches rockets at Blue Origin and takes walks with the SpotMini robot from Boston Dynamics. In the talk I will describe how it proved possible to emulate the S3 filesystem, the Redshift data warehouse, SQS queues and the PostgreSQL RDS service in a JVM process using open source projects. The talk will also include a comparison of popular Big Data solutions for BI analytics.

    Code for production.

    Speaker: Yuri Geinish

    Simple but non-obvious tips for developers on building resilient and diagnosable server applications. The talk focuses on core principles, so it may be useful to developers working in different programming languages and on different platforms.

    Meldonium for Groovy: head in the "clouds" or diving into the "bloody enterprise"

    Speaker: Igor Sukhorukov

    In the "clouds", Groovy on steroids lets you do more, and more conveniently. We will talk about dynamic class loading from Maven artifacts and about how to run scripts conveniently in Amazon Web Services and Docker containers. We will also cover how scripts can survive in an isolated corporate environment. And of course we will have a holy war over why Groovy is needed when there is Kotlin.
