igor_suhorukov August 30, 2017 at 14:00

Source Ripper and AST Tree Spring Boot

Recently I ran into an unusual task: accessing code comments at runtime.

This kills three birds with one stone. Besides keeping documentation in the project code, it makes it easy to generate sequence diagrams from the project's tests that analysts can read, and QA can compare their test plans with the project's automated tests and extend them where needed. The team gets a common language shared by those who write the code and those who cannot read it. As a result, everyone understands the project better during development, and from the developer's point of view nothing has to be drawn by hand: the code and tests are the primary source. Such documentation is also likely to be the most up to date in the project, since it is generated from working code. At the same time, it disciplines developers to document the classes and methods that take part in the diagrams.

In this post I will show how to extract javadoc from a project's source code.

Of course, before writing my own code I remembered that the best code is code already written and tested by the community. I started looking for existing tools for working with javadoc at runtime and checking how convenient they would be for my task. The search led to the therapi-runtime-javadoc project. The project is alive and actively developed, and it lets you work at runtime with comments from class sources. The library works as an annotation processor at compile time, which is quite convenient. But one feature makes me wary of using it with real code that will later go to production: it modifies the class bytecode, adding meta information from the comments to it. It also requires recompiling the code and adding the @RetainJavadoc annotation, which will not work for project dependencies. A pity, since the solution seemed perfect at first glance.

It was also important for me to hear an outside opinion. I talked with a rather sharp developer who seemed to have dealt with this problem before, and he suggested parsing the HTML javadoc. This would work well, since the Maven Central repository holds javadoc archives for artifacts, but to me gutting generated documentation when the source code is available is not a very elegant solution. Then again, it is a matter of taste...

Extracting the documentation from the source code seems more appropriate to me. Besides, the AST exposes much more information than HTML documentation generated from the same source. I already had experience and groundwork for this approach, which I once described in the post “Parsing a Java program using a java program”.

Thus the extract-javadoc project was born. It is available as a ready-made build in Maven Central as com.github.igor-suhorukov:extract-javadoc:1.0.

The javadoc ripper “under the hood”

If we set aside the unremarkable parts of the program (working with the file system, saving javadoc as a JSON file, parallelizing the parsing, and handling the contents of jar and zip archives), the core of the project starts in the parseFile method of the com.github.igorsuhorukov.javadoc.ExtractJavadocModel class.

Initializing the ECJ parser of java files and extracting javadoc looks like this:

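// Parses one Java source file and returns the javadoc entries found in it.
public static List<JavaDoc> parseFile(String javaSourceText, String fileName, String relativePath) {
        // ASTParser resets its settings after every createAST call, so it is configured anew each time
        ASTParser parser = parserCache.get();
        parser.setSource(javaSourceText.toCharArray());
        // bindings are needed later to resolve binary type names
        parser.setResolveBindings(true);
        parser.setEnvironment(new String[]{}, SOURCE_PATH, SOURCE_ENCODING, true);
        parser.setKind(ASTParser.K_COMPILATION_UNIT);
        parser.setCompilerOptions(JavaCore.getOptions());
        parser.setUnitName(fileName);
        CompilationUnit cu = (CompilationUnit) parser.createAST(null);
        // the visitor walks the AST and collects javadoc for types and methods
        JavadocVisitor visitor = new JavadocVisitor(fileName, relativePath, javaSourceText);
        cu.accept(visitor);
        return visitor.getJavaDocs();
}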
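The parserCache field is not shown in the excerpt above. A minimal sketch of one way such a cache could work, assuming it is a ThreadLocal that hands each worker thread its own parser instance: StubParser below is a hypothetical stand-in for ASTParser, and ParserCacheSketch and parseAll are illustrative names, not part of extract-javadoc.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

/** Hypothetical stand-in for an expensive, non-thread-safe parser such as ASTParser. */
final class StubParser {
    /** Counts how many parser instances were created across all threads. */
    static final AtomicInteger CREATED = new AtomicInteger();

    StubParser() {
        CREATED.incrementAndGet();
    }

    /** Trivial "parsing": counts the lines of the source text. */
    int countLines(String source) {
        return source.split("\n", -1).length;
    }
}

final class ParserCacheSketch {
    // Assumption: one parser per worker thread, created lazily on first use.
    private static final ThreadLocal<StubParser> PARSER_CACHE =
            ThreadLocal.withInitial(StubParser::new);

    /** Processes sources in parallel; each thread reuses its own cached parser. */
    static List<Integer> parseAll(List<String> sources) {
        return sources.parallelStream()
                .map(src -> PARSER_CACHE.get().countLines(src))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [1, 2, 3]
        System.out.println(parseAll(Arrays.asList("class A {}", "class B {\n}", "a\nb\nc")));
    }
}
```

With this pattern the number of live parsers grows with the number of worker threads, which would fit the observation that parsing needs more memory on machines with more cores.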

The main work of parsing javadoc and mapping it to an internal model, which is later serialized to JSON, happens in JavadocVisitor:

package com.github.igorsuhorukov.javadoc.parser;

import com.github.igorsuhorukov.javadoc.model.*;
import com.github.igorsuhorukov.javadoc.model.Type;
import org.eclipse.jdt.core.dom.*;

import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.stream.Collectors;

public class JavadocVisitor extends ASTVisitor {
    private String file;
    private String relativePath;
    private String sourceText;
    private CompilationUnit compilationUnit;
    private String packageName;
    private List<Comment> commentList;
    private List<JavaDoc> javaDocs = new ArrayList<>();

    public JavadocVisitor(String file, String relativePath, String sourceText) {
        this.file = file;
        this.relativePath = relativePath;
        this.sourceText = sourceText;
    }

    // Collects javadoc for all documented types and methods of the compilation unit.
    @Override
    public boolean visit(PackageDeclaration node) {
        packageName = node.getName().getFullyQualifiedName();
        javaDocs.addAll(getTypes().stream().map(astTypeNode -> {
            JavaDoc javaDoc = getJavaDoc(astTypeNode);
            Type type = getType(astTypeNode);
            type.setUnitInfo(getUnitInfo());
            javaDoc.setSourcePoint(type);
            return javaDoc;
        }).collect(Collectors.toList()));
        javaDocs.addAll(getMethods().stream().map(astMethodNode -> {
            JavaDoc javaDoc = getJavaDoc(astMethodNode);
            Method method = new Method();
            method.setUnitInfo(getUnitInfo());
            method.setName(astMethodNode.getName().getFullyQualifiedName());
            method.setConstructor(astMethodNode.isConstructor());
            fillMethodDeclaration(astMethodNode, method);
            Type type = getType((AbstractTypeDeclaration) astMethodNode.getParent());
            method.setType(type);
            javaDoc.setSourcePoint(method);
            return javaDoc;
        }).collect(Collectors.toList()));
        return super.visit(node);
    }

    private CompilationUnitInfo getUnitInfo() {
        return new CompilationUnitInfo(packageName, relativePath, file);
    }

    // Copies parameter types and the return type (absent for constructors) into the model.
    @SuppressWarnings("unchecked")
    private void fillMethodDeclaration(MethodDeclaration methodAstNode, Method method) {
        List<SingleVariableDeclaration> parameters = methodAstNode.parameters();
        org.eclipse.jdt.core.dom.Type returnType2 = methodAstNode.getReturnType2();
        method.setParams(parameters.stream().map(param -> param.getType().toString()).collect(Collectors.toList()));
        if (returnType2 != null) {
            method.setReturnType(returnType2.toString());
        }
    }

    // Relies on bindings being resolved by the parser (setResolveBindings(true)).
    private Type getType(AbstractTypeDeclaration astNode) {
        String binaryName = astNode.resolveBinding().getBinaryName();
        Type type = new Type();
        type.setName(binaryName);
        return type;
    }

    @SuppressWarnings("unchecked")
    private JavaDoc getJavaDoc(BodyDeclaration astNode) {
        JavaDoc javaDoc = new JavaDoc();
        Javadoc javadoc = astNode.getJavadoc();
        List<TagElement> tags = javadoc.tags();
        // the tag element without a name holds the main description text
        Optional<TagElement> comment = tags.stream().filter(tag -> tag.getTagName() == null).findFirst();
        comment.ifPresent(tagElement -> javaDoc.setComment(tagElement.toString().replace("\n *", "").trim()));
        // named tags such as @param, @author, @since
        List<Tag> fragments = tags.stream().filter(tag -> tag.getTagName() != null).map(tag -> {
            Tag tagResult = new Tag();
            tagResult.setName(tag.getTagName());
            tagResult.setFragments(getTags(tag.fragments()));
            return tagResult;
        }).collect(Collectors.toList());
        javaDoc.setTags(fragments);
        return javaDoc;
    }

    private List<String> getTags(List<?> fragments) {
        return fragments.stream().map(Objects::toString).collect(Collectors.toList());
    }

    // Only doc comments attached to a declaration have a parent node.
    private List<AbstractTypeDeclaration> getTypes() {
        return commentList.stream().map(ASTNode::getParent).filter(Objects::nonNull).filter(AbstractTypeDeclaration.class::isInstance).map(astNode -> (AbstractTypeDeclaration) astNode).collect(Collectors.toList());
    }

    private List<MethodDeclaration> getMethods() {
        return commentList.stream().map(ASTNode::getParent).filter(Objects::nonNull).filter(MethodDeclaration.class::isInstance).map(astNode -> (MethodDeclaration) astNode).collect(Collectors.toList());
    }

    @Override
    @SuppressWarnings("unchecked")
    public boolean visit(CompilationUnit node) {
        commentList = node.getCommentList();
        this.compilationUnit = node;
        return super.visit(node);
    }

    public List<JavaDoc> getJavaDocs() {
        return javaDocs;
    }
}

The com.github.igorsuhorukov.javadoc.parser.JavadocVisitor#visit(PackageDeclaration) method currently handles javadoc only for types and their methods. That is the information I need to build sequence diagrams with comments.
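The description text is cleaned in getJavaDoc with replace("\n *", "") followed by trim(). A small sketch of what that cleanup does in isolation: the raw string below is an assumed example of how a multi-line doc comment might look after toString(), and CommentCleanupSketch is a hypothetical name used only for illustration.

```java
final class CommentCleanupSketch {

    /** Same string cleanup as in getJavaDoc: drop "\n *" line markers, then trim. */
    static String clean(String rawTagText) {
        return rawTagText.replace("\n *", "").trim();
    }

    public static void main(String[] args) {
        // Assumed shape of a two-line description as rendered by the AST node
        String raw = "\n * Ant task to find\n * a main class.";
        // prints: Ant task to find a main class.
        System.out.println(clean(raw));
    }
}
```

Because the space after each asterisk survives the replacement, words from adjacent comment lines stay separated by a single space.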

Working with the program's AST to extract documentation from the sources turned out to be less complicated than it seemed at first. I managed to build a more or less universal solution while resting during a break between jobs, coding for a couple of days in 3-4 hour sessions.

How to extract javadoc in a real project

For a Maven project, javadoc extraction can easily be added to all project modules by putting the following profile into the project's parent pom.xml:

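<profile>
    <id>extract-javadoc</id>
    <activation>
        <file>
            <exists>${basedir}/src/main/java</exists>
        </file>
    </activation>
    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>exec-maven-plugin</artifactId>
                <version>1.6.0</version>
                <executions>
                    <execution>
                        <id>extract-javadoc</id>
                        <phase>package</phase>
                        <goals>
                            <goal>java</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <!-- assumed element names for the two boolean flags -->
                    <includePluginDependencies>true</includePluginDependencies>
                    <includeProjectDependencies>true</includeProjectDependencies>
                    <mainClass>com.github.igorsuhorukov.javadoc.ExtractJavadocModel</mainClass>
                    <arguments>
                        <argument>${project.basedir}/src</argument>
                        <argument>${project.build.directory}/javadoc.json.xz</argument>
                    </arguments>
                </configuration>
                <dependencies>
                    <dependency>
                        <groupId>com.github.igor-suhorukov</groupId>
                        <artifactId>extract-javadoc</artifactId>
                        <version>1.0</version>
                        <type>jar</type>
                        <scope>compile</scope>
                    </dependency>
                </dependencies>
            </plugin>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>build-helper-maven-plugin</artifactId>
                <version>3.0.0</version>
                <executions>
                    <execution>
                        <id>attach-extracted-javadoc</id>
                        <phase>package</phase>
                        <goals>
                            <goal>attach-artifact</goal>
                        </goals>
                        <configuration>
                            <artifacts>
                                <artifact>
                                    <file>${project.build.directory}/javadoc.json.xz</file>
                                    <type>xz</type>
                                    <classifier>javadoc</classifier>
                                </artifact>
                            </artifacts>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</profile>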

This gives the project an additional artifact that contains the javadoc in JSON format, and it goes into the repository when install/deploy is run.

Integrating this solution into a Gradle build should not be a problem either, since it is an ordinary console application that takes two parameters: the path to the sources and the file to which javadoc is written in JSON format, or as compressed JSON if the path ends with ".xz".

Our guinea pig will be the Spring Boot project, as a fairly large project with excellent javadoc documentation.

Run the command:
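The two-parameter contract described above can be sketched as follows. This is a hypothetical illustration of the argument handling only, not the actual code of ExtractJavadocModel; the class name ExtractArgsSketch and its fields are assumptions.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical holder for the two command-line parameters of the extractor. */
final class ExtractArgsSketch {
    final Path sourceRoot;    // first parameter: path to the sources
    final Path outputFile;    // second parameter: where the javadoc JSON is written
    final boolean compressed; // true when the output path ends with ".xz"

    private ExtractArgsSketch(Path sourceRoot, Path outputFile, boolean compressed) {
        this.sourceRoot = sourceRoot;
        this.outputFile = outputFile;
        this.compressed = compressed;
    }

    /** Validates the argument count and derives the output format from the file suffix. */
    static ExtractArgsSketch parse(String... args) {
        if (args.length != 2) {
            throw new IllegalArgumentException("Usage: <source path> <output file>");
        }
        boolean xz = args[1].endsWith(".xz");
        return new ExtractArgsSketch(Paths.get(args[0]), Paths.get(args[1]), xz);
    }

    public static void main(String[] args) {
        ExtractArgsSketch parsed = parse("src", "build/javadoc.json.xz");
        System.out.println(parsed.outputFile + " compressed=" + parsed.compressed);
    }
}
```

A Gradle build would only need to run the real main class with these two arguments on a classpath that contains the extract-javadoc artifact.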

git clone https://github.com/spring-projects/spring-boot.git

Then add our profile tag to the spring-boot-parent/pom.xml file, inside the profiles tag:

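<profile>
    <id>extract-javadoc</id>
    <activation>
        <file>
            <exists>${basedir}/src/main/java</exists>
        </file>
    </activation>
    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>exec-maven-plugin</artifactId>
                <version>1.6.0</version>
                <executions>
                    <execution>
                        <id>extract-javadoc</id>
                        <phase>package</phase>
                        <goals>
                            <goal>java</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <!-- assumed element names for the two boolean flags -->
                    <includePluginDependencies>true</includePluginDependencies>
                    <includeProjectDependencies>true</includeProjectDependencies>
                    <mainClass>com.github.igorsuhorukov.javadoc.ExtractJavadocModel</mainClass>
                    <arguments>
                        <argument>${project.basedir}/src</argument>
                        <argument>${project.build.directory}/javadoc.json</argument>
                    </arguments>
                </configuration>
                <dependencies>
                    <dependency>
                        <groupId>com.github.igor-suhorukov</groupId>
                        <artifactId>extract-javadoc</artifactId>
                        <version>1.0</version>
                        <type>jar</type>
                        <scope>compile</scope>
                    </dependency>
                </dependencies>
            </plugin>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>build-helper-maven-plugin</artifactId>
                <version>3.0.0</version>
                <executions>
                    <execution>
                        <id>attach-extracted-javadoc</id>
                        <phase>package</phase>
                        <goals>
                            <goal>attach-artifact</goal>
                        </goals>
                        <configuration>
                            <artifacts>
                                <artifact>
                                    <file>${project.build.directory}/javadoc.json</file>
                                    <type>json</type>
                                    <classifier>javadoc</classifier>
                                </artifact>
                            </artifacts>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</profile>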

After that, build the project. During the build, an AST is constructed for every Java file in Spring Boot, and the javadoc of types and methods is extracted. A javadoc.json file appears in the target directory of each module that contains Java sources. Note that the more processor cores your system has, the more memory parsing requires, so you may need to increase the maximum heap size in the .mvn/jvm.config file.

As an example, here is the generated file spring-boot-tools/spring-boot-antlib/target/javadoc.json:

[ {
  "comment" : "Ant task to find a main class.",
  "tags" : [ {
    "name" : "@author",
    "fragments" : [ " Matt Benson" ]
  }, {
    "name" : "@since",
    "fragments" : [ " 1.3.0" ]
  } ],
  "sourcePoint" : {
    "@type" : "Type",
    "unitInfo" : {
      "packageName" : "org.springframework.boot.ant",
      "relativePath" : "main/java/org/springframework/boot/ant",
      "file" : "FindMainClass.java"
    },
    "name" : "org.springframework.boot.ant.FindMainClass"
  }
}, {
  "comment" : "Set the main class, which will cause the search to be bypassed.",
  "tags" : [ {
    "name" : "@param",
    "fragments" : [ "mainClass", " the main class name" ]
  } ],
  "sourcePoint" : {
    "@type" : "Method",
    "unitInfo" : {
      "packageName" : "org.springframework.boot.ant",
      "relativePath" : "main/java/org/springframework/boot/ant",
      "file" : "FindMainClass.java"
    },
    "type" : {
      "@type" : "Type",
      "unitInfo" : null,
      "name" : "org.springframework.boot.ant.FindMainClass"
    },
    "name" : "setMainClass",
    "constructor" : false,
    "params" : [ "String" ],
    "returnType" : "void"
  }
}, {
  "comment" : "Set the root location of classes to be searched.",
  "tags" : [ {
    "name" : "@param",
    "fragments" : [ "classesRoot", " the root location" ]
  } ],
  "sourcePoint" : {
    "@type" : "Method",
    "unitInfo" : {
      "packageName" : "org.springframework.boot.ant",
      "relativePath" : "main/java/org/springframework/boot/ant",
      "file" : "FindMainClass.java"
    },
    "type" : {
      "@type" : "Type",
      "unitInfo" : null,
      "name" : "org.springframework.boot.ant.FindMainClass"
    },
    "name" : "setClassesRoot",
    "constructor" : false,
    "params" : [ "File" ],
    "returnType" : "void"
  }
}, {
  "comment" : "Set the ANT property to set (if left unset, result will be printed to the log).",
  "tags" : [ {
    "name" : "@param",
    "fragments" : [ "property", " the ANT property to set" ]
  } ],
  "sourcePoint" : {
    "@type" : "Method",
    "unitInfo" : {
      "packageName" : "org.springframework.boot.ant",
      "relativePath" : "main/java/org/springframework/boot/ant",
      "file" : "FindMainClass.java"
    },
    "type" : {
      "@type" : "Type",
      "unitInfo" : null,
      "name" : "org.springframework.boot.ant.FindMainClass"
    },
    "name" : "setProperty",
    "constructor" : false,
    "params" : [ "String" ],
    "returnType" : "void"
  }
}, {
  "comment" : "Quiet task that establishes a reference to its loader.",
  "tags" : [ {
    "name" : "@author",
    "fragments" : [ " Matt Benson" ]
  }, {
    "name" : "@since",
    "fragments" : [ " 1.3.0" ]
  } ],
  "sourcePoint" : {
    "@type" : "Type",
    "unitInfo" : {
      "packageName" : "org.springframework.boot.ant",
      "relativePath" : "main/java/org/springframework/boot/ant",
      "file" : "ShareAntlibLoader.java"
    },
    "name" : "org.springframework.boot.ant.ShareAntlibLoader"
  }
} ]

Reading javadoc metadata at runtime

You can turn the javadoc model from JSON back into an object model and work with it in a program by calling com.github.igorsuhorukov.javadoc.ReadJavaDocModel#readJavaDoc. The method takes the path to the JSON file with javadoc (or to JSON compressed in .xz format).
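Once the model is loaded, a typical need is to look up the comment for a given type or method. The sketch below shows one possible index under stated assumptions: Entry is a hypothetical flattened record, not the library's JavaDoc model class, and the key format is invented for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class JavadocIndexSketch {

    /** Hypothetical flattened entry: one per documented type or method. */
    static final class Entry {
        final String typeName;
        final String methodName; // null for a type-level comment
        final String comment;

        Entry(String typeName, String methodName, String comment) {
            this.typeName = typeName;
            this.methodName = methodName;
            this.comment = comment;
        }
    }

    private final Map<String, String> commentsByKey = new HashMap<>();

    JavadocIndexSketch(List<Entry> entries) {
        for (Entry e : entries) {
            // Overloaded methods share a key here (last one wins);
            // a fuller key would also include the parameter types.
            commentsByKey.put(key(e.typeName, e.methodName), e.comment);
        }
    }

    private static String key(String typeName, String methodName) {
        return methodName == null ? typeName : typeName + "#" + methodName;
    }

    Optional<String> typeComment(String typeName) {
        return Optional.ofNullable(commentsByKey.get(key(typeName, null)));
    }

    Optional<String> methodComment(String typeName, String methodName) {
        return Optional.ofNullable(commentsByKey.get(key(typeName, methodName)));
    }

    public static void main(String[] args) {
        String type = "org.springframework.boot.ant.FindMainClass";
        JavadocIndexSketch index = new JavadocIndexSketch(Arrays.asList(
                new Entry(type, null, "Ant task to find a main class."),
                new Entry(type, "setProperty",
                        "Set the ANT property to set (if left unset, result will be printed to the log).")));
        System.out.println(index.typeComment(type).orElse("(no javadoc)"));
        System.out.println(index.methodComment(type, "setProperty").orElse("(no javadoc)"));
    }
}
```

The entries in main reuse values from the javadoc.json sample shown earlier; in a real program they would be derived from whatever object model readJavaDoc returns.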

I will describe how to work with the model in upcoming posts about generating sequence diagrams from tests.
