FindBugs vs CDK

    I am always interested in reading posts from PVS-Studio about how they look for bugs in some open source project. I decided that I, too, could write such a post, only about Java. There is an absolutely wonderful free static FindBugs Java code analyzer . Surprisingly little was written about him on Habré.

    In addition to the code analyzer, an experimental rabbit is required for such an article. A rather large project is needed, but not so widespread that developers lick code perfectly. I chose the Chemistry Development Kit project (version 1.4.19), which I used to use. I installed FindBugs as an Eclipse plugin, because I’m so used to it.

    FindBugs with default settings detected 338 problems. Of course, I will not describe them all, I will focus on interesting ones.

    1. Unsuccessful attempt to get a random positive number

    In the class org.openscience.cdk.math.RandomNumbersTool:
    public static int randomInt(int lo, int hi) {
        return (Math.abs(random.nextInt()) % (hi - lo + 1)) + lo;
    The Random.nextInt () method returns any number valid in int. The Math.abs method gets the module of the number. The problem is that Math.abs does not work for Integer.MIN_VALUE, because, as you know, the number opposite to this does not fit in int. However, Random.nextInt () may well produce this number (about once every 4 billion), then the whole method will fail.

    2. The result of BufferedReader.readLine () is not checked for null

    It is found many times, for example, in this form (
    private void skipUntilEmptyOrCommentLine(String line) throws IOException {
        // skip everything until empty line, or comment line
        while (line != null && line.length() > 0 && line.charAt(0) != '#') {
            line = input.readLine().trim();
    The input stream is not checked for readiness by the ready () method, and the result of readLine () is not checked for null. As a result, a malformed input file will throw a NullPointerException.

    3. Used & instead of &&

    A trivial error of this kind (org.openscience.cdk.atomtype.CDKAtomTypeMatcher):
    if (atom.getFormalCharge() != null &
        atom.getFormalCharge().intValue() == 1) {...}

    If the first check fails, the second will fail anyway and result in a NullPointerException.

    4. Comparison of String, Integer or Double objects by ==

    It occurs many times. For example, here (
    return (getAtomTypeName() == type.getAtomTypeName()) &&
            (maxBondOrder == type.maxBondOrder) &&
            (bondOrderSum == type.bondOrderSum);
    getAtomTypeName () returns a String, and bondOrderSum is of type Double. The application logic fully admits that here there will be different, but equal in equals objects and the comparison will work incorrectly.

    It is generally undesirable to use Integer, Double, and so on unless you have a good reason to use them.

    5. The programmer forgot that the lines are constant

    There are calls to methods of the String class that create a new string. For example (
    while (line != null) {
    	if (line.length() > 0) 
    	line = reader.readLine();
    Calling line.trim () is useless, since it does not change the line itself line, and no one uses the result. The author clearly had in mind line = line.trim (). Similarly, String.substring calls are encountered without saving the result.

    6. Comparison of objects of different types

    Comparisons are often made in the spirit of if( atom.equals("H") )atom type IAtom. It is understood, apparently if( atom.getSymbol().equals("H") ). In general, this is a mysterious place, since there are more than a dozen such errors, and, in my opinion, they should greatly influence semantics and distort the result.

    7. Using an uninitialized field

    public void setReactionMetadata(String metadata) {
        this.reactionInfo.add( metadata );
    FindBugs determines that the private reactionInfo field is not initialized in any other method, so this method will always throw a NullPointerException.

    8. Incorrect initialization of a static field

    For example, the class org.openscience.cdk.qsar.AtomValenceTool:
    public class AtomValenceTool {
        private static Map valencesTable = null;
        public static int getValence(IAtom atom) {
            if (valencesTable == null) {
                valencesTable = new HashMap();
                valencesTable.put("H", 1);
                valencesTable.put("He", 8);
                valencesTable.put("Ne", 8);
            return valencesTable.get(atom.getSymbol());
    When called from different threads, a race condition is possible when the valencesTable is no longer null, but not yet filled to the end. Then one thread will throw a NullPointerException for a perfectly valid atom.

    9. Violation of contracts

    The equals () method should return false if the argument is null. The clone () method should never return null. The clone () method should not call super.clone () in the final class, and not create the object manually (otherwise, if you inherit the class, clone () will break). Such things may not lead to errors, but still they should be avoided.

    10. Incorrect use of regular expression
    String path = file.getPath().replaceAll(File.separator, ".");
    The replaceAll method takes a regular expression as an argument. Under Windows, File.separator is a backslash that is specially interpreted in regular expressions, so this code will fall from PatternSyntaxException.

    11. Overridden method from parent constructor uses uninitialized variable

    An interesting situation in the class org.openscience.cdk.debug.DebugAtomContainer. There is a field declared
    ILoggingTool logger = LoggingToolFactory.createLoggingTool(DebugAtomContainer.class);

    and there is a method:
    public void addStereoElement(IStereoElement parity) {
        logger.debug("Adding stereo element: ", parity);
    The problem is that this method is called in one of the constructors of the base class when assigning a value to the logger variable has not yet worked. As a result, of course, a NullPointerException will occur.


    There were more mistakes, but let's stop. I want to note that CDK is a good library that generally works fine. And quite a few problems were found, not because the programmers are stupid. Normal programmers, everyone writes like that. It’s just that they, apparently, have not yet used static code analyzers. And you use it, it is useful!

    Also popular now: