PVS-Studio for Java is sent to the path. Next stop is Elasticsearch

    Picture 1

    Far from the first year, the PVS-Studio team has been blogging on checks of open-source projects by the static code analyzer of the same name. To date, more than 300 projects have been checked, and more than 12,000 cases have been written to the database of errors found. Initially, the analyzer was implemented to test C and C ++ code, then C # language support appeared. Therefore, among the tested projects, the majority (> 80%) falls on C and C ++. More recently, Java has been added to supported languages, which means that PVS-Studio opens the doors to a new world, and it is time to supplement the database with errors from Java projects.

    The Java world is huge and diverse, so my eyes widen when choosing a project to test a new analyzer. Ultimately, the choice fell on the full-text search engine and analytics Elasticsearch. This is a fairly successful project, and in successful projects, finding errors is doubly, or even triple, more pleasant. So what defects did PVS-Studio detect for Java? The result of the verification will be discussed in the article.

    Superficial acquaintance with Elasticsearch


    Elasticsearch is an open source scalable full-text search and analytics engine. It allows you to store large amounts of data, conduct among them a quick search and analytics (almost in real time). Typically, it is used as the underlying mechanism / technology that provides applications with complex functions and search requirements.

    Among the major sites using Elasticsearch, Wikimedia, StumbleUpon, Quora, Foursquare, SoundCloud, GitHub, Netflix, Amazon, IBM, Qbox are noted.

    I think with the acquaintance is enough.

    How was it


    There were no problems with the verification. The sequence of actions is quite simple and did not take much time:

    • Downloaded Elasticsearch from GitHub ;
    • I used the instructions for starting the Java analyzer and launched the analysis;
    • I received an analyzer report, analyzed it and highlighted interesting cases.

    Now let's get to the point.

    Caution! Possible NullPointerException


    V6008 Null dereference of 'line'. GoogleCloudStorageFixture.java (451)

    private static PathTrie defaultHandlers(....) {
      ....
      handlers.insert("POST /batch/storage/v1", (request) -> {
        ....
          // Reads the body
          line = reader.readLine();
          byte[] batchedBody = new byte[0];
          if ((line != null) || 
              (line.startsWith("--" + boundary) == false))       // <=
          {
            batchedBody = line.getBytes(StandardCharsets.UTF_8);
          }
        ....
      });
      ....
    }

    The error in this code fragment is that if they could not read the line from the buffer, then calling the startsWith method in the condition of the if statement will throw a NullPointerException . Most likely, this is a typo, and when writing the condition, the && operator was meant instead of || .

    V6008 Potential null dereference of 'followIndexMetadata'. TransportResumeFollowAction.java (171), TransportResumeFollowAction.java (170), TransportResumeFollowAction.java (194)

    void start(
               ResumeFollowAction.Request request,
               String clusterNameAlias,
               IndexMetaData leaderIndexMetadata,
               IndexMetaData followIndexMetadata,
               ....) throws IOException
    {
      MapperService mapperService = followIndexMetadata != null  // <=
                                    ? ....
                                    : null;
      validate(request, 
               leaderIndexMetadata,
               followIndexMetadata,                              // <=
               leaderIndexHistoryUUIDs,
               mapperService);
      ....
    }

    Another warning from the V6008 diagnostic. Now, a closer look has focused on the followIndexMetadata object . The start method takes several input arguments, including our suspect. Then, on the basis of checking our object for null , a new object is formed, which participates in the further logic of the method. Checking for null tells us that followIndexMetadata can still come from outside with a null object. Ok, look further.

    Next, the validation method is called with the pushing of many arguments (again, among which is the object in question). And if you look at the implementation of the validation method, then everything falls into place. Our potential null object is passed by the third argument to the validate method , where it is unconditionally dereferenced. As a result, a potential NullPointerException.

    static void validate(
          final ResumeFollowAction.Request request,
          final IndexMetaData leaderIndex,
          final IndexMetaData followIndex,                                   // <=
          ....) 
    {
      ....
      Map ccrIndexMetadata = followIndex.getCustomData(....); // <=
      if (ccrIndexMetadata == null) {
          throw new IllegalArgumentException(....);
      }
      ....
    }}

    It is not known with what arguments the start method is actually called. It is possible that checking all the arguments is done somewhere before the method call, and we do not face any dereferencing of the null object. But, you must admit that such a code implementation still looks rather unreliable and deserves attention.

    V6060 The 'node' reference was utilized before it was verified against null. RestTasksAction.java (152), RestTasksAction.java (151)

    private void buildRow(Table table, boolean fullId, 
                          boolean detailed, DiscoveryNodes discoveryNodes,
                          TaskInfo taskInfo) {
        ....
      DiscoveryNode node = discoveryNodes.get(nodeId);
      ....
      // Node information. Note that the node may be null because it has
      // left the cluster between when we got this response and now.
      table.addCell(fullId ? nodeId : Strings.substring(nodeId, 0, 4));
      table.addCell(node == null ? "-" : node.getHostAddress());
      table.addCell(node.getAddress().address().getPort());
      table.addCell(node == null ? "-" : node.getName());
      table.addCell(node == null ? "-" : node.getVersion().toString());
      ....
    }

    Another diagnostic rule worked here, but the problem is the same: NullPointerException . The rule says: “Guys, what are you doing? How so? Oh trouble! Why do you use the object first, and then in the next line of code check it for null ?! ” So it turns out here dereferencing a null object. Alas, even the comment of one of the developers did not help.

    V6060 The 'cause' reference was utilized before it was verified against null. StartupException.java (76), StartupException.java (73)

    private void printStackTrace(Consumer consumer) {
        Throwable originalCause = getCause();
        Throwable cause = originalCause;
        if (cause instanceof CreationException) {
            cause = getFirstGuiceCause((CreationException)cause);
        }
        String message = cause.toString();         // <=
        consumer.accept(message);
        if (cause != null) {                       // <=
            // walk to the root cause
            while (cause.getCause() != null) {
                cause = cause.getCause();
            }
            ....
        }
        ....
    }

    Here it is necessary to take into account that method getCause class Throwable can return null . Further, the problem considered above is repeated, and it makes no sense to explain something in detail.

    Meaningless conditions


    V6007 Expression 's.charAt (i)! =' \ T '' is always true. Cron.java (1223)

    private static int findNextWhiteSpace(int i, String s) {
      for (; i < s.length() && (s.charAt(i) != ' ' || s.charAt(i) != '\t'); i++)
      {
          // intentionally empty
      }
      return i;
    }

    The considered function returns the index of the first space, starting from index i . What's wrong? We have a warning from the analyzer that s.charAt (i)! = '\ T' is always true, which means that there will always be truth and the expression (s.charAt (i)! = '' || s.charAt (i)! = '\ t') . Is it so? I think you yourself can easily verify this by substituting any character.

    As a result, this method will always return an index equal to s.length () , which is not true. I dare to assume that this slightly higher located method is to blame:

    private static int skipWhiteSpace(int i, String s) {
      for (; i < s.length() && (s.charAt(i) == ' ' || s.charAt(i) == '\t'); i++)
      {
          // intentionally empty
      }
      return i;
    }

    We implemented this method, then copied it and made some minor corrections to get our erroneous findNextWhiteSpace method . The method was adjusted, adjusted, but not adjusted. To remedy the situation, you must use the && operator instead of || .

    V6007 Expression 'remaining == 0' is always false. PemUtils.java (439)

    private static byte[] 
    generateOpenSslKey(char[] password, byte[] salt, int keyLength) 
    {
      ....
      int copied = 0;
      int remaining;
      while (copied < keyLength) {
        remaining = keyLength - copied;
        ....
        copied += bytesToCopy;
        if (remaining == 0) {      // <=
            break;
        }
        ....
      }
      ....
    }

    From the condition of the cycle copied <keyLength, it can be noted that copied will always be less than keyLength . Hence, a comparison on the equality of the remaining variable with 0 is pointless and will always give a false result, and therefore the condition will not be exited from the loop. Is this code worth deleting, or is it still necessary to reconsider the logic of behavior? I think only developers will be able to dot all i.

    V6007 Expression 'healthCheckDn.indexOf (' = ')> 0' is always false. ActiveDirectorySessionFactory.java (73)

    
    ActiveDirectorySessionFactory(RealmConfig config,
                                  SSLService sslService,
                                  ThreadPool threadPool)
                                  throws LDAPException 
    {
      super(....,
            () -> {
              if (....) {
                  final String healthCheckDn = ....;
                  if (healthCheckDn.isEmpty() && 
                      healthCheckDn.indexOf('=') > 0) 
                  {
                      return healthCheckDn;
                  }
              }
              return ....;
            },
            ....);
      ....
    }

    Again a meaningless expression. According to the condition, for the lambda expression to return the string variable healthCheckDn , the string healthCheckDn must be both empty and the string containing the character '=' is not in the first position. Fuh, sort of sorted out. And as you correctly understood, this is impossible. We will not understand the logic of the code, leave it up to the developers.

    I gave only some erroneous examples, but in addition to this, there were plenty of cases where the V6007 diagnostics were triggered , which must be considered separately and conclusions drawn.

    The method is small, but udalenky


    private static byte char64(char x) {
      if ((int)x < 0 || (int)x > index_64.length)
        return -1;
      return index_64[(int)x];
    }

    So, we have a tiny method of several lines. But bugs do not sleep! An analysis of this method gave the following result:

    1. V6007 Expression '(int) x <0' is always false. BCrypt.java (429)
    2. V6025 Possibly index '(int) x' is out of bounds. BCrypt.java (431)

    Problem N1. The expression (int) x <0 is always false (Yes, yes, again V6007 ). The variable x cannot be negative, since it is of type char . The char type is an unsigned integer. This cannot be called a real mistake, but, nevertheless, the check is redundant and can be removed.

    Problem N2. Possible overflow of the array leading to an ArrayIndexOutOfBoundsException . Then the question begs, which lies on the surface: “Wait, what about checking the index?”

    So, we have a fixed-size array of 128 elements:

    private static final byte index_64[] = {
        -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1, -1, -1, 0, 1, 54, 55,
        56, 57, 58, 59, 60, 61, 62, 63, -1, -1,
        -1, -1, -1, -1, -1, 2, 3, 4, 5, 6,
        7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
        17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27,
        -1, -1, -1, -1, -1, -1, 28, 29, 30,
        31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
        41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
        51, 52, 53, -1, -1, -1, -1, -1
    };

    When the variable x enters the input of the char64 method , it checks for the validity of the index. Where is the gap? Why is the case of going out of the array still possible? Checking (int) x> index_64.length is not entirely correct. If the input method char64 come x with a value of 128, the check will not protect against ArrayIndexOutOfBoundsException . Perhaps this never happens in reality. However, the check is spelled incorrectly, and you need to replace the operator "greater than" (>) with "greater than or equal to" (> =).



    Comparisons that tried


    V6013 Numbers 'displaySize' and 'that.displaySize' are compared by reference. Possibly an equality comparison was intended. ColumnInfo.java (122)

    ....
    private final String table;
    private final String name;
    private final String esType;
    private final Integer displaySize;
    ....
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        ColumnInfo that = (ColumnInfo) o;
        return displaySize == that.displaySize &&    // <=
               Objects.equals(table, that.table) &&
               Objects.equals(name, that.name) &&
               Objects.equals(esType, that.esType);
    }

    The incorrectness here is that displaySize objects of type Integer are compared via the == operator , that is, they are compared by reference. It is quite possible that ColumnInfo objects will be compared , for which the displaySize fields have different links but the same content. And in this case, the comparison will give us a negative result, while we expected the truth.

    I would venture to suggest that such a comparison could be the result of failed refactoring, and initially the displaySize field was of type int .

    V6058 The 'equals' function compares objects of incompatible types: Integer, TimeValue. DatafeedUpdate.java (375)

    ....
    private final TimeValue queryDelay;
    private final TimeValue frequency;
    ....
    private final Integer scrollSize;
    ....
    boolean isNoop(DatafeedConfig datafeed)
    {
      return (frequency == null 
              || Objects.equals(frequency, datafeed.getFrequency()))
        && (queryDelay == null 
            || Objects.equals(queryDelay, datafeed.getQueryDelay()))
        && (scrollSize == null
            || Objects.equals(scrollSize, datafeed.getQueryDelay())) // <=
        && ....)
    }

    And again incorrect comparison of objects. Now compare objects whose types are incompatible ( Integer and TimeValue ). The result of this comparison is obvious, and it is always false. It can be seen that the fields of the class are compared in the same way with each other, it is only necessary to change the field names. So, the developer decided to speed up the process of writing code with copy-paste, but he awarded himself the same bug. The class implements a getter for the scrollSize field , so to correct the error, the correct solution would be to use the appropriate method: datafeed .getScrollSize () .

    Let's look at a couple more examples of errors without any explanation. The problem is already obvious.

    V6001There are identical sub-expressions 'tookInMillis' to the left and to the right of the '==' operator. TermVectorsResponse.java (152)

    @Override
    public boolean equals(Object obj) {
      ....
      return index.equals(other.index)
          && type.equals(other.type)
          && Objects.equals(id, other.id)
          && docVersion == other.docVersion
          && found == other.found
          && tookInMillis == tookInMillis                        // <=
          && Objects.equals(termVectorList, other.termVectorList);
    }

    V6009 Function 'equals' receives an odd argument. An object 'shardId.getIndexName ()' is used as an argument to its own method. SnapshotShardFailure.java (208)

    @Override
    public boolean equals(Object o) {
      ....
      return shardId.id() == that.shardId.id() &&
          shardId.getIndexName().equals(shardId.getIndexName()) &&   // <=
          Objects.equals(reason, that.reason) &&
          Objects.equals(nodeId, that.nodeId) &&
          status.getStatus() == that.status.getStatus();
    }

    Miscellaneous


    V6006 The object was created but it is not being used. The 'throw' keyword could be missing. JdbcConnection.java (88)

    @Override
    public void setAutoCommit(boolean autoCommit) throws SQLException {
        checkOpen();
        if (!autoCommit) {
            new SQLFeatureNotSupportedException(....);
        }
    }

    The bug is obvious and requires no explanation. The developer threw an exception, but in no way throws it further. Such an anonymous exception will be successfully created, and also successfully and, most importantly, destroyed without a trace. The reason is the lack of a throw statement .

    V6003 The use of 'if (A) {....} else if (A) {....}' pattern was detected. There is a probability of logical error presence. MockScriptEngine.java (94), MockScriptEngine.java (105)

    @Override
    public  T compile(....) {
      ....
      if (context.instanceClazz.equals(FieldScript.class)) {
        ....
      } else if (context.instanceClazz.equals(FieldScript.class)) {
        ....
      } else if(context.instanceClazz.equals(TermsSetQueryScript.class)) {
        ....
      } else if (context.instanceClazz.equals(NumberSortScript.class)) 
      ....
    }

    In a multiple if-else construct, one of the conditions is repeated twice, so the situation requires a competent review of the code.

    V6039 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless. SearchAfterBuilder.java (94), SearchAfterBuilder.java (93)

    public SearchAfterBuilder setSortValues(Object[] values) {
      ....
      for (int i = 0; i < values.length; i++) {
          if (values[i] == null) continue;
          if (values[i] instanceof String) continue;
          if (values[i] instanceof Text) continue;
          if (values[i] instanceof Long) continue;
          if (values[i] instanceof Integer) continue;
          if (values[i] instanceof Short) continue;
          if (values[i] instanceof Byte) continue;
          if (values[i] instanceof Double) continue;
          if (values[i] instanceof Float) continue;
          if (values[i] instanceof Boolean) continue; // <=
          if (values[i] instanceof Boolean) continue; // <=
          throw new IllegalArgumentException(....);
      }
      ....
    }

    The same condition is used twice in a row. The second condition is superfluous, or is it necessary to use a different type instead of Boolean ?

    V6009 Function 'substring' receives an odd arguments. The 'queryStringIndex + 1' argument should not be greater than 'queryStringLength'. LoggingAuditTrail.java (660)

    LogEntryBuilder withRestUriAndMethod(RestRequest request) {
      final int queryStringIndex = request.uri().indexOf('?');
      int queryStringLength = request.uri().indexOf('#');
      if (queryStringLength < 0) {
          queryStringLength = request.uri().length();
      }
      if (queryStringIndex < 0) {
          logEntry.with(....);
      } else {
          logEntry.with(....);
      }
      if (queryStringIndex > -1) {
          logEntry.with(....,
                        request.uri().substring(queryStringIndex + 1,// <=
                                                queryStringLength)); // <=
      }
      ....
    }

    Immediately consider an error scenario that could throw a StringIndexOutOfBoundsException . An exception will occur when request.uri () returns a string that contains the character '#' earlier than '?'. For such a case, there are no checks in the method, and if this still happens, then disaster will not be avoided. Perhaps this will never happen due to various checks of the request object outside the method, but in my opinion, hoping for this is not the best idea.

    Conclusion


    Over the years, PVS-Studio has helped to find flaws in the code of commercial and free open-source projects. More recently, Java has been added to support for parsed languages. And one of the first tests for our newcomer was the actively developing Elasticsearch. We hope that this check will be useful for the project and interesting for readers.

    For PVS-Studio for Java to quickly adapt to a new world for itself, we need new tests, new users, active feedback and customers :). So, I suggest, without delay, download and test our analyzer on your working draft!



    If you want to share this article with an English-speaking audience, then please use the link to the translation: Maxim Stefanov. PVS-Studio for Java hits the road. Next stop is Elasticsearch

    Also popular now: