GNU Arm Embedded Toolchain added in PVS-Studio

Embedded systems have long been firmly established in our lives. The requirements for their stability and reliability are very high, and error correction is expensive. Therefore, for embedded developers it is especially important to regularly use specialized tools to ensure the quality of the source code. This article tells about the appearance of support for GNU Arm Embedded Toolchain in the PVS-Studio analyzer and code defects found in the Mbed OS project.

Introduction

The PVS-Studio analyzer already supports several commercial compilers for embedded systems, for example:

Now another developer tool is added to the support - the GNU Embedded Toolchain.

GNU Embedded Toolchain is a collection of compilers from Arm, based on the GNU Compiler Collection. The first official release took place in 2012, and since then the project has been developing along with the GCC.

The main purpose of GNU Embedded Toolchain is to generate code that runs on bare metal (bare metal), that is, directly on a processor without an interlayer in the form of an operating system. The package includes compilers for C and C ++, assembler, the GNU Binutils toolkit, and the Newlib library . The source code of all components is fully open and distributed under the GNU GPL license. From the official site you can download versions for Windows, Linux and macOS.

Mbed os

To test the analyzer, you need as much source code as possible. Usually there are no problems with this, but when we deal with embedded development aimed primarily at IoT devices, it can be difficult to find a sufficient number of large projects. Fortunately, this problem was solved by using specialized operating systems, the source code of which is in most cases open. Then we will discuss one of them.

Although the main purpose of the article is to talk about support for the GNU Embedded Toolchain, it’s hard to write a lot about it. Moreover, the readers of our articles are waiting for the description of some interesting mistakes. Well, let's not deceive their expectations and run the analyzer on the Mbed OS project. This is an open source operating system that is being developed with Arm.

Official site: https://www.mbed.com/

Source code: https://github.com/ARMmbed/mbed-os

The choice on Mbed OS was not accidental, this is how the authors describe the project:

It is an open source operating system for the Internet of Things. It includes a microcontroller, including security, connectivity, an I / O devices.

This is an ideal build project using the GNU Embedded Toolchain, especially considering Arm’s participation in its development. I’ll just say that I didn’t have the goal of finding and showing as many errors as possible in a particular project, so the test results are briefly reviewed.

Errors

During the verification of the Mbed OS code, the PVS-Studio analyzer issued 693 warnings, 86 of them with high priority. I will not consider them all in detail, especially since many of them are repeated or do not represent much interest. For example, the analyzer issued many warnings V547 (Expression is always true / false) related to the same type of code fragments. The analyzer can be configured to significantly reduce the number of false and uninteresting positives, but there was no such task when writing an article. Those interested can see an example of such a configuration described in the article " Characteristics of the PVS-Studio analyzer on the example of EFL Core Libraries, 10-15% of false positives ."

For the article, I selected several interesting errors to demonstrate the operation of the analyzer.

Memory leaks

Let's start with a common class of errors in C and C ++ - memory leaks.

Analyzer Warning: V773 CWE-401 The function was exited without reading the 'read_buf' pointer. A memory leak is possible. cfstore_test.c 565

int32_t cfstore_test_init_1(void)
{
   ....
  read_buf = (char*) malloc(max_len);
  if(read_buf == NULL) {
    CFSTORE_ERRLOG(....);
    return ret;
  }
  ....
  while(node->key_name != NULL)
  {
    ....
    ret = drv->Create(....);
    if(ret < ARM_DRIVER_OK){
      CFSTORE_ERRLOG(....);
      return ret;              // <=
    }
  ....
  free(read_buf);
  return ret;
}

The classic situation when working with dynamic memory. The buffer allocated with malloc is used only inside the function and is released before exiting. The problem is that this does not happen if the function stops working ahead of time. Notice the same code in the if blocks . Most likely, the author copied the top fragment and forgot to add the free call .

Another example, similar to the previous one.

Analyzer Warning: V773 CWE-401 The function of the interface. A memory leak is possible. nanostackemacinterface.cpp 204

nsapi_error_t Nanostack::add_ethernet_interface(
    EMAC &emac,
    bool default_if,
    Nanostack::EthernetInterface **interface_out,
    constuint8_t *mac_addr)
{
  ....
  Nanostack::EthernetInterface *interface;
  interface = new (nothrow) Nanostack::EthernetInterface(*single_phy);
  if (!interface) {
    return NSAPI_ERROR_NO_MEMORY;
  }
  nsapi_error_t err = interface->initialize();
  if (err) {
    return err;              // <=
  }
  *interface_out = interface;
  return NSAPI_ERROR_OK;
}

A pointer to the allocated memory is returned via the output parameter, but only if the call to initialize is successful, and if an error occurs, a leak occurs because the local interface variable goes out of scope, and the pointer is simply lost. Here, you should either call delete , or at least give the address stored in the interface variable in any case, so that the calling code can take care of this.

Memset

Using the memset function often leads to errors, examples of related problems can be found in the article “ The most dangerous function in the C / C ++ world ”.

Consider the following analyzer warning:

V575 CWE-628 The 'memset' function processes '0' elements. Inspect the third argument. mbed_error.c 282

mbed_error_status_t mbed_clear_all_errors(void)
{
    ....
    //Clear the error and context capturing buffermemset(&last_error_ctx, sizeof(mbed_error_ctx), 0);
    //reset error count to 0
    error_count = 0;
    ....
}

The programmer intended to reset the memory occupied by the last_error_ctx structure , but mixed up the second and third arguments. As a result, 0 bytes is filled with the sizeof (mbed_error_ctx) value .

Exactly the same error is present a hundred lines above:

V575 CWE-628 The 'memset' function processes '0' elements. Inspect the third argument. mbed_error.c 123

Unconditional 'return' statement in a loop

Analyzer Warning: V612 CWE-670 An unconditional 'return' within a loop. thread_network_data_storage.c 2348

boolthread_nd_service_anycast_address_mapping_from_network_data(
          thread_network_data_cache_entry_t *networkDataList,
          uint16_t *rlocAddress,
          uint8_t S_id){
  ns_list_foreach(thread_network_data_service_cache_entry_t,
                  curService, &networkDataList->service_list) {
    // Go through all servicesif (curService->S_id != S_id) {
      continue;
    }
    ns_list_foreach(thread_network_data_service_server_entry_t,
                    curServiceServer, &curService->server_list) {
      *rlocAddress = curServiceServer->router_id;
      returntrue;                     // <=
    }
  }
  returnfalse;
}

In this snippet, ns_list_foreach is a macro that expands into a for statement . The inner loop performs no more than one iteration because of the call to return immediately after the line in which the output parameter of the function is initialized. Perhaps this code works as intended, but using the inner loop looks rather strange in this context. Most likely, the initialization of rlocAddress and exit from the function should be performed according to the condition, or you can get rid of the internal loop.

Errors in conditions

As I said above, the analyzer issued a fairly large number of uninteresting warnings from the V547 , so I studied them briefly and wrote down only two cases for the article.

V547 CWE-570 Expression 'pcb-> state == LISTEN' is always false. lwip_tcp.c 689

enum tcp_state {
  CLOSED      = 0,
  LISTEN      = 1,
  ....
};
struct tcp_pcb *
tcp_listen_with_backlog_and_err(struct tcp_pcb *pcb, u8_t backlog, err_t *err){
  ....
  LWIP_ERROR("tcp_listen: pcb already connected",
             pcb->state == CLOSED,
             res = ERR_CLSD; goto done);
  /* already listening? */if (pcb->state == LISTEN) {               // <=
    lpcb = (struct tcp_pcb_listen*)pcb;
    res = ERR_ALREADY;
    goto done;
  }
  ....
}

The analyzer considers that the condition pcb-> state == LISTEN is always false, let's see why.

Before the if statement , the macro LWIP_ERROR is used , which, according to the logic of its work, resembles assert . His ad looks like this:

#define LWIP_ERROR(message, expression, handler) do { if (!(expression)) { \
  LWIP_PLATFORM_ERROR(message); handler;}} while(0)

If the condition is false, the macro reports an error and executes the code passed through the handler parameter , in this code fragment it is an unconditional jump using goto .

In this example, the condition 'pcb-> state == CLOSED' is checked, that is, a transition to the done label occurs when pcb-> state has any other value. The if statement that follows the LWIP_ERROR call checks pcb-> state for LISTEN equality , but this condition is never met because the state in this line can only contain the CLOSED value .

Consider another warning related to the conditions: V517 CWE-570 The use of if (A) {...} else if (A) {...} 'pattern was detected. There is a possibility of logical error presence. Check lines: 62, 65. libdhcpv6_server.c 62

staticvoidlibdhcpv6_address_generate(....){
  ....
  if (entry->linkType == DHCPV6_DUID_HARDWARE_EUI64_TYPE) // <=
  {
    memcpy(ptr, entry->linkId, 8);
   *ptr ^= 2;
  }
  elseif (entry->linkType == DHCPV6_DUID_HARDWARE_EUI64_TYPE)// <=
  {
    *ptr++  = entry->linkId[0] ^ 2;
    *ptr++  = entry->linkId[1];
  ....
  }
}

Here, if and else if check the same condition, as a result of which the code in the body of the else if is never executed. Such errors often occur when writing code using the ' copy-paste ' method .

Ownerless expression

Finally, let's look at a funny code snippet.

Analyzer warning: V607 Ownerless expression '& discover_response_tlv'. thread_discovery.c 562

staticintthread_discovery_response_send(
                        thread_discovery_class_t *class,
                        thread_discovery_response_msg_t *msg_buffers){
  ....
  thread_extension_discover_response_tlv_write(
             &discover_response_tlv, class->version,
             linkConfiguration->securityPolicy);
  ....
}

And now let's take a look at the thread_extension_discover_response_tlv_write macro declaration :

#define thread_extension_discover_response_tlv_write \
( data, version, extension_bit)\
(data)

The macro is expanded into the data argument, that is, its call inside the thread_discovery_response_send function after preprocessing becomes an expression (& discover_response_tlv) .

I have no comments. Probably, this is not a mistake, but such a code always introduces me to a state similar to the image in the picture :).

Conclusion

The list of compilers supported in PVS-Studio has increased. If you have a project that is designed to be built using the GNU Arm Embedded Toolchain, I suggest you try checking it out using our analyzer. Download the demo version here . Also pay attention to the free license option , which is suitable for some smaller projects.

If you want to share this article with an English-speaking audience, then please use the link to the translation: Yuri Minaev. PVS-Studio Now Supports GNU Arm Embedded Toolchain .

Tags:

﻿GNU Arm Embedded Toolchain added in PVS-Studio