Making the code cleaner: What can be fixed in the Linux kernel

    Surely many would like to try to change something in the Linux kernel for the better, but do not know where to start. I want to describe several problems that everyone can fix, and use an example to show the way from finding a problem to publishing its fix on the mailing list. In the course of the story, the reader will get acquainted with some auxiliary utilities.

    From year to year, the same trivial problems wander from driver to driver, often associated with ignorance of some standard practices, utilities or extensions that exist in the Linux kernel.

    Here is a short list of these kinds of problems:



    Typos and clerical errors



    Typos and typos in the documentation and comments are not uncommon. One person, namely Lucas De Marchi, developed a special codespell utility to catch such typos.

    The following example will tell everything about himself.

    Clone the kernel:
    mkdir ~/devel
    cd ~/devel
    git clone git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
    cd ~/devel/linux-next
    

    Please note that we will work with the freshest, that is, with the linux-next tree.

    We make a test run codespell:
    $ codespell.py drivers/staging/unisys
    drivers/staging/unisys/include/guestlinuxdebug.h:138: doesnt  ==> doesn't
    

    Now fix:
    $ codespell.py -w -i 3 drivers/staging/unisys
                                                             * doesnt show, so we
            doesnt  ==> doesn't (Y/n) y
    FIXED: drivers/staging/unisys/include/guestlinuxdebug.h
    

    We look at what happened:
    --- a/drivers/staging/unisys/include/guestlinuxdebug.h
    +++ b/drivers/staging/unisys/include/guestlinuxdebug.h
    @@ -135,7 +135,7 @@ enum event_pc {                     /* POSTCODE event identifier tuples */
     #define POSTCODE_SEVERITY_ERR DIAG_SEVERITY_ERR
     #define POSTCODE_SEVERITY_WARNING DIAG_SEVERITY_WARNING
     #define POSTCODE_SEVERITY_INFO DIAG_SEVERITY_PRINT     /* TODO-> Info currently
    -                                                        * doesnt show, so we
    +                                                        * doesn't show, so we
                                                             * set info=warning */
     /* example call of POSTCODE_LINUX_2(VISOR_CHIPSET_PC, POSTCODE_SEVERITY_ERR);
      * Please also note that the resulting postcode is in hex, so if you are
    



    Custom implementation of output



    Not so much a problem as an improvement in code readability and microoptimization: often a significant reduction in the amount of memory used on the stack when a function is called, a decrease in the number of passed specifiers in vsnprintf ().

    Earlier, I described the special extensions of the% p specifier in the kernel; now it is the turn to apply the acquired knowledge.


    For simplicity, take the
    % 02x [-:]% 02x [-:]% 02x pattern . It allows you to find the transmission of several bytes through the stack, which can be replaced by the extension % * ph [CDN] .

    Let's look in the code:
    $ git grep -n -i -e '%02x[-: ]%02x[-: ]%02x' drivers/staging/unisys
    drivers/staging/unisys/virtpci/virtpci.c:1313:                                  "[%d:%d] VNic:%02x:%02x:%02x:%02x:%02x:%02x num_rcv_bufs:%d mtu:%d",
    


    What we will do next, I will describe in the example below. In the meantime, we move on to the following problem areas.


    Custom implementation of algorithms



    Here, take, for example, drivers / staging / fbtft / fbtft-bus.c, lines 99-100:
    for (i = 0; i < pad; i++)
      *buf++ = 0x000;
    

    pad is defined as u16 and can be in the range from 0 to 3, i.e. from 0 to 6 bytes. As we know, memset () is a terribly optimized function, especially on small sizes. Why not apply?

    Or another example from the same driver, namely drivers / staging / fbtft / fbtft-core.c, lines 1091-1096:
      /* make debug message */
      msg[0] = '\0';
      for (j = 0; j < i; j++) {
          snprintf(str, 128, " %02X", buf[j]);
          strcat(msg, str);
      }
    

    People didn’t know that there is bin2hex () in the kernel, not to mention that strcat () is completely superfluous - snprintf () adds the terminating '\ 0'.

    Try to modify it yourself.

    Has anyone already seen how you can simplify this?
    Actually, the buffer is needed for dump output in hexadecimal form, so we delete the loop, the msg variable and replace it all with either the % * ph specifier with the passed length i, or by calling print_hex_bytes ().

    Pay attention further on the code is similar, it is possible for the company and to optimize it: lines 1192-1202.



    Defining Existing Constants and Data Types



    Returning to the unisys driver, run the following command:
    $ git grep -n MAX_MACADDR_LEN drivers/staging/unisys/
    drivers/staging/unisys/common-spar/include/channels/iochannel.h:190:#ifndef MAX_MACADDR_LEN
    drivers/staging/unisys/common-spar/include/channels/iochannel.h:191:#define MAX_MACADDR_LEN 6   /* number of bytes…
    drivers/staging/unisys/common-spar/include/channels/iochannel.h:192:#endif
    …и так далее…
    


    However, it is worth noting that the MAC address length constant has long been defined in the kernel as ETH_ALEN. I am sure that the maintainers will gladly accept a patch from you that replaces their definition with a standard nuclear one.


    Correction Example



    We proceed smoothly to practice. Above, we found a place where the output of several bytes uses the transfer of each of them through the stack.

    If we look into the code, we will see the following:
    str_pos += scnprintf(vbuf + str_pos, len - str_pos,
          "[%d:%d] VNic:%02x:%02x:%02x:%02x:%02x:%02x num_rcv_bufs:%d mtu:%d",
          tmpvpcidev->bus_no,
          tmpvpcidev->device_no,
          tmpvpcidev->net.mac_addr[0],
          tmpvpcidev->net.mac_addr[1],
          tmpvpcidev->net.mac_addr[2],
          tmpvpcidev->net.mac_addr[3],
          tmpvpcidev->net.mac_addr[4],
          tmpvpcidev->net.mac_addr[5],
          tmpvpcidev->net.num_rcv_bufs,
          tmpvpcidev->net.mtu);
    

    And it turns out that someone displays the MAC address! Well, we can easily use the special qualifier extension - % pM .

    Let's replace and look at the result:
    --- a/drivers/staging/unisys/virtpci/virtpci.c
    +++ b/drivers/staging/unisys/virtpci/virtpci.c
    @@ -1310,15 +1310,10 @@ static ssize_t info_debugfs_read(struct file *file, char __user *buf,
                                            tmpvpcidev->scsi.max.cmd_per_lun);
                    } else {
                            str_pos += scnprintf(vbuf + str_pos, len - str_pos,
    -                                       "[%d:%d] VNic:%02x:%02x:%02x:%02x:%02x:%02x num_rcv_bufs:%d mtu:%d",
    +                                       "[%d:%d] VNic:%pM num_rcv_bufs:%d mtu:%d",
                                            tmpvpcidev->bus_no,
                                            tmpvpcidev->device_no,
    -                                       tmpvpcidev->net.mac_addr[0],
    -                                       tmpvpcidev->net.mac_addr[1],
    -                                       tmpvpcidev->net.mac_addr[2],
    -                                       tmpvpcidev->net.mac_addr[3],
    -                                       tmpvpcidev->net.mac_addr[4],
    -                                       tmpvpcidev->net.mac_addr[5],
    +                                       tmpvpcidev->net.mac_addr,
                                            tmpvpcidev->net.num_rcv_bufs,
                                            tmpvpcidev->net.mtu);
                    }
    

    It seems to be nice - 5 lines and fewer variables on the stack. It is still worth compiling the result. I will not describe in detail how this is done, I will only indicate that it will be necessary to enable the driver in the configuration using the options:
    CONFIG_STAGING = y
    CONFIG_UNISYSPAR = y
    CONFIG_UNISYS_VIRTPCI = m

    We save our change in the tree with git commit -a -sand format it as a patch.
    $ git format-patch HEAD~1
    0001-staging-unisys-print-MAC-address-via-pM.patch
    

    Next, we will use a wonderful script get_maintainter.plto find out who needs to be informed personally.
    $ scripts/get_maintainer.pl --git-min-percent=67 --nor --norolestats 00*
    Benjamin Romer 
    David Kershner 
    Greg Kroah-Hartman 
    sparmaintainer@unisys.com
    devel@driverdev.osuosl.org
    linux-kernel@vger.kernel.org
    

    We send our patch to the addresses:
    $ git send-email --cc-cmd 'scripts/get_maintainer.pl --git-min-percent=67 --nor --nol --norolestats' 00*
    0001-staging-unisys-print-MAC-address-via-pM.patch
    Who should the emails be sent to (if any)? devel@driverdev.osuosl.org, sparmaintainer@unisys.com
    Message-ID to be used as In-Reply-To for the first email (if any)? 
    …
    Send this email? ([y]es|[n]o|[q]uit|[a]ll): y
    

    Here is the little letter .

    UPDATE: Already in the kernel: 9a836c0a6310e6e9 .

    Also popular now: