Making the code cleaner: What can be fixed in the Linux kernel
Surely many would like to try to change something in the Linux kernel for the better, but do not know where to start. I want to describe several problems that everyone can fix, and use an example to show the way from finding a problem to publishing its fix on the mailing list. In the course of the story, the reader will get acquainted with some auxiliary utilities.
From year to year, the same trivial problems wander from driver to driver, often associated with ignorance of some standard practices, utilities or extensions that exist in the Linux kernel.
Here is a short list of these kinds of problems:
Typos and typos in the documentation and comments are not uncommon. One person, namely Lucas De Marchi, developed a special codespell utility to catch such typos.
The following example will tell everything about himself.
Clone the kernel:
Please note that we will work with the freshest, that is, with the linux-next tree.
We make a test run
Now fix:
We look at what happened:
Not so much a problem as an improvement in code readability and microoptimization: often a significant reduction in the amount of memory used on the stack when a function is called, a decrease in the number of passed specifiers in vsnprintf ().
Earlier, I described the special extensions of the% p specifier in the kernel; now it is the turn to apply the acquired knowledge.
For simplicity, take the
% 02x [-:]% 02x [-:]% 02x pattern . It allows you to find the transmission of several bytes through the stack, which can be replaced by the extension % * ph [CDN] .
Let's look in the code:
What we will do next, I will describe in the example below. In the meantime, we move on to the following problem areas.
Here, take, for example, drivers / staging / fbtft / fbtft-bus.c, lines 99-100:
pad is defined as u16 and can be in the range from 0 to 3, i.e. from 0 to 6 bytes. As we know, memset () is a terribly optimized function, especially on small sizes. Why not apply?
Or another example from the same driver, namely drivers / staging / fbtft / fbtft-core.c, lines 1091-1096:
People didn’t know that there is bin2hex () in the kernel, not to mention that strcat () is completely superfluous - snprintf () adds the terminating '\ 0'.
Try to modify it yourself.
Returning to the unisys driver, run the following command:
However, it is worth noting that the MAC address length constant has long been defined in the kernel as ETH_ALEN. I am sure that the maintainers will gladly accept a patch from you that replaces their definition with a standard nuclear one.
We proceed smoothly to practice. Above, we found a place where the output of several bytes uses the transfer of each of them through the stack.
If we look into the code, we will see the following:
And it turns out that someone displays the MAC address! Well, we can easily use the special qualifier extension - % pM .
Let's replace and look at the result:
It seems to be nice - 5 lines and fewer variables on the stack. It is still worth compiling the result. I will not describe in detail how this is done, I will only indicate that it will be necessary to enable the driver in the configuration using the options:
We save our change in the tree with
Next, we will use a wonderful script
We send our patch to the addresses:
Here is the little letter .
UPDATE: Already in the kernel: 9a836c0a6310e6e9 .
From year to year, the same trivial problems wander from driver to driver, often associated with ignorance of some standard practices, utilities or extensions that exist in the Linux kernel.
Here is a short list of these kinds of problems:
- typos and typos in the documentation and comments
- own implementation of displaying the contents of structures or their fields on the screen
- native implementation of algorithms that are already provided in the kernel library
- definition of existing constants and data types
Typos and clerical errors
Typos and typos in the documentation and comments are not uncommon. One person, namely Lucas De Marchi, developed a special codespell utility to catch such typos.
The following example will tell everything about himself.
Clone the kernel:
mkdir ~/devel
cd ~/devel
git clone git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
cd ~/devel/linux-next
Please note that we will work with the freshest, that is, with the linux-next tree.
We make a test run
codespell
:$ codespell.py drivers/staging/unisys
drivers/staging/unisys/include/guestlinuxdebug.h:138: doesnt ==> doesn't
Now fix:
$ codespell.py -w -i 3 drivers/staging/unisys
* doesnt show, so we
doesnt ==> doesn't (Y/n) y
FIXED: drivers/staging/unisys/include/guestlinuxdebug.h
We look at what happened:
--- a/drivers/staging/unisys/include/guestlinuxdebug.h
+++ b/drivers/staging/unisys/include/guestlinuxdebug.h
@@ -135,7 +135,7 @@ enum event_pc { /* POSTCODE event identifier tuples */
#define POSTCODE_SEVERITY_ERR DIAG_SEVERITY_ERR
#define POSTCODE_SEVERITY_WARNING DIAG_SEVERITY_WARNING
#define POSTCODE_SEVERITY_INFO DIAG_SEVERITY_PRINT /* TODO-> Info currently
- * doesnt show, so we
+ * doesn't show, so we
* set info=warning */
/* example call of POSTCODE_LINUX_2(VISOR_CHIPSET_PC, POSTCODE_SEVERITY_ERR);
* Please also note that the resulting postcode is in hex, so if you are
Custom implementation of output
Not so much a problem as an improvement in code readability and microoptimization: often a significant reduction in the amount of memory used on the stack when a function is called, a decrease in the number of passed specifiers in vsnprintf ().
Earlier, I described the special extensions of the% p specifier in the kernel; now it is the turn to apply the acquired knowledge.
For simplicity, take the
% 02x [-:]% 02x [-:]% 02x pattern . It allows you to find the transmission of several bytes through the stack, which can be replaced by the extension % * ph [CDN] .
Let's look in the code:
$ git grep -n -i -e '%02x[-: ]%02x[-: ]%02x' drivers/staging/unisys
drivers/staging/unisys/virtpci/virtpci.c:1313: "[%d:%d] VNic:%02x:%02x:%02x:%02x:%02x:%02x num_rcv_bufs:%d mtu:%d",
What we will do next, I will describe in the example below. In the meantime, we move on to the following problem areas.
Custom implementation of algorithms
Here, take, for example, drivers / staging / fbtft / fbtft-bus.c, lines 99-100:
for (i = 0; i < pad; i++)
*buf++ = 0x000;
pad is defined as u16 and can be in the range from 0 to 3, i.e. from 0 to 6 bytes. As we know, memset () is a terribly optimized function, especially on small sizes. Why not apply?
Or another example from the same driver, namely drivers / staging / fbtft / fbtft-core.c, lines 1091-1096:
/* make debug message */
msg[0] = '\0';
for (j = 0; j < i; j++) {
snprintf(str, 128, " %02X", buf[j]);
strcat(msg, str);
}
People didn’t know that there is bin2hex () in the kernel, not to mention that strcat () is completely superfluous - snprintf () adds the terminating '\ 0'.
Try to modify it yourself.
Has anyone already seen how you can simplify this?
Actually, the buffer is needed for dump output in hexadecimal form, so we delete the loop, the msg variable and replace it all with either the % * ph specifier with the passed length i, or by calling print_hex_bytes ().
Pay attention further on the code is similar, it is possible for the company and to optimize it: lines 1192-1202.
Pay attention further on the code is similar, it is possible for the company and to optimize it: lines 1192-1202.
Defining Existing Constants and Data Types
Returning to the unisys driver, run the following command:
$ git grep -n MAX_MACADDR_LEN drivers/staging/unisys/
drivers/staging/unisys/common-spar/include/channels/iochannel.h:190:#ifndef MAX_MACADDR_LEN
drivers/staging/unisys/common-spar/include/channels/iochannel.h:191:#define MAX_MACADDR_LEN 6 /* number of bytes…
drivers/staging/unisys/common-spar/include/channels/iochannel.h:192:#endif
…и так далее…
However, it is worth noting that the MAC address length constant has long been defined in the kernel as ETH_ALEN. I am sure that the maintainers will gladly accept a patch from you that replaces their definition with a standard nuclear one.
Correction Example
We proceed smoothly to practice. Above, we found a place where the output of several bytes uses the transfer of each of them through the stack.
If we look into the code, we will see the following:
str_pos += scnprintf(vbuf + str_pos, len - str_pos,
"[%d:%d] VNic:%02x:%02x:%02x:%02x:%02x:%02x num_rcv_bufs:%d mtu:%d",
tmpvpcidev->bus_no,
tmpvpcidev->device_no,
tmpvpcidev->net.mac_addr[0],
tmpvpcidev->net.mac_addr[1],
tmpvpcidev->net.mac_addr[2],
tmpvpcidev->net.mac_addr[3],
tmpvpcidev->net.mac_addr[4],
tmpvpcidev->net.mac_addr[5],
tmpvpcidev->net.num_rcv_bufs,
tmpvpcidev->net.mtu);
And it turns out that someone displays the MAC address! Well, we can easily use the special qualifier extension - % pM .
Let's replace and look at the result:
--- a/drivers/staging/unisys/virtpci/virtpci.c
+++ b/drivers/staging/unisys/virtpci/virtpci.c
@@ -1310,15 +1310,10 @@ static ssize_t info_debugfs_read(struct file *file, char __user *buf,
tmpvpcidev->scsi.max.cmd_per_lun);
} else {
str_pos += scnprintf(vbuf + str_pos, len - str_pos,
- "[%d:%d] VNic:%02x:%02x:%02x:%02x:%02x:%02x num_rcv_bufs:%d mtu:%d",
+ "[%d:%d] VNic:%pM num_rcv_bufs:%d mtu:%d",
tmpvpcidev->bus_no,
tmpvpcidev->device_no,
- tmpvpcidev->net.mac_addr[0],
- tmpvpcidev->net.mac_addr[1],
- tmpvpcidev->net.mac_addr[2],
- tmpvpcidev->net.mac_addr[3],
- tmpvpcidev->net.mac_addr[4],
- tmpvpcidev->net.mac_addr[5],
+ tmpvpcidev->net.mac_addr,
tmpvpcidev->net.num_rcv_bufs,
tmpvpcidev->net.mtu);
}
It seems to be nice - 5 lines and fewer variables on the stack. It is still worth compiling the result. I will not describe in detail how this is done, I will only indicate that it will be necessary to enable the driver in the configuration using the options:
CONFIG_STAGING = y
CONFIG_UNISYSPAR = y
CONFIG_UNISYS_VIRTPCI = m
We save our change in the tree with
git commit -a -s
and format it as a patch.$ git format-patch HEAD~1
0001-staging-unisys-print-MAC-address-via-pM.patch
Next, we will use a wonderful script
get_maintainter.pl
to find out who needs to be informed personally.$ scripts/get_maintainer.pl --git-min-percent=67 --nor --norolestats 00*
Benjamin Romer
David Kershner
Greg Kroah-Hartman
sparmaintainer@unisys.com
devel@driverdev.osuosl.org
linux-kernel@vger.kernel.org
We send our patch to the addresses:
$ git send-email --cc-cmd 'scripts/get_maintainer.pl --git-min-percent=67 --nor --nol --norolestats' 00*
0001-staging-unisys-print-MAC-address-via-pM.patch
Who should the emails be sent to (if any)? devel@driverdev.osuosl.org, sparmaintainer@unisys.com
Message-ID to be used as In-Reply-To for the first email (if any)?
…
Send this email? ([y]es|[n]o|[q]uit|[a]ll): y
Here is the little letter .
UPDATE: Already in the kernel: 9a836c0a6310e6e9 .