Since the release of the publicly available Linux-version of PVS-Studio, it was just a matter of time until we would recheck the Linux kernel. It is quite a challenge for any static code analyzer to check a project written by professionals from all around the world, used by people in various fields, which is regularly checked and tested by different tools. So, what errors did we manage to find in such conditions?
How we did the check
We have already checked the Linux kernel. A lot of things have changed since then - now it's equally easy to check an OS, or any other project.
pvs-studio-analyzer trace - make
pvs-studio-analyzer analyze -o /path/to/report.log -j8
It took us just a few months to adapt and test the analyzer in Linux; which was previously available only for Windows. This time it was much easier to check the project.
We used PVS-Studio version 4.9-rc4 (commit bc33b0ca11e3df467777a4fa7639ba488c9d4911).
In this article we will cover only the general analysis diagnostics of levels one and two. We should note that the code really is of very high quality - the density of warnings pointing to real flaws is extremely low.
I have chosen warnings that most likely pointed to real bugs/errors. We should understand that besides useful warnings, the analyzer may issue false positives, or badly formatted code or 'code smell'.
Unfortunately, the number of false positives in the Linux version of PVS-Studio is higher than we would like it to be. I think this is due to the fact that this version is still quite young. We have done a lot, and still continue working to minimize the number of false positives. The Linux code helped us become better, and add a good number of useful modifications to the analyzer - and now we would like to answer back.
Typos
The most common errors are caused by the usual typos and Copy-Paste errors. If you have read our articles before, you've probably noticed that. They are everywhere: in all operating systems, and in any language. Still, they are a great example for showing the potential of a static code analyzer: it's much harder to find them with other tools. Let's see what we have in the Linux kernel:
PVS-Studio warning: V581 The conditional expressions of the 'if' operators situated alongside each other are identical. Check lines: 2384, 2390. debug.c 2390
int dbg_check_nondata_nodes_order(....)
{
....
sa = container_of(cur, struct ubifs_scan_node, list);
sb = container_of(cur->next, struct ubifs_scan_node, list);
if (sa->type != UBIFS_INO_NODE && sa->type != UBIFS_DENT_NODE &&
sa->type != UBIFS_XENT_NODE) {
ubifs_err(c, "bad node type %d", sa->type);
ubifs_dump_node(c, sa->node);
return -EINVAL;
}
if (sa->type != UBIFS_INO_NODE && sa->type != UBIFS_DENT_NODE &&
sa->type != UBIFS_XENT_NODE) {
ubifs_err(c, "bad node type %d", sb->type);
ubifs_dump_node(c, sb->node);
return -EINVAL;
}
....
}
The analyzer complains about two similar conditions in a row: perhaps in the second condition the programmer forgot to change sa to sb. Who says that people in cool projects don't copy-paste?
PVS-Studio warning: V666 Consider inspecting third argument of the function 'strncmp'. It is possible that the value does not correspond with the length of a string which was passed with the first argument. spectral.c 341
static ssize_t write_file_spec_scan_ctl(struct file *file,
const char __user *user_buf,
size_t count, loff_t *ppos)
{
struct ath10k *ar = file->private_data;
char buf[32];
ssize_t len;
int res;
len = min(count, sizeof(buf) - 1);
if (copy_from_user(buf, user_buf, len))
return -EFAULT;
buf[len] = '\0';
mutex_lock(&ar->conf_mutex);
if (strncmp("trigger", buf, 7) == 0) {
....
} else if (strncmp("background", buf, 9) == 0) {
res = ath10k_spectral_scan_config(ar, SPECTRAL_BACKGROUND);
} else if (strncmp("manual", buf, 6) == 0) {
res = ath10k_spectral_scan_config(ar, SPECTRAL_MANUAL);
} else if (strncmp("disable", buf, 7) == 0) {
res = ath10k_spectral_scan_config(ar, SPECTRAL_DISABLED);
} else {
res = -EINVAL;
}
mutex_unlock(&ar->conf_mutex);
if (res < 0)
return res;
return count;
}
A classic error: two arguments should be passed to a function: a pointer to a string and its length. Quite often, when a literal serves as an argument, programmers are too lazy to evaluate the length, and just write a number. The human factor in action: people get things wrong very often.
See, there are several strncmp in a row. The literal is passed to each of them. Also, in the strncmp("background", buf, 9) the length was evaluated incorrectly: the word "background" has 10, not 9, characters.
PVS-Studio warning: V666 Consider inspecting third argument of the function 'memcpy'. It is possible that the value does not correspond with the length of a string which was passed with the second argument. dpt_i2o.c 403
static void adpt_inquiry(adpt_hba* pHba)
{
....
memset(pHba->detail, 0, sizeof(pHba->detail));
memcpy(&(pHba->detail), "Vendor: Adaptec ", 16);
memcpy(&(pHba->detail[16]), " Model: ", 8);
memcpy(&(pHba->detail[24]), (u8*) &buf[16], 16);
memcpy(&(pHba->detail[40]), " FW: ", 4); // <=
memcpy(&(pHba->detail[44]), (u8*) &buf[32], 4);
pHba->detail[48] = '\0'; /* precautionary */
....
}
One more example. The length of the string "FW" is 5, not 4, characters.
How can we get rid of such an error? In C, you can use a macro like this:
#define str_len(S) (sizeof(S) / sizeof((S)[0]))
Using such macros is dangerous: it's better to add compiler-specific checks, to see that the passed argument is an array.
For our readers coding in C++, I can recommend the std::string_view that finally appeared in C++17. It is better not to pass a pointer-length to the function as a pair. But if it's necessary to evaluate the array size manually (if we need to pass it to the memcpy function), we can use std::size(array) or its equivalent: the size for literals will be evaluated in compile-time.
Avoid repeating the code, and do not be too lazy to use language tools (macros or templates) for the compile time evaluations!
PVS-Studio warning: V653 A suspicious string consisting of two parts is used for array initialization. It is possible that a comma is missing. Consider inspecting this literal: "30min" "No timeout". lp8788-charger.c 657
static ssize_t lp8788_show_eoc_time(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct lp8788_charger *pchg = dev_get_drvdata(dev);
char *stime[] = { "400ms", "5min", "10min", "15min",
"20min", "25min", "30min" "No timeout" };
....
}
As is known, the two consecutive literals get linked. This allows them to be used easily in macros for instance. The danger appears when we write an array of such literals: you can miss a comma, and get an unexpected result.
In this case two last literals will "stick" to each other, and we will get "30minNo timeout". It's a double error. Firstly, the text is incorrect; secondly, the array will be missing one element, which can lead to access out of bounds.
I recommend using another form of formatting, so that this error will become more evident:
char *stime[] = {
"400ms"
, "5min"
, "10min"
, "15min"
, "20min"
, "25min"
, "30min"
"No timeout"
};
Learn more about this method of table-style formatting in a book written by my colleague, Andrey Karpov. I suggest reading Chapter N13.
PVS-Studio warning: V764 Possible incorrect order of arguments passed to 'ahc_9005_subdevinfo_valid' function: 'device' and 'vendor'. aic7xxx_pci.c 695
const struct ahc_pci_identity *
ahc_find_pci_device(ahc_dev_softc_t pci)
{
....
if (ahc_get_pci_function(pci) > 0
&& ahc_9005_subdevinfo_valid(device, vendor, // <=
subdevice, subvendor)
&& SUBID_9005_MFUNCENB(subdevice) == 0)
return (NULL);
....
}
Sometimes it's hard to understand what the analyzer is complaining about. By the way, it often happens that a person does not understand what the analyzer warns about, sends a report with a 'false positive', and it turns out that there's actually a bug. It also seemed to me that this was a false positive: the function is defined a little earlier in the code, and all the parameters are set correctly. Here's how it looks:
static int
ahc_9005_subdevinfo_valid(uint16_t device, uint16_t vendor,
uint16_t subdevice, uint16_t subvendor)
{
....
}
What's the matter here? It turns out that a bit earlier this function was declared, and it is there that the arguments are confused. In fact there is nothing dangerous in the program logic, but it's better to fix this, so as not to confuse other developers.
static int ahc_9005_subdevinfo_valid(uint16_t vendor, uint16_t device,
uint16_t subvendor, uint16_t subdevice);
The funny thing is that this error has already been in the code: the parameters were confused, the programmer just forgot to modify the declaration. It's good that the analyzer detected this fragment.
PVS-Studio warning: V549 The first argument of 'memcpy' function is equal to the second argument. wilc_wfi_cfgoperations.c 1345
static int del_pmksa(struct wiphy *wiphy,
struct net_device *netdev,
struct cfg80211_pmksa *pmksa)
{
....
for (; i < (priv->pmkid_list.numpmkid - 1); i++) {
memcpy(priv->pmkid_list.pmkidlist[i].bssid,
priv->pmkid_list.pmkidlist[i + 1].bssid,
ETH_ALEN);
memcpy(priv->pmkid_list.pmkidlist[i].pmkid,
priv->pmkid_list.pmkidlist[i].pmkid,
PMKID_LEN);
}
....
}
In the last memcpy the pointers are the same. Perhaps the programmer wanted to copy the previous expression:
memcpy(priv->pmkid_list.pmkidlist[i].pmkid,
priv->pmkid_list.pmkidlist[i + 1].pmkid,
PMKID_LEN);
Unused variables
PVS-Studio warning: V575 The 'strncasecmp' function processes '0' elements. Inspect the third argument. linux_wlan.c 1121
static int mac_ioctl(struct net_device *ndev,
struct ifreq *req,
int cmd)
{
u8 *buff = NULL;
s8 rssi;
u32 size = 0, length = 0;
struct wilc_vif *vif;
s32 ret = 0;
struct wilc *wilc;
vif = netdev_priv(ndev);
wilc = vif->wilc;
if (!wilc->initialized)
return 0;
switch (cmd) {
case SIOCSIWPRIV:
{
struct iwreq *wrq = (struct iwreq *)req;
size = wrq->u.data.length;
if (size && wrq->u.data.pointer) {
buff = memdup_user(wrq->u.data.pointer,
wrq->u.data.length);
if (IS_ERR(buff))
return PTR_ERR(buff);
if (strncasecmp(buff, "RSSI", length) == 0) { // <=
....
}
}
}
....
}
done:
kfree(buff);
return ret;
}
0 was passed as an argument to the strncasecmp function. There is no fragment where the length variable gets changed, so, it's value remains zero. size should probably be used instead.
PVS-Studio warning: V751 Parameter 'LCDheight' is not used inside function body. init.c 339
static
unsigned short
SiS_GetModeID(int VGAEngine, unsigned int VBFlags,
int HDisplay, int VDisplay,
int Depth, bool FSTN,
int LCDwidth, int LCDheight)
{
unsigned short ModeIndex = 0;
switch(HDisplay)
{
case 320:
if(VDisplay == 200) ModeIndex = ModeIndex_320x200[Depth];
else if(VDisplay == 240) {
if((VBFlags & CRT2_LCD) && (FSTN))
ModeIndex = ModeIndex_320x240_FSTN[Depth];
else
ModeIndex = ModeIndex_320x240[Depth];
}
break;
case 400:
if((!(VBFlags & CRT1_LCDA)) ||
((LCDwidth >= 800) && (LCDwidth >= 600))) { // <=
if(VDisplay == 300) ModeIndex = ModeIndex_400x300[Depth];
}
break;
case 512:
if((!(VBFlags & CRT1_LCDA)) ||
((LCDwidth >= 1024) && (LCDwidth >= 768))) { // <=
if(VDisplay == 384) ModeIndex = ModeIndex_512x384[Depth];
}
break;
....
}
return ModeIndex;
}
An unused parameter in the function isn't always an error. In old APIs there are situations where a parameter is not needed, and it is either rewritten, or simply not used. But if you take a closer look at this fragment, you will see that the programmer forgot to compare the height. Instead, we see comparisons '(A > 5) && (A > 3)' that are redundant by themselves.
Confusion in the operation precedence
PVS-Studio warning: V502 Perhaps the '?:' operator works in a different way than it was expected. The '?:' operator has a lower priority than the '|' operator. core.c 1046
static int nvme_pr_preempt(struct block_device *bdev,
u64 old, u64 new,
enum pr_type type, bool abort)
{
u32 cdw10 = nvme_pr_type(type) << 8 | abort ? 2 : 1;
return nvme_pr_command(bdev, cdw10, old, new,
nvme_cmd_resv_acquire);
}
A ternary operator in C is a very dangerous operator. One of the first general analysis diagnostics in PVS-Studio is about this for a reason. The thing is that it has a very low priority and it's very easy to get confused and get a totally different evaluation order. So, when in doubt, it's better to use parentheses.
Suspicious checks
PVS-Studio warning: V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 375, 377. trx.c 375
bool rtl92ee_rx_query_desc(struct ieee80211_hw *hw,
struct rtl_stats *status,
struct ieee80211_rx_status *rx_status,
u8 *pdesc, struct sk_buff *skb)
{
struct rtl_priv *rtlpriv = rtl_priv(hw);
struct rx_fwinfo *p_drvinfo;
struct ieee80211_hdr *hdr;
u32 phystatus = GET_RX_DESC_PHYST(pdesc);
....
status->macid = GET_RX_DESC_MACID(pdesc);
if (GET_RX_STATUS_DESC_MAGIC_MATCH(pdesc))
status->wake_match = BIT(2);
else if (GET_RX_STATUS_DESC_MAGIC_MATCH(pdesc))
status->wake_match = BIT(1);
else if (GET_RX_STATUS_DESC_UNICAST_MATCH(pdesc))
status->wake_match = BIT(0);
else
status->wake_match = 0;
....
}
At first glance it's not easy to see what's wrong. The same check by the macro GET_RX_STATUS_DESC_MAGIC_MATCH is done twice. If we see its declaration, we'll see another two macros:
#define GET_RX_STATUS_DESC_PATTERN_MATCH(__pdesc)
LE_BITS_TO_4BYTE(__pdesc+12, 29, 1)
#define GET_RX_STATUS_DESC_UNICAST_MATCH(__pdesc)
LE_BITS_TO_4BYTE(__pdesc+12, 30, 1)
#define GET_RX_STATUS_DESC_MAGIC_MATCH(__pdesc)
LE_BITS_TO_4BYTE(__pdesc+12, 31, 1)
Perhaps the programmer meant to use GET_RX_STATUS_DESC_PATTERN_MATCH, which is missing in the original fragment. Otherwise this check simply makes no sense.
PVS-Studio warning: V547 Expression '(ptr[3] & 0x1E) != 0x03' is always true. sd.c 4115
int ext_sd_send_cmd_get_rsp(struct rtsx_chip *chip,
u8 cmd_idx, u32 arg, u8 rsp_type,
u8 *rsp, int rsp_len, bool special_check)
{
int retval;
int timeout = 100;
u16 reg_addr;
u8 *ptr;
....
if (cmd_idx == SELECT_CARD) {
if (rsp_type == SD_RSP_TYPE_R2) {
if ((ptr[3] & 0x1E) != 0x04) {
rtsx_trace(chip);
return STATUS_FAIL;
}
} else if (rsp_type == SD_RSP_TYPE_R0) {
if ((ptr[3] & 0x1E) != 0x03) { // <=
rtsx_trace(chip);
return STATUS_FAIL;
}
}
}
....
}
The error is related to the usage of bitwise operations. The result of a bitwise conjunction with 0x1E will never be 0x03 because of one bit:
PVS-Studio warning: V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 1277, 1282. ks_wlan_net.c 1277
static int ks_wlan_set_power(struct net_device *dev,
struct iw_request_info *info,
struct iw_param *vwrq, char *extra)
{
struct ks_wlan_private *priv =
(struct ks_wlan_private *)netdev_priv(dev);
short enabled;
if (priv->sleep_mode == SLP_SLEEP) {
return -EPERM;
}
/* for SLEEP MODE */
enabled = vwrq->disabled ? 0 : 1;
if (enabled == 0) { /* 0 */
priv->reg.powermgt = POWMGT_ACTIVE_MODE;
} else if (enabled) { /* 1 */
if (priv->reg.operation_mode == MODE_INFRASTRUCTURE)
priv->reg.powermgt = POWMGT_SAVE1_MODE;
else
return -EINVAL;
} else if (enabled) { /* 2 */
if (priv->reg.operation_mode == MODE_INFRASTRUCTURE)
priv->reg.powermgt = POWMGT_SAVE2_MODE;
else
return -EINVAL;
} else
return -EINVAL;
hostif_sme_enqueue(priv, SME_POW_MNGMT_REQUEST);
return 0;
}
Let's shorten the example to:
enabled = vwrq->disabled ? 0 : 1;
if (enabled == 0) { /* 0 */
....
} else if (enabled) { /* 1 */
....
} else if (enabled) { /* 2 */
....
} else
....
This code looks very strange. We see that the value range is clearly defined in the expression above: enabled is either 0 or 1. However, 4 values get checked. At the same time, the comments only bring confusion: if the numbers meant to define a possible value of a variable, then it's not what we have now: the checks for 1 and 2are written in the same way.
PVS-Studio warning: V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 422, 424. Hal8188ERateAdaptive.c 422
static int odm_ARFBRefresh_8188E(
struct odm_dm_struct *dm_odm,
struct odm_ra_info *pRaInfo)
{ /* Wilson 2011/10/26 */
....
if (pRaInfo->HighestRate > 0x13)
pRaInfo->PTModeSS = 3;
else if (pRaInfo->HighestRate > 0x0b)
pRaInfo->PTModeSS = 2;
else if (pRaInfo->HighestRate > 0x0b)
pRaInfo->PTModeSS = 1;
else
pRaInfo->PTModeSS = 0;
....
return 0;
}
Another fragment where we see two consecutive conditions. Note that at the same time, they have different bodies. It's hard to say whether this is a real bug, or this code is just not used; this is a task for developers. The task of the analyzer is to point to a suspicious place.
PVS-Studio warning: V734 An excessive check. Examine the conditions containing search for the substrings "interleaver" and "deinterleaver". sst-atom-controls.c 1449
static int sst_fill_widget_module_info(
struct snd_soc_dapm_widget *w,
struct snd_soc_platform *platform)
{
struct snd_kcontrol *kctl;
int index, ret = 0;
struct snd_card *card = platform->component.card->snd_card;
char *idx;
down_read(&card->controls_rwsem);
list_for_each_entry(kctl, &card->controls, list) {
....
} else if (strstr(kctl->id.name, "interleaver")) {
struct sst_enum *e = (void *)kctl->private_value;
e->w = w;
} else if (strstr(kctl->id.name, "deinterleaver")) {
struct sst_enum *e = (void *)kctl->private_value;
e->w = w;
}
....
}
up_read(&card->controls_rwsem);
return 0;
}
In this fragment, the programmer checks the presence of several substrings in one string. To make it more evident, I left only those substrings which are interesting to us. Let's suppose we haven't found interleaver - then there is no point in looking for deinterleaver, because there is definitely no interleaver substring. Therefore, this code fragment will never work; but since the if and else bodies are the same, it's not dangerous. This code is simply redundant.
PVS-Studio warning: V547 Expression 'block' is always true. svclock.c 873
void
nlmsvc_grant_reply(struct nlm_cookie *cookie, __be32 status)
{
struct nlm_block *block;
dprintk("grant_reply: looking for cookie %x, s=%d \n",
*(unsigned int *)(cookie->data), status);
if (!(block = nlmsvc_find_block(cookie)))
return;
if (block) {
if (status == nlm_lck_denied_grace_period) {
/* Try again in a couple of seconds */
nlmsvc_insert_block(block, 10 * HZ);
} else {
/* Lock is now held by client, or has been rejected.
* In both cases, the block should be removed. */
nlmsvc_unlink_block(block);
}
}
nlmsvc_release_block(block);
}
This example demonstrates why it's not enough for a static code analyzer to perform pattern-based, traversing AST. It's important to be able to perform control flow analysis and data flow analysis as well. At the moment when block == NULL, we have return, so we can say that the pointer is not null for sure. So, when we see a NULL check, we understand that there is something wrong here.
Apparently, the second check is just not necessary here. But what if the programmer wanted to check a different variable? Who knows ...? This code should be reviewed by the developer.
A similar situation:
PVS-Studio warning: V547 Expression 'sym' is always true. menu.c 498
bool menu_is_visible(struct menu *menu)
{
struct menu *child;
struct symbol *sym;
....
if (!sym || sym_get_tristate_value(menu->sym) == no) // <=
return false;
for (child = menu->list; child; child = child->next) {
if (menu_is_visible(child)) {
if (sym) // <=
sym->flags |= SYMBOL_DEF_USER;
return true;
}
}
return false;
}
An erron in a macro
PVS-Studio warning: V733 It is possible that macro expansion resulted in incorrect evaluation order. Check expression: request->rq_timeout + 5 * 1000. niobuf.c 637
#define CFS_FAIL_TIMEOUT(id, secs) \
cfs_fail_timeout_set(id, 0, secs * 1000, CFS_FAIL_LOC_NOSET)
#define OBD_FAIL_TIMEOUT(id, secs) \
CFS_FAIL_TIMEOUT(id, secs)
int ptl_send_rpc(struct ptlrpc_request *request, int noreply)
{
....
OBD_FAIL_TIMEOUT(OBD_FAIL_PTLRPC_DELAY_SEND,
request->rq_timeout + 5);
....
}
These errors are very rare. I have previously seen this warning only once in a real project: interestingly enough, it was FreeBSD. There, the error was in the macro: it's better to enclose the parameters in brackets. If this is not done, then such a situation is possible: when using 'x + 5' in 'secs * 1000', we get 'x + 5 * 1000'; obviously, this is not what the author expected.
Meaningless memset
PVS-Studio warning: V597 The compiler could delete the 'memset' function call, which is used to flush 'ps' buffer. The memset_s() function should be used to erase the private data. atom.c 1383
int amdgpu_atom_asic_init(struct atom_context *ctx)
{
int hwi = CU16(ctx->data_table + ATOM_DATA_FWI_PTR);
uint32_t ps[16];
int ret;
memset(ps, 0, 64);
ps[0] = cpu_to_le32(CU32(hwi + ATOM_FWI_DEFSCLK_PTR));
ps[1] = cpu_to_le32(CU32(hwi + ATOM_FWI_DEFMCLK_PTR));
if (!ps[0] || !ps[1])
return 1;
if (!CU16(ctx->cmd_table + 4 + 2 * ATOM_CMD_INIT))
return 1;
ret = amdgpu_atom_execute_table(ctx, ATOM_CMD_INIT, ps);
if (ret)
return ret;
memset(ps, 0, 64); // <=
return ret;
}
It makes no sense to add memset before return: the compiler, seeing that this operation does not change the visible state of a program (an array still beyond the scope), will remove it. If it is necessary to erase some important data, then use memset_s, or write your own equivalent.
This error by the way, is actually a vulnerability. Some data that should be erased, doesn't get removed. More details can be found in the description of the V597diagnostic. Actually, this is a very common vulnerability: proof
Dangerous use of memcmp
PVS-Studio warning: V642 Saving the 'memcmp' function result inside the 'unsigned char' type variable is inappropriate. The significant bits could be lost breaking the program's logic. host.c 1789
static void power_control_timeout(unsigned long data)
{
....
u8 other = memcmp(requester->frame_rcvd.iaf.sas_addr,
iphy->frame_rcvd.iaf.sas_addr,
sizeof(requester->frame_rcvd.iaf.sas_addr));
if (other == 0) {
....
}
....
}
If we read carefully what the documentation says about the return value of memcmp, we'll see that there is no guarantee about the value range: the function can return any number within the scope of its type. And it's not always -1, 0, and 1. Therefore, you cannot save the value in a variable of a smaller type: If there is a loss of higher bits, the lower ones can be zero. A similar mistake caused several vulnerabilities in MySQL/MariaDB.
Conclusion
As was already mentioned, Linux is a very high quality and well tested project. To find a bug, even the most insignificant one - is already good enough reason to be proud. It's also grounds to wonder how many errors could be found before debugging and testing: the static analyzer is especially useful in this regard. You can see this by trying PVS-Studio. You may get a trial version of the Linux analyzer if you e-mail us. In case you have a non-commercial project, you can use PVS-Studio for free: just read this article, and use our open and free utility how-to-use-pvs-studio-free.
No comments:
Post a Comment