Friday, August 26, 2016

Of Evil Accidentally Summoned by a Sorcerer's Disciples

Studying programming languages does take time and effort. But you can't avoid the thorny path if you are eager to thoroughly master the language, its principles, means and intricacies. C++ is no exception, and, moreover, is quite a representative example. There are numbers of nuances and subtleties about it that a programmer must know and keep in mind. But, as I've already said, you need time and practice.
Picture 1
Students take a bit different path in studying anything (including programming languages among other stuff). It's a frequent thing when they try to pick knowledge "in passing" because they are constantly short of time or think the material is not relevant for them or just because they are lazy. Sometimes it results in really funny incidents - and that's what we are going to talk about in this article.
Take some rest and enjoy reading about students' slip-ups.

Getting down to business

As material for this article, I've used files uploaded to Pastebin.com. Those are usually students' lab work tasks. There are mistakes to be found there, of course. We are going to discuss simply funny (in a certain way, I mean) code fragments. That is, the article is both entertaining for experienced programmers and educating for newcomers.
For our analysis, we used the PVS-Studio static code analyzer, so most code fragments will be accompanied by the quotations of the analyzer's diagnostic messages.

"Your porridge, Sir!"

Loops, loops, loops...

OK, enough talking, let's get down to business.
Take a look at the following piece of code:
void stampa_triangolo_rettangolo (int n)
{
  for (int i=0; i<n, i++;)
  {
    for (int j=0; j<n, j++;)
    {
      if (j<i)
        cout<<"  ";
      else
        cout<<"* ";
    }
    cout<<endl;
  }
}
PVS-Studio's diagnostic message: V521 Such expressions using the ',' operator are dangerous. Make sure the expression 'i < n, i ++' is correct. bllbrbpw.cpp 18
Have you noticed the catch? Fine. If no, let me explain - it's a simple one. For the loop to work right, you need to split the check operations into a loop termination condition and an increment. In this code, everything is messed up. The construct used by the author is syntactically correct but in fact no single loop iteration will be executed. The reason is that the loop termination condition is expressed by the 'i++' statement instead of 'i<n', as it should have been. Since the 'for' loop executes as long as the termination condition holds true, no iteration will be executed in this code because the condition proves false at the very first check.
It would have been much more interesting if the 'i' variable had initially equaled 1, or a preincrement operation (++i) had been used. Then the loop would have executed until the 'i' variable became equal to 0 (i.e. 'i' would have had to "walk through" the entire range of this type - both positive and negative).
The next funny sample:
int main()
{
  ....
  for (i = 0; i < 255; i++);
  {
    if (eldertext[i] = 'a'){}
  }
  ....
}
PVS-Studio's diagnostic message: V529 Odd semicolon ';' after 'for' operator. ryci4ba3.cpp 11
There are even 2 interesting issues here at once:
  • The loop. It will successfully walk through all the necessary iterations, but all in vain. The reason is the semicolon put in a wrong place. But even if it were put right, that wouldn't help solve the issue.
  • The condition. It has an assignment instead of comparison. And an empty body. No comment.
Let's go on:
int main()
{
  int i, j;
  ....
  for (i = 0; i < 4; i++)
  {
    for (j = 0; j < 5; i++)
    {
      scanf_s("\n%f", A[i][j]);
    }
    scanf_s("\n");
  };
  ....
}
PVS-Studio's diagnostic message: V533 It is likely that a wrong variable is being incremented inside the 'for' operator. Consider reviewing 'i'. fdpxaytc.cpp 16
Let's ignore the semicolon put for some unknown reason after the closing parenthesis in the end of the loop body and take a look at the nested loop instead. It's obvious that it is infinite and the error was caused by a typo. Instead of the 'j' variable, it is the 'i' variable which is incremented. It results in the 'j<5' condition never being fulfilled. There were a few other instances of this issue in the same file.
Going on with infinite loops:
Documento Escritorio::retiraDoc(string user1, string titulo1)
{
  ....
  unsigned int count = 0;
  ....
  while (count >= 0)
  { 
    it->retiraDoc();
    count--;
  }
  ....
}
PVS-Studio's diagnostic message: V547 Expression 'count >= 0' is always true. Unsigned type value is always >= 0. 5hyhalvq.cpp 34
It's doesn't even matter in this code whether or not the value of 'count' changes. To understand what the error is all about, look at the type of this variable - unsigned int. That is, the 'count' variable cannot be negative, therefore, when trying to decrement it when it equals 0, it will simply take the largest value possible. The result is an infinite loop.
And here's an opposite example - a loop that will never iterate even once:
Matrix()
{
  N = 0;
  matrix = new double*[N];
  for (int i = 0; i < N; i++)
  {
    matrix[i] = new double[N];
  }
}
PVS-Studio's diagnostic message: V621 Consider inspecting the 'for' operator. It's possible that the loop will be executed incorrectly or won't be executed at all. 6tx445ic.cpp 22
We are dealing with an interesting implementation of the default constructor. It surely must take one some effort to invent such a thing.

Pointers and memory handing

We are approaching the next mine field - pointers. It's quite an unpleasant subject for those studying the language "in passing". Take a look at a couple examples:
int main(....)
{
  ....
  int* p = 0;
  *p = 90;
  ....
}
PVS-Studio's diagnostic message: V522 Dereferencing of the null pointer 'p' might take place. 4ycv0zvb.cpp 10
I don't think you need any comments on this. A null pointer is created and as soon as in the next line it is dereferenced for the sake of a value writing attempt. Please don't do that. You can allocate dynamic memory, work with variables addresses - whatever; but please forget about this stuff. Sincerely yours, Captain Obvious.
Another example, a little more interesting one:
int main() 
{
  Test * t = nullptr;
  t -> hello(); 
  return 0;
}
PVS-Studio's diagnostic message: V522 Dereferencing of the null pointer 't' might take place. fafhwx5g.cpp 13
To make the whole picture clearer, here you are the declaration of the 'Test' class.
class Test 
{
  public:
    static void hello() 
    {
      std::cout << "Hello World!" << std::endl;   
    }
};
By the way, these code lines are the entire file. Quite a non-trivial way of getting the trivial 'Hello world!'
Memory handling errors are also frequent in students' works. Please remember that memory allocated by 'new' should be freed by 'delete', and for 'new[]' use 'delete[]'. Here is a code fragment where this rule was neglected:
char *getline()
{
  ....
  char * mtmp = new char[SIZE];
  ....
  delete mtmp;
  ....
}
PVS-Studio's diagnostic message: V611 The memory was allocated using 'new T[]' operator but was released using the 'delete' operator. Consider inspecting this code. It's probably better to use 'delete [] mtmp;'. mzxijddc.cpp 40
As you can see, memory is allocated with the help of the 'new[]' operator but freed with the help of the 'delete' operator, which causes undefined behavior. Notice that the 'new[]' and 'delete[]' operators are used in the right combination elsewhere in this file, which indicates that we are dealing with a typo in this particular case. C++ is the language where you should always be on the alert.
And here's an example of memory leak:
main()
{
  ....
  int *A=new int[n],*B=new int[n],t;
  ....
  delete[]  A,B;
  ....
}
PVS-Studio's diagnostic message: V680 The "delete A, B" expression only destroys the 'A' object. Then the ',' operator returns a resulting value from the right side of the expression. kdnaggkc.cpp 45
It's all clear from the message description - only the 'A' array will be deleted as the comma operator (',') is used here. That is, the delete line is equivalent to the following code:
(delete[] A), B;
A correct delete operation should look like this:
delete[] A;
delete[] B;
Incorrect usage of operators results in a memory leak. How large this leak will be depends on the size of the B array.
Here is an example of a potentially dangerous use of the 'realloc()' function:
Matrix& operator+ (Matrix& a, Matrix& b)
{
  ....
  res.matrix = (double**)realloc(res.matrix,sizeof(double*)*b.m);
  ....
}
PVS-Studio's diagnostic message: V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'res.matrix' is lost. Consider assigning realloc() to a temporary pointer. 7d7bnatk.cpp 79
Of course, there's a lot of other stuff to nag at besides 'realloc()' in this fragment... But we are not talking of this other stuff. You see, the function result is saved into the same variable where the address of the allocated memory block was being stored previously. If the memory block cannot be allocated, even if the data are moved, the 'realloc()' function will return a null pointer that will be written into the variable used previously to store the address of the allocated memory block. This is that very danger that may cause the memory leak. To avoid troubles like this, one should store a function result in a different variable.
Here's an example of checking 'this' for a null pointer:
struct AVLNode 
{
  ....
  int getHeight() 
  {
    return this == 0 ? 0 : height;
  }
  ....
};
PVS-Studio's diagnostic message: V704 'this == 0' expression should be avoided - this expression is always false on newer compilers, because 'this' pointer can never be NULL. ltxs3ntd.cpp 25
I could say a lot more about checking 'this' for being a null pointer but instead I recommend you to see some articles discussing this issue in detail: the first onethe second one.

Other bugs

Another example:
INT OutputArray(....)
{
  ....
  if (output[i + j] == 0x0D)
  {
    j = j;
  }
  ....
};
PVS-Studio's diagnostic message: V570 The 'j' variable is assigned to itself. chrmwjm9.cpp 277
Don't look at the semicolon after the function, let's check the branch of the 'if' operator instead. As you can see, the 'j' variable is assigned to itself. Most likely, there is a typo here and there must be the 'i' variable to the left or to the right of the '=' operator. Even if this issue doesn't lead to incorrect behavior, things like that must be avoided.
Picture 2
And as for the following function, I don't even feel like commenting on it. I guess its name ('fun') reflects what it is all about pretty well. Here's this "funny" code:
int fun(int p, int q)
{
    int a, b, s;
    s = 0;
    if( p <  0 ) 
    goto a1;
    if( p == 0 ) 
    goto a2;
    if( p >  0 ) 
    goto a3;
  a1: a = -p;
    goto a4;
  a2: a =  0;
    goto a4;
  a3: a = +p;
    goto a4;
  a4: p = a;
    if( q <  0 ) 
    goto b1;
    if( q == 0 ) 
    goto b2;
    if( q >  0 ) 
    goto b3;
  b1: b = -q;
    goto b4;
  b2: b =  0;
    goto b4;
  b3: b = +q;
    goto b4;
  b4: q = b;
  c1: if( a == 0 ) 
    goto c2;
    p = a / 10;
    p = p * 10;
    p = a - p;
    s = s + p;
    a = a / 10;
  c2: a = a;
    if( b == 0 ) 
    goto c3;
    q = b / 10;
    q = q * 10;
    q = b - q;
    s = s - q;
    b = b / 10;
  c3: b = b;
    if( a ) 
    goto c1;
    if( b ) 
    goto c1;
    return 
    s != 0;
}
And another nice piece:
int main() 
{
  ....
  char valinta = '1'; '2'; '3';
  ....
}
PVS-Studio's diagnostic message: V606 Ownerless token ''2''. l8xzvux7.cpp 12
The error is transparent. The question is, how could it be possible for anyone to make such a typo at all (though it doesn't look quite like that, really), or how did the author intend to use this code? I don't know.
Here is another similar fragment but this time the typo is obvious:
bool operator < (const Key &lhs, const Key &rhs)
{
  if(....) 
    return true;
  else if(....) 
    return true;
  else if(....) 
    return true;
  else false;
}
PVS-Studio's diagnostic message: V606 Ownerless token 'false'. 662eljcq.cpp 31
The error is actually similar to the previous one, but the typo is clear and trivial in this case ('return' missing before 'false').
Not once I came across code fragments like this:
int main (void)
{
  int a;
  short b;
  long c;
  printf("Ausgabe der Speicheradressen:");
  printf("\n----------------------------:");
  printf("\n\nVariable 1(d): %d", &a);
  printf("\n\nVariable 1(p): %p", a);
  printf("\nVariable 2(d):  %d", &b);
  printf("\nVariable 2(p):  %p", b);
  printf("\nVariable 3(d):  %d", &c);
  printf("\nVariable 3(p):  %p", c);
  printf("\n\n");
  system("pause");
}
One example of PVS-Studio's diagnostic messages: V576 Incorrect format. Consider checking the second actual argument of the 'printf' function. The pointer is expected as an argument. j38r7dqb.cpp 16
The error is about discrepancy between the format string and actual arguments passed into the function. It results in the program's undefined behavior - for example printing some meaningless values.

Conclusion

These are of course not all the mistakes from the files we have analyzed yet they are probably the most interesting ones. I hope you have learned something new from this article and expanded your knowledge - as they say, "you learn as long as you live".

No comments:

Post a Comment