Monday, May 22, 2017

The Evil within the Comparison Functions

The Evil within the Comparison Functions

Perhaps, readers remember my article titled "Last line effect". It describes a pattern I've once noticed: in most cases programmers make an error in the last line of similar text blocks. Now I want to tell you about a new interesting observation. It turns out that programmers tend to make mistakes in functions comparing two objects. This statement looks implausible; however, I'll show you a great number of examples of errors that may be shocking to a reader. So, here is a new research, it will be quite amusing and scary.

Picture 2

Problematics

Here is my statement: programmers quite often make mistakes in rather simple functions that are meant to compare two objects. This claim is based on the experience of our team in checking a large number of open source projects in C, C++ and C#.

The functions we are going to consider here are IsEqual, Equals, Compare, AreEqual and so on or overloaded operators as ==, !=.

I noticed that when writing articles, very often I come across errors related to the comparison functions. I decided to explore this question in detail and examined the base of errors we found. I did a search of functions throughout the base containing words Cmp, Equal, Compare and such. The result was very impressive and shocking.

In fact this story is similar to the one we had when writing the article "Last line effect". Similarly, I noticed an anomaly and decided to explore it more carefully. Unfortunately, unlike the aforementioned article, I don't know how to bring statistics here and which figures to provide. Perhaps, later I'll come up with a solution with the statistics. At this point I am guided by intuition and can only share my feelings. They see that there are a lot of errors in the comparison functions and I am sure, you will get the same feeling when you see that huge amount of truly impressive examples.

Psychology

For a moment let's go back to the article "Last line effect". By the way, if you haven't read it, I suggest taking a break and looking at it. There is a more detailed analysis of this topic: "The last line effect explained"

In general, we can conclude that the cause of the errors in the last lined is related to the fact that the developer has already mentally moved to the new lines/tasks instead of focusing on the completion of the current fragment. As a result - when writing similar blocks of text, there is a higher probability that a programmer will make an error in the last one.

I believe that in the case of writing a comparison function, a developer in general often don't focus on it, considering it to be too trivial. In other words, he writes the code automatically, without thinking over it. Otherwise, it is not clear how one can make an error like this:

bool IsLuidsEqual(LUID luid1, LUID luid2)
{
  return (luid1.LowPart == luid2.LowPart) &&
         (luid2.HighPart == luid2.HighPart);
}

PVS-Studio analyzer detected this error in the code of RunAsAdmin Explorer Shim (C++) project: V501 There are identical sub-expressions to the left and to the right of the '==' operator: luid2.HighPart == luid2.HighPart RAACommon raacommonfuncs.cpp 1511

A typo. In the second line it should be: luid1.HighPart == luid2.HighPart.

The code is very simple. Apparently, the simplicity of code spoils everything. A programmer immediately thinks of the task to write such a function as standard and uninteresting. He instantly thinks of the way to write the function and he has just to implement the code. This is a routine, but unfortunately an inevitable process to start writing more important, complex and interesting code. He is already thinking about the new task... and as a result - makes an error.

In addition, programmers rarely write unit tests for such functions. Again the simplicity of these functions prevents from it. It seems that it would be too much to test them, as these functions are simple and repetitive. A person has written hundreds of such functions in his life, can he make an error in another function? Yes, he can and he does.

I would also like to note that we aren't talking about code of students who are just learning to program. We are talking about bugs in the code of such projects as GCC, Qt, GDB, LibreOffice, Unreal Engine, CryEngine 4 V Chromium, MongoDB, Oracle VM Virtual Box, FreeBSD, WinMerge, the CoreCLR, MySQL, Mono, CoreFX, Roslyn, MSBuild, etc. It's all very serious.

We are going to have a look at so many diverse examples that it would be scary to sleep at night.

Erroneous Patterns in Comparison Functions

All errors in comparison functions will be divided into several patterns. In the article we'll be talking about errors in projects in C, C++ and C#, but it makes no sense to separate these languages, as most of the patterns are similar for different languages.

Pattern: A < B, B > A

Very often in the comparison functions there is a need to make such checks:

  • A < B
  • A > B

Sometimes programmers think that is more elegant to use the same operator <, but to switch the variables.

  • A < B
  • B < A

However, due to the inattentiveness, we get such checks:

  • A < B
  • B > A

In fact, one and the same comparison is done twice here. Perhaps, it's not clear what it is about here, but we'll get to the practical examples and it'll all become clearer.

string _server;
....
bool operator<( const ServerAndQuery& other ) const {
  if ( ! _orderObject.isEmpty() )
    return _orderObject.woCompare( other._orderObject ) < 0;

  if ( _server < other._server )
    return true;
  if ( other._server > _server )
    return false;
  return _extra.woCompare( other._extra ) < 0;
}

PVS-Studio analyzer detected this error in the code of MongoDB (C++): V581 The conditional expressions of the 'if' operators situated alongside each other are identical. Check lines: 44, 46. parallel.h 46

This condition:

if ( other._server > _server )

Will always be false, as the same check was done two lines before. Correct code variant:

if ( _server < other._server )
  return true; 
if ( other._server < _server )
  return false;

This error was detected in the code of Chromium project (C++):

enum ContentSettingsType;
struct EntryMapKey {
  ContentSettingsType content_type;
  ...
};

bool OriginIdentifierValueMap::EntryMapKey::operator<(
    const OriginIdentifierValueMap::EntryMapKey& other) const {
  if (content_type < other.content_type)
    return true;
  else if (other.content_type > content_type)
    return false;
  return (resource_identifier < other.resource_identifier);
}

PVS-Studio warning: V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 61, 63. browser content_settings_origin_identifier_value_map.cc 61

That was a C++ example, now it's C# turn. The next error was found in the code of IronPython and IronRuby (C#).

public static int Compare(SourceLocation left,
                          SourceLocation right) {
  if (left < right) return -1;
  if (right > left) return 1;
  return 0;
}

PVS-Studio warning (C#): V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless. SourceLocation.cs 156

I think there is no need in explanation.

Note. For C# there was just one example of an error, but for C++ - two. In general, there will be less bugs in the C# code, than for C/C++. But I do not recommend rushing to the conclusion that C# is much safer. The thing is that PVS-Studio analyzer has only recently learned to check C# code relatively recently, and we have just checked less projects written in C#, than in C and C++.

Pattern: a Member of the Class is Compared with itself

The comparison functions usually consist of successive comparisons of structure/class members. This code tends to be more erronreous, when the member of the class starts being compared with itself. I can specify two subtypes of errors.

In the first case, a programmer forgets to specify the name of the object and writes in the following way:

return m_x == foo.m_x &&
       m_y == m_y &&            // <=
       m_z == foo.m_z;
In the second case, the same name of the object is written.
return zzz.m_x == foo.m_x &&
       zzz.m_y == zzz.m_y &&    // <=
       zzz.m_z == foo.m_z;

Let's take a closer look at practical examples of this pattern. Pay attention that incorrect comparison often occurs in the last block of similar code blocks, which reminds us of the "last line effect" again.

The error is found in the code of Unreal Engine 4 (C++) project:

bool
Compare(const FPooledRenderTargetDesc& rhs, bool bExact) const
{
  ....
  return Extent == rhs.Extent
    && Depth == rhs.Depth
    && bIsArray == rhs.bIsArray
    && ArraySize == rhs.ArraySize
    && NumMips == rhs.NumMips
    && NumSamples == rhs.NumSamples
    && Format == rhs.Format
    && LhsFlags == RhsFlags
    && TargetableFlags == rhs.TargetableFlags
    && bForceSeparateTargetAndShaderResource ==
         rhs.bForceSeparateTargetAndShaderResource
    && ClearValue == rhs.ClearValue
    && AutoWritable == AutoWritable;           // <=
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '==' operator: AutoWritable == AutoWritable rendererinterface.h 180

The code of Samba (C) project:

static int compare_procids(const void *p1, const void *p2)
{
  const struct server_id *i1 = (struct server_id *)p1;
  const struct server_id *i2 = (struct server_id *)p2;

  if (i1->pid < i2->pid) return -1;
  if (i2->pid > i2->pid) return 1;
  return 0;
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '>' operator: i2->pid > i2->pid brlock.c 1901

The code of MongoDB (C++) project:

bool operator==(const MemberCfg& r) const {
  ....
  return _id==r._id && votes == r.votes &&
         h == r.h && priority == r.priority &&
         arbiterOnly == r.arbiterOnly &&
         slaveDelay == r.slaveDelay &&
         hidden == r.hidden &&
         buildIndexes == buildIndexes;        // <=
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '==' operator: buildIndexes == buildIndexes rs_config.h 101

The code of Geant4 Software (C++) project:

inline G4bool G4FermiIntegerPartition::
operator==(const G4FermiIntegerPartition& right)
{
  return (total == right.total &&
          enableNull == enableNull &&          // <=
          partition == right.partition);
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '==' operator: enableNull == enableNull G4hadronic_deex_fermi_breakup g4fermiintegerpartition.icc 58

The code of LibreOffice (C++) project:

class SvgGradientEntry
{
  ....
  bool operator==(const SvgGradientEntry& rCompare) const
  {
    return (getOffset() == rCompare.getOffset()
           && getColor() == getColor()            // <=
           && getOpacity() == getOpacity());      // <=
  }
  ....
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '==' operator: getColor() == getColor() svggradientprimitive2d.hxx 61

The code of Chromium (C++) project:

bool FileIOTest::MatchesResult(const TestStep& a,
                               const TestStep& b) {
  ....
  return (a.data_size == a.data_size &&             // <=
          std::equal(a.data, a.data + a.data_size, b.data));
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '==' operator: a.data_size == a.data_size cdm_file_io_test.cc 367

The code of FreeCAD (C++) project:

bool FaceTypedBSpline::isEqual(const TopoDS_Face &faceOne,
                               const TopoDS_Face &faceTwo) const
{
  ....
  if (surfaceOne->IsURational() != 
      surfaceTwo->IsURational())
    return false;
  if (surfaceTwo->IsVRational() !=         // <= 
      surfaceTwo->IsVRational())           // <=
    return false;
  if (surfaceOne->IsUPeriodic() != 
      surfaceTwo->IsUPeriodic())
    return false;
  if (surfaceOne->IsVPeriodic() != 
      surfaceTwo->IsVPeriodic())
    return false;
  if (surfaceOne->IsUClosed() != 
      surfaceTwo->IsUClosed())
    return false;
  if (surfaceOne->IsVClosed() != 
      surfaceTwo->IsVClosed())
    return false;
  if (surfaceOne->UDegree() != 
      surfaceTwo->UDegree())
    return false;
  if (surfaceOne->VDegree() != 
      surfaceTwo->VDegree())
    return false;
  ....
}

PVS-Studio warning: V501 There are identical sub-expressions 'surfaceTwo->IsVRational()' to the left and to the right of the '!=' operator. modelrefine.cpp 780

The code of Serious Engine (C++) project:

class CTexParams {
public:

  inline BOOL IsEqual( CTexParams tp) {
    return tp_iFilter     == tp.tp_iFilter &&
           tp_iAnisotropy == tp_iAnisotropy &&             // <=
           tp_eWrapU      == tp.tp_eWrapU &&
           tp_eWrapV      == tp.tp_eWrapV; };
  ....
};

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '==' operator: tp_iAnisotropy == tp_iAnisotropy gfx_wrapper.h 180

The code of Qt (C++) project:

inline bool qCompare(QImage const &t1, QImage const &t2, ....)
{
  ....
  if (t1.width() != t2.width() || t2.height() != t2.height()) {
  ....
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '!=' operator: t2.height() != t2.height() qtest_gui.h 101

The code of FreeBSD (C) project:

static int
compare_sh(const void *_a, const void *_b)
{
  const struct ipfw_sopt_handler *a, *b;

  a = (const struct ipfw_sopt_handler *)_a;
  b = (const struct ipfw_sopt_handler *)_b;
  ....
  if ((uintptr_t)a->handler < (uintptr_t)b->handler)
    return (-1);
  else if ((uintptr_t)b->handler > (uintptr_t)b->handler) // <=
    return (1);
  
  return (0);
}

PVS-Studio warning: V501 There are identical sub-expressions '(uintptr_t) b->handler' to the left and to the right of the '>' operator. ip_fw_sockopt.c 2893

The code of Mono (C#) project:

static bool AreEqual (VisualStyleElement value1, 
                      VisualStyleElement value2)
{
  return
    value1.ClassName == value1.ClassName && // <=
    value1.Part == value2.Part &&
    value1.State == value2.State;
}

PVS-Studio warning: V3001 There are identical sub-expressions 'value1.ClassName' to the left and to the right of the '==' operator. ThemeVisualStyles.cs 2141

The code of Mono (C#) project:

public int ExactInference (TypeSpec u, TypeSpec v)
{
  ....
  var ac_u = (ArrayContainer) u;
  var ac_v = (ArrayContainer) v;
  ....
  var ga_u = u.TypeArguments;
  var ga_v = v.TypeArguments;
  ....
  if (u.TypeArguments.Length != u.TypeArguments.Length) // <=
    return 0;

  ....
}

PVS-Studio warning: V3001 There are identical sub-expressions 'u.TypeArguments.Length' to the left and to the right of the '!=' operator. generic.cs 3135

The code of MonoDevelop (C#) project:

Accessibility DeclaredAccessibility { get; }
bool IsStatic { get; }

private bool MembersMatch(ISymbol member1, ISymbol member2)
{
  if (member1.Kind != member2.Kind)
  {
    return false;
  }

  if (member1.DeclaredAccessibility !=          // <=1
      member1.DeclaredAccessibility             // <=1
   || member1.IsStatic != member1.IsStatic)     // <=2
  {
    return false;
  }

  if (member1.ExplicitInterfaceImplementations().Any() ||  
      member2.ExplicitInterfaceImplementations().Any())
  {
    return false;
  }

  return SignatureComparer
    .HaveSameSignatureAndConstraintsAndReturnTypeAndAccessors(
       member1, member2, this.IsCaseSensitive);
}

PVS-Studio warning: V3001 There are identical sub-expressions 'member1.IsStatic' to the left and to the right of the '!=' operator. CSharpBinding AbstractImplementInterfaceService.CodeAction.cs 545

The code of Haiku (C++) project:

int __CORTEX_NAMESPACE__ compareTypeAndID(....)
{
  int retValue = 0;
  ....
  if (lJack && rJack)
  {
    if (lJack->m_jackType < lJack->m_jackType)           // <=
    {
      return -1;
    }
    if (lJack->m_jackType == lJack->m_jackType)          // <=
    {
      if (lJack->m_index < rJack->m_index)
      {
        return -1;
      }
      else
      {
        return 1;
      }
    }
    else if (lJack->m_jackType > rJack->m_jackType)
    {
      retValue = 1;
    }
  }
  return retValue;
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '<' operator: lJack->m_jackType < lJack->m_jackType MediaJack.cpp 783

Just below there is exactly the same error. As I understand, in both cases a programmer forgot to replace lJack with rJack.

The code of CryEngine V (C++) project:

bool
CompareRotation(const Quat& q1, const Quat& q2, float epsilon)
{
  return (fabs_tpl(q1.v.x - q2.v.x) <= epsilon)
      && (fabs_tpl(q1.v.y - q2.v.y) <= epsilon)
      && (fabs_tpl(q2.v.z - q2.v.z) <= epsilon)     // <=
      && (fabs_tpl(q1.w - q2.w) <= epsilon);
}

PVS-Studio warning: V501 There are identical sub-expressions to the left and to the right of the '-' operator: q2.v.z - q2.v.z entitynode.cpp 93

Pattern: Evaluating the Size of a Pointer Instead of the Size of the Structure/Class

This type of error occurs in programs written in C and C++ and is caused by incorrect use of the sizeof operator. The error in evaluating not the size of the object, but the size of the pointer. Example:

T *a = foo1();
T *b = foo2();
x = memcmp(a, b, sizeof(a));

Instead of the size of the T structure, a size of the pointer gets evaluated. The size of the pointer depends on the used data model, but usually it is 4 or 8. As a result, more or less bites in the memory get compared than take the structure.

Correct variant of the code:

x = memcmp(a, b, sizeof(T));

or

x = memcmp(a, b, sizeof(*a));

Now let's move on to the practical part. Here is how such a bug looks in the code of CryEngine V (C++) code:

bool
operator==(const SComputePipelineStateDescription& other) const
{
  return 0 == memcmp(this, &other, sizeof(this));
}

PVS-Studio warning: V579 The memcmp function receives the pointer and its size as arguments. It is possibly a mistake. Inspect the third argument. graphicspipelinestateset.h 58

The code of Unreal Engine 4 project (C++):

bool FRecastQueryFilter::IsEqual(
  const INavigationQueryFilterInterface* Other) const
{
  // @NOTE: not type safe, should be changed when
  // another filter type is introduced
  return FMemory::Memcmp(this, Other, sizeof(this)) == 0;

}

PVS-Studio warning: V579 The Memcmp function receives the pointer and its size as arguments. It is possibly a mistake. Inspect the third argument. pimplrecastnavmesh.cpp 172

Pattern: Repetitive Arguments of Cmp(A, A) Type

Comparison functions usually call other comparison functions. At the same time one of the possible errors is that the reference/pointer is passed to the same object twice. Example:

x = memcmp(A, A, sizeof(T));

Here the object A will be compared with itself, which, is of course, has no sense.

We'll start with an error, found in the debugger GDB (C):

static int
psymbol_compare (const void *addr1, const void *addr2,
                 int length)
{
  struct partial_symbol *sym1 = (struct partial_symbol *) addr1;
  struct partial_symbol *sym2 = (struct partial_symbol *) addr2;

  return (memcmp (&sym1->ginfo.value, &sym1->ginfo.value,    // <=
                  sizeof (sym1->ginfo.value)) == 0
          && sym1->ginfo.language == sym2->ginfo.language
          && PSYMBOL_DOMAIN (sym1) == PSYMBOL_DOMAIN (sym2)
          && PSYMBOL_CLASS (sym1) == PSYMBOL_CLASS (sym2)
          && sym1->ginfo.name == sym2->ginfo.name);
}

PVS-Studio warning: V549 The first argument of 'memcmp' function is equal to the second argument. psymtab.c 1580

The code of CryEngineSDK project (C++):

inline bool operator != (const SEfResTexture &m) const
{
  if (stricmp(m_Name.c_str(), m_Name.c_str()) != 0 ||   // <=
      m_TexFlags != m.m_TexFlags || 
      m_bUTile != m.m_bUTile ||
      m_bVTile != m.m_bVTile ||
      m_Filter != m.m_Filter ||
      m_Ext != m.m_Ext ||
      m_Sampler != m.m_Sampler)
    return true;
  return false;
}

PVS-Studio warning: V549 The first argument of 'stricmp' function is equal to the second argument. ishader.h 2089

The code of PascalABC.NET (C#):

private List<string> enum_consts = new List<string>();
public override bool IsEqual(SymScope ts)
{
  EnumScope es = ts as EnumScope;
  if (es == null) return false;
  if (enum_consts.Count != es.enum_consts.Count) return false;
  for (int i = 0; i < es.enum_consts.Count; i++)
    if (string.Compare(enum_consts[i],
                       this.enum_consts[i], true) != 0)
      return false;
  return true;
}

PVS-Studio warning: V3038 The 'enum_consts[i]' argument was passed to 'Compare' method several times. It is possible that other argument should be passed instead. CodeCompletion SymTable.cs 2206

I'll give some explanation here. The error in the factual arguments of the Compare function:

string.Compare(enum_consts[i], this.enum_consts[i], true)

The thing is that enum_consts[i] and this.enum_consts[i are the same things. As I understand, a correct call should be like this:

string.Compare(es.enum_consts[i], this.enum_consts[i], true)

or

string.Compare(enum_consts[i], es.enum_consts[i], true)

Pattern: Repetitive Checks A==B && A==B

Quite a common error in programming is when the same check is done twice. Example:

return A == B &&
       C == D &&   // <=
       C == D &&   // <=
       E == F;

Two variants are possible in this case. The first is quite harmless: one comparison is redundant and can be simply removed. The second is worse: some other variables were to be compared, but a programmer made a typo.

In any case, such code deserves close attention. Let me scare you a little more, and show that this error can be found even in the code of GCC compiler (C):

static bool
dw_val_equal_p (dw_val_node *a, dw_val_node *b)
{
  ....
  case dw_val_class_vms_delta:
    return (!strcmp (a->v.val_vms_delta.lbl1,
                     b->v.val_vms_delta.lbl1)
            && !strcmp (a->v.val_vms_delta.lbl1,
                        b->v.val_vms_delta.lbl1));
  ....
}

PVS-Studio warning: V501 There are identical sub-expressions '!strcmp(a->v.val_vms_delta.lbl1, b->v.val_vms_delta.lbl1)' to the left and to the right of the '&&' operator. dwarf2out.c 1428

The function strcmp is called twice with the same set of arguments.

The code of Unreal Engine 4 project (C++):

FORCEINLINE
bool operator==(const FShapedGlyphEntryKey& Other) const
{
  return FontFace == Other.FontFace 
    && GlyphIndex == Other.GlyphIndex   // <=
    && FontSize == Other.FontSize
    && FontScale == Other.FontScale
    && GlyphIndex == Other.GlyphIndex;  // <=
}

PVS-Studio warning: V501 There are identical sub-expressions 'GlyphIndex == Other.GlyphIndex' to the left and to the right of the '&&' operator. fontcache.h 139

The code of Serious Engine project (C++):

inline BOOL CValuesForPrimitive::operator==(....)
{
  return (
 (....) &&
 (vfp_ptPrimitiveType == vfpToCompare.vfp_ptPrimitiveType) &&
 ....
 (vfp_ptPrimitiveType == vfpToCompare.vfp_ptPrimitiveType) &&
 ....
);

PVS-Studio warning: V501 There are identical sub-expressions '(vfp_ptPrimitiveType == vfpToCompare.vfp_ptPrimitiveType)' to the left and to the right of the '&&' operator. worldeditor.h 580

The code of Oracle VM Virtual Box project (C++):

typedef struct SCMDIFFSTATE
{
  ....
  bool  fIgnoreTrailingWhite;
  bool  fIgnoreLeadingWhite;
  ....
} SCMDIFFSTATE;
/* Pointer to a diff state. */

typedef SCMDIFFSTATE *PSCMDIFFSTATE;

/* Compare two lines */
DECLINLINE(bool) scmDiffCompare(PSCMDIFFSTATE pState, ....)
{
  ....
  if (pState->fIgnoreTrailingWhite    // <=
   || pState->fIgnoreTrailingWhite)   // <=
    return scmDiffCompareSlow(....);
  ....
}

PVS-Studio warning: V501 There are identical sub-expressions 'pState->fIgnoreTrailingWhite' to the left and to the right of the '||' operator. scmdiff.cpp 238

Pattern: Incorrect Use of the Value, Returned by memcmp Function

The memcmp function returns the following values of int type:

  • < 0 - buf1 less than buf2;
  • 0 - buf1 identical to buf2;
  • > 0 - buf1 greater than buf2;

Please note that '>0' can be any number, not only 1. These numbers can be: 2, 3, 100, 256, 1024, 5555, 65536 and so on. This means that this result cannot be placed to a variable of the char and short type. The high bits can be lost, which might violate the logic of program execution.

Also this means that the result cannot be compared with constants 1 or -1. In other words, it is wrong to write this:

if (memcmp(a, b, sizeof(T)) == 1)
if (memcmp(x, y, sizeof(T)) == -1)

Correct comparisons:

if (memcmp(a, b, sizeof(T)) > 0)
if (memcmp(a, b, sizeof(T)) < 0)

The danger of this code is that it may successfully work for a long time. The errors may start showing up when moving to a new platform or with the change of the compiler version.

The code of ReactOS project (C++):

HRESULT WINAPI CRecycleBin::CompareIDs(....)
{
  ....
  return MAKE_HRESULT(SEVERITY_SUCCESS, 0,
   (unsigned short)memcmp(pidl1->mkid.abID,
                          pidl2->mkid.abID,
                          pidl1->mkid.cb));
}

PVS-Studio warning: V642 Saving the 'memcmp' function result inside the 'unsigned short' type variable is inappropriate. The significant bits could be lost breaking the program's logic. recyclebin.cpp 542

The code of Firebird project (C++):

SSHORT TextType::compare(ULONG len1, const UCHAR* str1,
ULONG len2, const UCHAR* str2)
{
  ....
  SSHORT cmp = memcmp(str1, str2, MIN(len1, len2));

  if (cmp == 0)
    cmp = (len1 < len2 ? -1 : (len1 > len2 ? 1 : 0));
  return cmp;
}

PVS-Studio warning: V642 Saving the 'memcmp' function result inside the 'short' type variable is inappropriate. The significant bits could be lost breaking the program's logic. texttype.cpp 338

The code of CoreCLR project (C++):

bool operator( )(const GUID& _Key1, const GUID& _Key2) const
  { return memcmp(&_Key1, &_Key2, sizeof(GUID)) == -1; }

PVS-Studio warning: V698 Expression 'memcmp(....) == -1' is incorrect. This function can return not only the value '-1', but any negative value. Consider using 'memcmp(....) < 0' instead. sos util.cpp 142

The code of OpenToonz project (C++):

bool TFilePath::operator<(const TFilePath &fp) const
{
  ....
  char differ;
  differ = _wcsicmp(iName.c_str(), jName.c_str());
  if (differ != 0)
    return differ < 0 ? true : false;
  ....
}

PVS-Studio warning: V642 Saving the '_wcsicmp' function result inside the 'char' type variable is inappropriate. The significant bits could be lost, breaking the program's logic. tfilepath.cpp 328

Pattern: Incorrect Check of Null References

This error pattern is typical for C# programs. Sometimes in the comparison functions programmers write the type casting with the help of the as operator. The error is that inadvertently a programmer verifies against null not the new reference, but the original one. Let's take a look at a synthetic example:

ChildT foo = obj as ChildT;
if (obj == null)
  return false;
if (foo.zzz()) {}

The check if (obj == null) protects from the situation, if the obj variable contains a null reference. However, there is no protection from the case if it turns out that the as operator returns a null reference. The correct code should be like this:

ChildT foo = obj as ChildT;
if (foo == null)
  return false;
if (foo.zzz()) {}

Typically, this error occurs due to negligence of the programmer. Similar bugs are possible in the programs in C and C++, but I haven't found such a case in our error base.

The code of MonoDevelop project (C#):

public override bool Equals (object o)
{
  SolutionItemReference sr = o as SolutionItemReference;
  if (o == null)
    return false;
  return (path == sr.path) && (id == sr.id);
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'o', 'sr'. MonoDevelop.Core SolutionItemReference.cs 81

The code of CoreFX (C#):

public override bool Equals(object comparand)
{
  CredentialHostKey comparedCredentialKey =
                                  comparand as CredentialHostKey;

  if (comparand == null)
  {
    // This covers also the compared == null case
    return false;
  }

  bool equals = string.Equals(AuthenticationType,
        comparedCredentialKey.AuthenticationType, ....
  ....
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'comparand', 'comparedCredentialKey'. CredentialCache.cs 4007

The code of Roslyn project (C#):

public override bool Equals(object obj)
{
  var d = obj as DiagnosticDescription;

  if (obj == null)
    return false;
    
  if (!_code.Equals(d._code))
    return false;
  ....
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'obj', 'd'. DiagnosticDescription.cs 201

The code of Roslyn (C#):

protected override bool AreEqual(object other)
{
  var otherResourceString = other as LocalizableResourceString;
  return
    other != null &&
    _nameOfLocalizableResource == 
      otherResourceString._nameOfLocalizableResource &&
    _resourceManager == otherResourceString._resourceManager &&
    _resourceSource == otherResourceString._resourceSource &&
    ....
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'other', 'otherResourceString'. LocalizableResourceString.cs 121

The code of MSBuild project (C#):

public override bool Equals(object obj)
{
   AssemblyNameExtension name = obj as AssemblyNameExtension;
   if (obj == null)  // <=
   {
     return false;
   }
   ....
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'obj', 'name'. AssemblyRemapping.cs 64

The code of Mono project (C#):

public override bool Equals (object o)
{
  UrlMembershipCondition umc = (o as UrlMembershipCondition);
  if (o == null)                                      // <=
    return false;

  ....

  return (String.Compare (u, 0, umc.Url, ....) == 0); // <=
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'o', 'umc'. UrlMembershipCondition.cs 111

The code of Media Portal 2 project (C#):

public override bool Equals(object obj)
{
  EpisodeInfo other = obj as EpisodeInfo;
  if (obj == null) return false;
  if (TvdbId > 0 && other.TvdbId > 0)
    return TvdbId == other.TvdbId;
  ....
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'obj', 'other'. EpisodeInfo.cs 560

The code of NASA World Wind project (C#):

public int CompareTo(object obj)
{
  RenderableObject robj = obj as RenderableObject;
  if(obj == null)                                 // <=
    return 1;
  return this.m_renderPriority.CompareTo(robj.RenderPriority);
}

PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'obj', 'robj'. RenderableObject.cs 199

Pattern: Incorrect Loops

In some functions, collections of items are compared. Of course, different variant of the loops are used for its comparison. If a programmer writes the code inattentively, it's easy to mix something up, as it is with the comparison functions. Let's look at a few of these situations.

The code of Trans-Proteomic Pipeline (C++):

bool Peptide::operator==(Peptide& p) {
  ....
  for (i = 0, j = 0;
       i < this->stripped.length(), j < p.stripped.length();
       i++, j++) { 
  ....
}

PVS-Studio warning: V521 Such expressions using the ',' operator are dangerous. Make sure the expression is correct. tpplib peptide.cpp 191

Note that the comma operator is used in the condition. The code is clearly incorrect, because the condition, written to the left of the coma is ignored. That is, the condition on the left is evaluated, but its result is not used in any way.

The code of Qt project (C++):

bool equals( class1* val1, class2* val2 ) const
{
  ...
  size_t size = val1->size();
  ...
  while ( --size >= 0 ){
    if ( !comp(*itr1,*itr2) )
      return false;
    itr1++;
    itr2++;
  }
  ...
}

PVS-Studio warning: V547 Expression '-- size >= 0' is always true. Unsigned type value is always >= 0. QtCLucene arrays.h 154

The code of CLucene project (C++):

class Arrays
{
  ....
   bool equals( class1* val1, class2* val2 ) const{
     static _comparator comp;
     if ( val1 == val2 )
       return true;
     size_t size = val1->size();
     if ( size != val2->size() )
       return false;
     _itr1 itr1 = val1->begin();
     _itr2 itr2 = val2->begin();
     while ( --size >= 0 ){
       if ( !comp(*itr1,*itr2) )
         return false;
       itr1++;
       itr2++;
     }
   return true;
  }
  ....
}

PVS-Studio warning: V547 Expression '-- size >= 0' is always true. Unsigned type value is always >= 0. arrays.h 154

The code of Mono project (C#):

public override bool Equals (object obj)
{
  ....
  for (int i=0; i < list.Count; i++) {
    bool found = false;
    for (int j=0; i < ps.list.Count; j++) {     // <=
      if (list [i].Equals (ps.list [j])) {
        found = true;
        break;
      }
    }
    if (!found)
      return false;
  }
  return true; 
}

PVS-Studio warning: V3015 It is likely that a wrong variable is being compared inside the 'for' operator. Consider reviewing 'i' corlib-net_4_x PermissionSet.cs 607

Apparently, there is a typo here, and the variable j instead of i should be used in the nested loop:

for (int j=0; j < ps.list.Count; j++)

Pattern: A = getA(), B = GetA()

Quite often in the comparison functions a programmer has to write code of this kind:

if (GetA().x == GetB().x && GetA().y == GetB().y)

Intermediate variables are used to reduce the size of the conditions or for optimization:

Type A = GetA();
Type B = GetB();
if (A.x == B.x && A.y == B.y)

But inadvertently, a person sometimes makes a mistake and initializes temporary variables with the same value:

Type A = GetA();
Type B = GetA();

Now let's take a look at these errors in the code of real applications.

The code of LibreOffice project (C++):

bool CmpAttr(
  const SfxPoolItem& rItem1, const SfxPoolItem& rItem2)
{
  ....
  bool bNumOffsetEqual = false;
  ::boost::optional<sal_uInt16> oNumOffset1 =
        static_cast<const SwFmtPageDesc&>(rItem1).GetNumOffset();
  ::boost::optional<sal_uInt16> oNumOffset2 =
        static_cast<const SwFmtPageDesc&>(rItem1).GetNumOffset();

  if (!oNumOffset1 && !oNumOffset2)
  {
    bNumOffsetEqual = true;
  }
  else if (oNumOffset1 && oNumOffset2)
  {
    bNumOffsetEqual = oNumOffset1.get() == oNumOffset2.get();
  }
  else
  {
    bNumOffsetEqual = false;
  }
  ....
}

PVS-Studio warning: V656 Variables 'oNumOffset1', 'oNumOffset2' are initialized through the call to the same function. It's probably an error or un-optimized code. Check lines: 68, 69. findattr.cxx 69

The code of Qt project (C++):

AtomicComparator::ComparisonResult
IntegerComparator::compare(const Item &o1,
                           const AtomicComparator::Operator,
                           const Item &o2) const
{
  const Numeric *const num1 = o1.as<Numeric>();
  const Numeric *const num2 = o1.as<Numeric>();
 
  if(num1->isSigned() || num2->isSigned())
  ....
}

PVS-Studio warning: V656 Variables 'num1', 'num2' are initialized through the call to the same function. It's probably an error or un-optimized code. Consider inspecting the 'o1.as < Numeric > ()' expression. Check lines: 220, 221. qatomiccomparators.cpp 221

Pattern: Sloppy Copying of the Code

A large amount of errors, cited previously can be called the consequences of sloppy Copy-Paste. They fell under some categories of the erroneous pattern and I decided that it would be logical to describe them in corresponding sections. However, I have several errors that have clearly appeared because of sloppy code copying, but I have no idea how to classify them. That's why I collected these errors here.

The code of CoreCLR project (C++):

int __cdecl Compiler::RefCntCmp(const void* op1, const void* op2)
{
  ....
  if (weight1)
  {
    ....
    if (varTypeIsGC(dsc1->TypeGet()))
    {
      weight1 += BB_UNITY_WEIGHT / 2;
    }
    if (dsc1->lvRegister)
    {
      weight1 += BB_UNITY_WEIGHT / 2;
    }
  }

  if (weight1)
  {
    ....
    if (varTypeIsGC(dsc2->TypeGet()))
    {
      weight1 += BB_UNITY_WEIGHT / 2;       // <=
    }
    if (dsc2->lvRegister)
    {
      weight2 += BB_UNITY_WEIGHT / 2;
    }
  }
  ....
}

PVS-Studio warning: V778 Two similar code fragments were found. Perhaps, this is a typo and 'weight2' variable should be used instead of 'weight1'. clrjit lclvars.cpp 2702

The function was long that's why it is shortened for the article. If we examine the code of the function, we'll see that a part of the code was copied, but in one fragment a programmer forgot to replace the variable weight1 with weight2.

The code of WPF samples by Microsoft project (C#):

public int Compare(GlyphRun a, GlyphRun b)
{
  ....
  if (aPoint.Y > bPoint.Y)      // <=
  {
    return -1;
  }
  else if (aPoint.Y > bPoint.Y) // <=
  {
    result = 1;
  }
  else if (aPoint.X < bPoint.X)
  {
    result = -1;
  }
  else if (aPoint.X > bPoint.X)
  {
    result = 1;
  }
  ....
}

PVS-Studio warning: V3003 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 418, 422. txtserializerwriter.cs 418

The code of PascalABC.NET project (C#):

public void CompareInternal(....)
{
  ....
  else if (left is int64_const)
    CompareInternal(left as int64_const, right as int64_const);
  ....
  else if (left is int64_const)
    CompareInternal(left as int64_const, right as int64_const);  
  ....
}

PVS-Studio warning: V3003 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 597, 631. ParserTools SyntaxTreeComparer.cs 597

The code of SharpDevelop project (C#):

public int Compare(SharpTreeNode x, SharpTreeNode y)
{
  ....
  if (typeNameComparison == 0) {
    if (x.Text.ToString().Length < y.Text.ToString().Length)
      return -1;
    if (x.Text.ToString().Length < y.Text.ToString().Length)
      return 1;
  }  
  ....
}

PVS-Studio warning: V3021 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless NamespaceTreeNode.cs 87

The code of Coin3D (C++):

int
SbProfilingData::operator == (const SbProfilingData & rhs) const
{
  if (this->actionType != rhs.actionType) return FALSE;
  if (this->actionStartTime != rhs.actionStopTime) return FALSE;
  if (this->actionStartTime != rhs.actionStopTime) return FALSE;
  ....
}

PVS-Studio warning: V649 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains function return. This means that the second 'if' statement is senseless. Check lines: 1205, 1206. sbprofilingdata.cpp 1206

The code of Spring (C++):

bool operator < (const aiFloatKey& o) const
  {return mTime < o.mTime;}
bool operator > (const aiFloatKey& o) const
  {return mTime < o.mTime;}

PVS-Studio warning: V524 It is odd that the body of '>' function is fully equivalent to the body of '<' function. assimp 3dshelper.h 470

And here is the last, particularly interesting code fragment that PVS-Studio analyzer found in MySQL project (C++).

static int rr_cmp(uchar *a,uchar *b)
{
  if (a[0] != b[0])
    return (int) a[0] - (int) b[0];
  if (a[1] != b[1])
    return (int) a[1] - (int) b[1];
  if (a[2] != b[2])
    return (int) a[2] - (int) b[2];
  if (a[3] != b[3])
    return (int) a[3] - (int) b[3];
  if (a[4] != b[4])
    return (int) a[4] - (int) b[4];
  if (a[5] != b[5])
    return (int) a[1] - (int) b[5]; // <=
  if (a[6] != b[6])
    return (int) a[6] - (int) b[6];
  return (int) a[7] - (int) b[7];
}

PVS-Studio warning: V525 The code containing the collection of similar blocks. Check items '0', '1', '2', '3', '4', '1', '6' in lines 680, 682, 684, 689, 691, 693, 695. sql records.cc 680

Most likely, a programmer wrote the first comparison, then the second and got bored. So he copied to the buffer a text block:

if (a[1] != b[1])
  return (int) a[1] - (int) b[1];

A pasted it to the text of the program as many times as he needed. Then he changed indexes, but made a mistake in one place and got an incorrect comparison:

if (a[5] != b[5])
  return (int) a[1] - (int) b[5];

Note. I discuss this error in more detail in my mini-book "The Ultimate Question of Programming, Refactoring, and Everything" (see a chapter "Don't do the compiler's job").

Pattern: Equals Method Incorrectly Processes a Null Reference

In C# the accepted practice is to implement the Equals methods in such a way, so that they correctly process a situation, if a null reference is passed as an argument. Unfortunately, not all the methods are implemented according to this rule.

The code of GitExtensions (C#):

public override bool Equals(object obj)
{
  return GetHashCode() == obj.GetHashCode(); // <=
}

PVS-Studio warning: V3115 Passing 'null' to 'Equals(object obj)' method should not result in 'NullReferenceException'. Git.hub Organization.cs 14

The code of PascalABC.NET project (C#):

public override bool Equals(object obj)
{
  var rhs = obj as ServiceReferenceMapFile;
  return FileName == rhs.FileName;
}

PVS-Studio warning: V3115 Passing 'null' to 'Equals' method should not result in 'NullReferenceException'. ICSharpCode.SharpDevelop ServiceReferenceMapFile.cs 31

Miscellaneous Errors

The code of G3D Content Pak project (C++):

bool Matrix4::operator==(const Matrix4& other) const {
  if (memcmp(this, &other, sizeof(Matrix4) == 0)) {
    return true;
  }
  ...
}

PVS-Studio warning: V575 The 'memcmp' function processes '0' elements. Inspect the 'third' argument. graphics3D matrix4.cpp 269

One closing bracket is put incorrectly. As a result, the amount of bites compared is evaluated by the statement sizeof(Matrix4) == 0. The size of any class is more than 0, which means that the result of the expression is 0. Thus, 0 bites get compared.

Correct variant:

if (memcmp(this, &other, sizeof(Matrix4)) == 0) {

The code of Wolfenstein 3D project (C++):

inline int operator!=( quat_t a, quat_t b )
{
  return ( ( a.x != b.x ) || ( a.y != b.y ) ||
           ( a.z != b.z ) && ( a.w != b.w ) );
}

PVS-Studio warning: V648 Priority of the '&&' operation is higher than that of the '||' operation. math_quaternion.h 167

Apparently, in one fragment the && operator was accidentally written instead of ||.

The code of FlightGear project (C):

static int tokMatch(struct Token* a, struct Token* b)
{
  int i, l = a->strlen;
  if(!a || !b) return 0;
  ....
}

PVS-Studio warning: V595 The 'a' pointer was utilized before it was verified against nullptr. Check lines: 478, 479. codegen.c 478

If we pass NULL as the first argument to the function, we'll get null pointer dereference, although the programmer wanted the function to return 0.

The code of WinMerge project (C++):

int TimeSizeCompare::CompareFiles(int compMethod,
                                  const DIFFITEM &di)
{
  UINT code = DIFFCODE::SAME;
  ...
  if (di.left.size != di.right.size)
  {
    code &= ~DIFFCODE::SAME;
    code = DIFFCODE::DIFF;
  }
  ...
}

PVS-Studio warning: V519 The 'code' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 79, 80. Merge timesizecompare.cpp 80

The code of ReactOS project (C++):

#define IsEqualGUID(rguid1, rguid2) \
  (!memcmp(&(rguid1), &(rguid2), sizeof(GUID)))

static int ctl2_find_guid(....)
{
  MSFT_GuidEntry *guidentry;
  ...
  if (IsEqualGUID(guidentry, guid)) return offset;
  ...
}

PVS-Studio warning: V512 A call of the 'memcmp' function will lead to underflow of the buffer 'guidentry'. oleaut32 typelib2.c 320

A pointer is written here as the first argument. As a result, the address of the pointer gets evaluated, which has no sense.

Correct variant:

if (IsEqualGUID(*guidentry, guid)) return offset;

The code of IronPython and IronRuby project (C#):

public static bool Equals(float x, float y) {
  if (x == y) {
    return !Single.IsNaN(x);
  }
  return x == y;
}

PVS-Studio warning: V3024 An odd precise comparison: x == y. Consider using a comparison with defined precision: Math.Abs(A - B) < Epsilon. FloatOps.cs 1048

It's not clear what is the point of a special check against NaN here. If the condition (x == y) is true, it means that both x and y and different from NaN, because NaN isn't equal to any other value, including itself. It seems that the check against NaN is just not necessary, and the code can be shortened to:

public static bool Equals(float x, float y) {
  return x == y;
}

The code of Mono project (C#):

public bool Equals (CounterSample other)
{
  return
    rawValue         == other.rawValue         &&
    baseValue        == other.counterFrequency &&   // <=
    counterFrequency == other.counterFrequency &&   // <=
    systemFrequency  == other.systemFrequency  &&
    timeStamp        == other.timeStamp        &&
    timeStamp100nSec == other.timeStamp100nSec &&
    counterTimeStamp == other.counterTimeStamp &&
    counterType      == other.counterType;
}

PVS-Studio warning: V3112 An abnormality within similar comparisons. It is possible that a typo is present inside the expression 'baseValue == other.counterFrequency'. System-net_4_x CounterSample.cs 139

How Do these Programs Work at all?

Looking through all the errors, it seems miraculous that all these programs generally work. Indeed, the comparison functions do a very important and responsible task in program.

There are several explanations of why these programs work despite these errors:

  1. In a lot of functions, only a part of the object is compared incorrectly. The partial comparison is enough for most of the tasks in this program.
  2. There are no situations (yet) when the function works incorrectly. For example, this applies to the functions that aren't protected from null pointers or those, where the result of the memcmp function call is placed into the variable of char type. The program is simply lucky.
  3. The reviewed comparison function is used very rarely or not used at all.
  4. Who said that the program is working? A lot of programs really do something wrong!

Recommendations

I demonstrated how many errors can be found in the comparison functions. It follows that the efficiency of these functions should be checked with unit-tests by all means.

It is really necessary to write unit-tests for the comparison operators, for Equals functions and so on.

I am quite sure that there was such an understanding among programmers before reading this article, that unit tests for such functions is extra work and they won't detect any errors anyway: the comparison functions are just so simple at the first glance... Well, now I showed the horror that can hide in them.

Code reviews and using static analysis tools would also be a great help.

Conclusion

In this article we mentioned a large amount of big-name projects that are developed by highly qualified experts. These projects are thoroughly tested using different methodologies. Still, it didn't stop PVS-Studio from finding errors in them. This shows that PVS-Studio can become a nice complement to other methodologies used to improve the quality and reliability of the code.

Visit our site and try PVS-Studio yourself.

by Andrey Karpov

Thursday, May 11, 2017

Comparing PVS-Studio for C# and a built-in Visual Studio analyzer, using the CruiseControl.NET codebase

Recently I have done comparison of C# analyzers by PVS-Studio and SonarQube on the base of PascalABC.NET code. The research turned out to be pretty engaging, so I decided to continue working in this direction. This time I compared a C# analyzer of PVS-Studio with a static analyzer built into Visual Studio. In my opinion, this is a very worthy adversary. Despite the fact that the analyzer from the Visual Studio kit is primarily designed to improve the quality of the code, not to look for bugs, this does not mean that it cannot be used to detect real errors, although this may be not easy. Let's see which peculiarities in the work of the analyzers will be detected in the course of our investigation. Let's start!
Picture 6

Introduction

First, here is a small FAQ section to clarify some points.
Q: Why CruiseControl.NET? What project is this?

A: CruiseControl.NET is an Automated Continuous Integration server, implemented using the .NET Framework. The source code of CruiseControl.NET is available on GitHub. The project has not been supported and developed for some time already, although it was quite popular until recently. This won't hinder the comparison of the analyzers, but on the contrary, will bring some element of stability. We can be sure that no one has improved the code using the latest version of PVS-Studio analyzer or the analyzer built in Visual Studio. An additional advantage is the small size of CruiseControl.NET: the project has approximately 256 thousand code lines.
Q: Have you used Visual Studio 2017? We would like to see the features of the latest versions of the analysis tools.
A: We used Visual Studio 2017 Community for the analysis of both tools.
Q: What about the settings of the analyzers? Perhaps, everything was "set up on purpose" and that's why PVS-Studio turned out to be better?
A: For both analyzers we used the "default" settings. For the testing integrity, we did the research on a "clean" machine with Windows 10.
Q: Well, good. But surely, you have juggled with the results or did the calculations not quite correctly? For example, in PVS-Studio you could ignore the "Low" level of certainty, by taking just "High" and "Medium" levels. Then PVS-Studio will have an advantage over the analyzer built into Visual Studio, since the latter does not have similar settings.
A: In the analysis of the results we took into account absolutely all alert levels and included all of the available types of diagnostics.
Q: What about the selection of files for analysis? Did you add anything to the exceptions, for example, Unit-tests?
A: We did the analysis of the entire solution for both analyzers, without any exceptions. I should also note that CruiseControl.NET does have a project "UnitTests". There were quite a lot of warnings issued for this project, but they were all not taken into account during the search of real errors, although they appear in the total number of issued warnings.
Q: Real errors? What term is that?
A: In our opinion, these are errors, critical for the program performance, that will likely lead to the exception throw, incorrect program behavior or incorrect results. These are errors that should be fixed immediately. These are not just recommendations of the design improvement or minor flaws like code duplication that do not affect the result. Here is an example of a real bug in CruiseControl.NET:
public override string Generate(IIntegrationResult integrationResult)
{
  ....
  IntegrationSummary lastIntegration = 
    integrationResult.LastIntegration;    // <=
  
  if (integrationResult == null || ....)  // <=
  {
    ....
  }
  ....
}
A lot of analyzers will issue a warning for the given fragment that the variable integrationResult is used without the prior check against null. That's right, but it usually leads to a large number of false positives, among which it is very difficult to find a real error. Our approach is to conduct additional analysis, which increases the probability of detecting real errors. In the code fragment given above, we see that after the variable is used, it is verified against null. I.e. in this case the programmer assumes that the value of a variable can be null after passing to the method, and writes the check. This is exactly the situation that we will consider to be erroneous. If the method didn't have the check of integrationResult against null, then we would consider this to be a false positive:
public override string Generate(IIntegrationResult integrationResult)
{
  ....
  IntegrationSummary lastIntegration = 
    integrationResult.LastIntegration;
  ....
}
PVS-Studio will not issue a warning for this code, while a number of analyzers will. As a result, these warnings will be ignored or disabled. A large number of warnings does not mean that they may be useful.
Q: Suppose, you did everything right. But why should I take all this for granted? How can I repeat your investigation?
A: Nothing could be simpler. The analyzer built in Visual Studio is free. You should just install the free version of Visual Studio 2017 Community. To do the analysis of CruiseControl.NET using PVS-Studio, you will have to just load it and use the demo version. Yes, some limitations of the demo version won't allow doing the complete analysis, but you can write to us and we may give a temporary license key to you.

The research and results

Visual Studio

It took just a couple of minutes to check the code of the project with the help of the analyzer, built in Visual Studio. Right after the analysis, we see the following results (no filters are enabled):
Picture 1
The analyzer issued 10773 warnings. Yes, it won't be an easy thing to find errors here. For a start, I will exclude the warnings issued for "Unit tests: with the help of the filter:
Picture 2
Ok. Almost half of the warnings were issued for the tests, which is of no surprise. But more than 5 thousand remaining messages-not too little. These warnings be grouped as follows:
Microsoft.Design: CA10XX (diagnostics:40, warnings: 1637) 
Microsoft.Globalization: CA13XX (diagnostics: 7, warnings: 1406)
Microsoft.Interoperability: CA14XX (diagnostics: 2, warnings: 10)
Microsoft.Maintainability: CA15XX (diagnostics: 3, warnings: 74)
Microsoft.Mobility: CA16XX (diagnostics: 1, warnings: 1)
Microsoft.Naming: CA17XX (diagnostics: 17, warnings: 1071)
Microsoft.Performance: CA18XX (diagnostics: 15, warnings: 489)
Microsoft.Portability: CA19XX (diagnostics: 1, warnings: 4)
Microsoft.Reliability: CA20XX (diagnostics: 4, warnings: 158)
Microsoft.Globalization, Microsoft.Security: CA21XX (diagnostics: 5,
warnings: 48)
Microsoft.Usage: CA22XX (diagnostics: 18, warnings: 440)
Also, there were several compiler warnings issued. Apparently, there was no other choice than to study the description of every analyzer diagnostic and then examine the warnings if necessary.
I should say that I took time to do this and did not manage to find anything that would help to find errors among more than five thousand of warnings issued by the analyzer. In most cases it all boils down to the recommendations about the design improvement and code optimization. As there are too many of these warnings, I am not going to cite here the full list with the descriptions. If you wish, you can carefully examine this list yourself by checking the project using CruiseControl.NET analyzer built into Visual Studio. A detailed description of diagnostics is available on MSDN.
I have examined a substantial part but not all of the warnings. There was no point in reviewing all of the groups till the end, as they were all similar and they were obviously not errors. Not to be unfounded, I'll cite one example for each group.
Microsoft.Design
CA1002 Change 'List<NameValuePair>' in 'CruiseServerClient.ForceBuild(string, List<NameValuePair>)' to use Collection<T>, ReadOnlyCollection<T> or KeyedCollection<K,V> CruiseServerClient.cs 118
public override void ForceBuild(...., List<NameValuePair> parameters)
{
  ....
}
This is a recommendation to use a universal collection (for example, Collection), instead of List for the parameters parameter of the method.
Microsoft.Globalization
CA1300 Change 'AddProjects.RetrieveListOfProjects(BuildServer)' to call the MessageBox.Show overload that specifies MessageBoxOptions, and make sure to set MessageBoxOptions.RightAlign and MessageBoxOptions.RtlReading if RightToLeft is set to RightToLeft.Yes on the parent control. CCTrayLib AddProjects.cs 86
private void RetrieveListOfProjects(....)
{
  ....
  MessageBox.Show(this, "Unable to connect to server " +
server.DisplayName + ": " + ex.Message, "Error");
  ....
}
Here is a recommendation to use an overload of the method MessageBox.Show() that takes an argument MessageBoxOptions. This is necessary to improve the support a multi-language interface and languages that use right-to-left reading order.
Microsoft.Interoperability
CA1401 Change the accessibility of P/Invoke 'NativeMethods.SetForegroundWindow(IntPtr)' so that it is no longer visible from outside its assembly. CCTrayLib NativeMethods.cs 12
[DllImport("user32.dll")]
[return: MarshalAs(UnmanagedType.Bool)]
public static extern bool SetForegroundWindow(IntPtr handle);
Here is a recommendation that you should not specify the public level of access for the methods with the DllImportAttribute attribute.
Microsoft.Maintainability
CA1500 'errorMessages', a variable declared in 'Response.ConcatenateErrors()', has the same name as an instance field on the type. Change the name of one of these items. Remote Response.cs 152
private List<ErrorMessage> errorMessages;
....
public virtual string ConcatenateErrors()
{
  List<string> errorMessages = new List<string>();
  ....
}
This is warning that a local variable has the same name as the class field.
Microsoft.Mobility
CA1601 Modify the call to 'Timer.Timer(double)' in method FileChangedWatcher.FileChangedWatcher(params string[])' to set the timer interval to a value that's greater than or equal to one second. core FileChangedWatcher.cs 33
public FileChangedWatcher(....)
{
  ....
  timer = new Timer(500);
  ....
}
This warning is that the timer interval is set to less than one second.
Microsoft.Naming
CA1702 In member 'Alienbrain.CreateGetProcess(string)', the discrete term 'filename' in parameter name 'filename' should be expressed as a compound word, 'fileName'. core Alienbrain.cs 378
public ProcessInfo CreateGetProcess(string filename)
{
  ....
}
A warning about the need to use Camel Case for naming the compound variable names.
Microsoft.Performance
CA1800 'action', a variable, is cast to type 'AdministerAction' multiple times in method 'AdministerPlugin.NamedActions.get()'. Cache the result of the 'as' operator or direct cast in order to eliminate the redundant isint instruction. WebDashboard AdministerPlugin.cs 79
public INamedAction[] NamedActions
{
  get
  {
    ....
    if (action is AdministerAction)
    {
      (action as AdministerAction).Password = password;
    }
    ....
  }
  ....
}
A warning about the necessity to optimize the iterative type casting.
Microsoft.Portability
CA1901 As it is declared in your code, parameter 'fdwSound' of P/Invoke 'Audio.PlaySound(byte[], short, long)' will be 8 bytes wide on 32-bit platforms. This is not correct, as the actual native declaration of this API indicates it should be 4 bytes wide on 32-bit platforms. Consult the MSDN Platform SDK documentation for help determining what data type should be used instead of 'long'. CCTrayLib Audio.cs 135
[DllImport ("winmm.dll")]
private static extern int PlaySound (byte[] pszSound, Int16 hMod,
long fdwSound);
A warning that a non-portable type is used in the imported method for the parameter fdwSound. It's necessary to use IntPtr or UIntPtr.
Microsoft.Reliability
CA2000 In method 'About.famfamfamLink_LinkClicked(object, LinkLabelLinkClickedEventArgs)', call System.IDisposable.Dispose on object 'urlLink' before all references to it are out of scope. CCTrayLib About.cs 71
private void famfamfamLink_LinkClicked(....)
{
  Process urlLink = new Process();
  urlLink.StartInfo = new ProcessStartInfo(....);
  urlLink.Start();
}
A warning to free an IDisposable-object urlLink before it is out of scope. For example, you can write using.
Microsoft.Globalization, Microsoft.Security
CA2101 To reduce security risk, marshal parameter 'lpszDomain' as Unicode, by setting DllImport.CharSet to CharSet.Unicode, or by explicitly marshaling the parameter as UnmanagedType.LPWStr. If you need to marshal this string as ANSI or system-dependent, specify MarshalAs explicitly, and set BestFitMapping=false; for added security, also set ThrowOnUnmappableChar=true. core Impersonation.cs 100
[DllImport("advapi32.dll", SetLastError = true)]
private static extern bool LogonUser(
        string lpszUsername,
        string lpszDomain,
        string lpszPassword,
        int dwLogonType,
        int dwLogonProvider,
        ref IntPtr phToken);
The warning that the type of marshaling for the string arguments is not specified, for example by defining the attributes as follows:
[DllImport("advapi32.dll", SetLastError = true,
CharSet = CharSet.Unicode)]
Microsoft.Usage
CA2201 'CruiseServerClientFactory.GenerateClient(string, ClientStartUpSettings)' creates an exception of type 'ApplicationException', an exception type that is not sufficiently specific and should never be raised by user code. If this exception instance might be thrown, use a different exception type. Remote CruiseServerClientFactory.cs 97
public CruiseServerClientBase GenerateClient(....)
{
  ....
  throw new ApplicationException("Unknown transport protocol");
  ....
}
The warning about the exception throw of a too general type. In this case it is recommended to throw an exception of a more specific type, not reserved in the execution environment.
Result
Doing this work, I came to the conclusion that in this case it makes sense to do the search of real errors only for the warnings with the code CA1062 from the group Microsoft.Design. This is a warning for situations where a method parameter has a reference type and there is no verification against null before its usage. After applying the filter for such warnings, we get the following:
Picture 3
733 - is still a lot. But it's already something. And if we review the detected code fragments, they are really potentially unsafe. For example, in the file ItemStatus.cs:
public void AddChild(ItemStatus child)
{
  child.parent = this;
  childItems.Add(child);
}
The child reference to the instance of the ItemStatus class is not checked before it is used. Yes, it is dangerous. But, unfortunately, this can not be called a mistake. Perhaps, the checks may be located in the calling code, although it is not correct. Moreover, the method is declared as public. Of course, the author of the code should take measures and to deal with these warnings, but let me remind you that there are 733 of them. Most likely, the programmer will do nothing because "everything works". This is exactly the danger of issuing a lot of warnings for everything that seems more or less suspicious. For this reason, I gave an example of a real error that a developer should pay attention to. The warnings like this can be considered as false positives. It is really so.
Having spent some more time, I found 5 warnings among those 733 that can be interpreted as errors. Here is an example of one of them (BuildGraph.cs file):
public override bool Equals(object obj)
{
  if (obj.GetType() != this.GetType() )
    return false;
  ....
}
The obj variable is not verified against null before the use. Since we are talking about the overloaded Equals method - we are dealing with an error. The method Equals has to correctly process null references. Perhaps, such situations never occur in CruiseControl.NET project, but the code of the method is still erroneous and it should be fixed.
A meticulous reader might argue that I may have missed such an error without having studied every use of the methods. Perhaps, that is possible. But in practice, no one will examine every warning carefully, but the percentage of the false positives is still very large.
I should note that despite the fact that I managed to find errors in the code of CruiseControl.NET using the analyzer, built in Visual Studio, the process itself took about 8 hours (the whole work day) and it required additional attention and concentration. Perhaps, if the project authors used the static analysis regularly, the overall picture would be more positive.

PVS-Studio

The analysis of the project with PVS-Studio on my machine took a minute. Right after that the results are the following (none of the filters are enabled):
Picture 4
The analyzer issued 198 warnings. 45 warnings were issued for the project "UnitTests", and 32 more warnings have a low priority (it's less likely that there are real errors among them). Subtotal - 121 messages for analysis, which took me 30 minutes. As a result, 19 errors have been identified:
Picture 5
Here is an example of one of them:
V3003 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 120, 125. CCTrayLib CCTrayProject.cs 120
public override bool Equals(object obj)
{
  ....
  if ((buildServer != null) && 
      (objToCompare.buildServer != null))
  {
    // If both instances have a build server then compare the build
    // server settings
    isSame = string.Equals(buildServer.Url,
objToCompare.buildServer.Url);
  }
  else if ((buildServer != null) && 
(objToCompare.buildServer != null))
  {
    // If neither instance has a build server then they are the same
    isSame = true;
  }
  ....
}
Both if blocks contain the same condition. We see a serious bug affecting the logic of the program and resulting in an unexpected result.
I think here I have nothing more to add here. PVS-Studio quickly and accurately did its work in finding real errors. This is exactly what it was made for.

Conclusion

Here is a table showing the results:
Picture 16
Of course, we see that PVS-Studio has a bigger advantage. But again, the analyzer built in Visual Studio was made to improve the design and optimize the code, not for the bug search. While PVS-Studio, on the contrary, was "aimed" at the bug search with the lowest possible percentage of false alarms. Besides that, the developers of CruiseControl.NET, apparently didn't use any analyzers at all. I am sure that if they used the analyzer built in Visual Studio, the quality of the code would be much better, and the possibility of an error is lower, not to mention PVS-Studio. Such tools allow achieving maximum effect if used regularly, rather than "once a year".
Download and try PVS-Studio: http://www.viva64.com/en/pvs-studio/
To purchase a commercial license, please contact us via the email. You can also write to us to get a temporary license key for a comprehensive investigation of PVS-Studio, if you want to avoid the limitations of the demo version.

Additional links