A selection of the best articles about programming from the site www.viva64.com/en/pvs-studio/
Showing posts with label tutorial. Show all posts
Thursday, April 20, 2017
ClueCon Weekly - April 19th 2017 - PVS analyzer
The guys from the PVS analyzer crew stop by to talk about their analyzer.
Labels:
codeanalysis,
coding,
cplusplus,
cpp,
csharp,
dev,
developer,
programming,
pvsstudio,
tutorial,
video
Tuesday, April 4, 2017
Critical errors in CryEngine V code
In May 2016, the German game-development company Crytek made the decision to upload the source code of their game engine, CryEngine V, to GitHub. The project is in active development, which leads to a large number of errors in the code. We have already checked the project with PVS-Studio for Windows, and now we can also analyze it using PVS-Studio for Linux. There was enough material for an article describing only the crucial errors.

Introduction
CryEngine is a game engine created by the German company Crytek in the year 2002, and originally used in the first-person shooter Far Cry. There are a lot of great games made on the basis of different versions of CryEngine, by many studios who have licensed the engine: Far Cry, Crysis, Entropia Universe, Blue Mars, Warface, Homefront: The Revolution, Sniper: Ghost Warrior, Armored Warfare, Evolve and many others. In March 2016 the Crytek company announced the release of the new CryEngine V, and soon after, posted the source code on GitHub.
To perform the source code analysis, we used PVS-Studio for Linux. It has now become even more convenient for developers of cross-platform projects to track the quality of their code with a single static analysis tool. The Linux version can be downloaded as an archive, or as a package for a package manager. For the majority of distributions, you can set up installation and updates using our repository.
This article covers only the general analysis warnings, and only those of the "High" certainty level (there are also "Medium" and "Low"). To be honest, I didn't even examine all of the "High" level warnings, because there was already enough material for an article after even a quick look. I started working on the article several times over a period of a few months, so I can say with certainty that the bugs described here have been living in the code for several months already. Some of the bugs found during the previous check of the project weren't fixed either.
It was very easy to download and check the source code in Linux. Here is a list of all necessary commands:
mkdir ~/projects && cd ~/projects
git clone https://github.com/CRYTEK/CRYENGINE.git
cd CRYENGINE/
git checkout main
chmod +x ./download_sdks.py
./download_sdks.py
pvs-studio-analyzer trace -- \
sh ./cry_waf.sh build_linux_x64_clang_profile -p gamesdk
pvs-studio-analyzer analyze \
-l /path/to/PVS-Studio.lic \
-o ~/projects/CRYENGINE/cryengine.log \
-r ~/projects/CRYENGINE/ \
-C clang++-3.8 -C clang-3.8 \
-e ~/projects/CRYENGINE/Code/SDKs \
-j4
plog-converter -a GA:1,2 -t tasklist \
-o ~/projects/CRYENGINE/cryengine_ga.tasks \
~/projects/CRYENGINE/cryengine.log
The report file cryengine_ga.tasks can be opened and viewed in QtCreator. What did we manage to find in the source code of CryEngine V?
A strange Active() function
V501 There are identical sub-expressions to the left and to the right of the '==' operator: bActive == bActive LightEntity.h 124
void SetActive(bool bActive)
{
if (bActive == bActive)
return;
m_bActive = bActive;
OnResetState();
}
The function does nothing because of a typo. It seems to me that if there were a contest called "Super Typo", this code fragment would definitely take first place. I think this error has every chance of getting into the section "C/C++ bugs of the month".
But that's not all, here is a function from another class:
V501 There are identical sub-expressions 'm_staticObjects' to the left and to the right of the '||' operator. FeatureCollision.h 66
class CFeatureCollision : public CParticleFeature
{
public:
CRY_PFX2_DECLARE_FEATURE
public:
CFeatureCollision();
....
bool IsActive() const { return m_terrain ||
m_staticObjects ||
m_staticObjects; }
....
bool m_terrain;
bool m_staticObjects;
bool m_dynamicObjects;
};
The variable m_staticObjects is used twice in the IsActive() function, while the variable m_dynamicObjects goes unused. Perhaps it was this variable that was meant to be used.
Code above has no bugs
V547 Expression 'outArrIndices[i] < 0' is always false. Unsigned type value is never < 0. CGFLoader.cpp 881
static bool CompactBoneVertices(....,
DynArray<uint16>& outArrIndices, ....) // <= uint16
{
....
outArrIndices.resize(3 * inFaceCount, -1);
int outVertexCount = 0;
for (int i = 0; i < verts.size(); ++i)
{
....
outArrIndices[....] = outVertexCount - 1;
}
// Making sure that the code above has no bugs // <= LOL
for (int i = 0; i < outArrIndices.size(); ++i)
{
if (outArrIndices[i] < 0) // <= LOL
{
return false;
}
}
return true;
}
This error is worthy of a separate section. In general, the CryEngine code has a lot of fragments where unsigned variables are pointlessly compared with zero. There are hundreds of such places, but this fragment deserves special attention, because the check was written deliberately.
So, there is an array of unsigned numbers, outArrIndices. The array is filled according to some algorithm. After that, we see a "brilliant" check that none of the array elements holds a negative number, although the elements have the uint16 type and can never be negative.
Memory handling errors
V512 A call of the 'memcpy' function will lead to underflow of the buffer 'hashableData'. GeomCacheRenderNode.cpp 285
void CGeomCacheRenderNode::Render(....)
{
....
CREGeomCache* pCREGeomCache = iter->second.m_pRenderElement;
....
uint8 hashableData[] =
{
0, 0, 0, 0, 0, 0, 0, 0,
(uint8)std::distance(pCREGeomCache->....->begin(), &meshData),
(uint8)std::distance(meshData....->....begin(), &chunk),
(uint8)std::distance(meshData.m_instances.begin(), &instance)
};
memcpy(hashableData, pCREGeomCache, sizeof(pCREGeomCache));
....
}
Pay attention to the arguments of the memcpy() function. The programmer intends to copy the pCREGeomCache object to the hashableData array, but the sizeof operator accidentally takes the size of the pointer rather than the size of the object. Because of this error, only 4 or 8 bytes of the object are copied.
V568 It's odd that 'sizeof()' operator evaluates the size of a pointer to a class, but not the size of the 'this' class object. ClipVolumeManager.cpp 145
void
CClipVolumeManager::GetMemoryUsage(class ICrySizer* pSizer) const
{
pSizer->AddObject(this, sizeof(this));
for (size_t i = 0; i < m_ClipVolumes.size(); ++i)
pSizer->AddObject(m_ClipVolumes[i].m_pVolume);
}
A similar mistake was made when the programmer evaluated the size of the this pointer instead of the size of the class object. The correct variant: sizeof(*this).
V530 The return value of function 'release' is required to be utilized. ClipVolumes.cpp 492
vector<unique_ptr<CFullscreenPass>> m_jitteredDepthPassArray;
void CClipVolumesStage::PrepareVolumetricFog()
{
....
for (int32 i = 0; i < m_jitteredDepthPassArray.size(); ++i)
{
m_jitteredDepthPassArray[i].release();
}
m_jitteredDepthPassArray.resize(depth);
for (int32 i = 0; i < depth; ++i)
{
m_jitteredDepthPassArray[i] = CryMakeUnique<....>();
m_jitteredDepthPassArray[i]->SetViewport(viewport);
m_jitteredDepthPassArray[i]->SetFlags(....);
}
....
}
If we look at the documentation for the class std::unique_ptr, the release() function should be used as follows:
std::unique_ptr<Foo> up(new Foo());
Foo* fp = up.release();
delete fp;
Most likely, the reset() function was meant to be used instead of release().
V549 The first argument of 'memcpy' function is equal to the second argument. ObjectsTree_Serialize.cpp 1135
void COctreeNode::LoadSingleObject(....)
{
....
float* pAuxDataDst = pObj->GetAuxSerializationDataPtr(....);
const float* pAuxDataSrc = StepData<float>(....);
memcpy(pAuxDataDst, pAuxDataDst, min(....) * sizeof(float));
....
}
The programmer forgot to pass pAuxDataSrc to the memcpy() function; instead, the same variable pAuxDataDst is used as both source and destination. No one is immune to errors.
By the way, those who are willing may test their programming skills and attentiveness by taking a quiz on the detection of similar bugs: q.viva64.com.
Strange code
V501 There are identical sub-expressions to the left and to the right of the '||' operator: val == 0 || val == - 0 XMLCPB_AttrWriter.cpp 363
void CAttrWriter::PackFloatInSemiConstType(float val, ....)
{
uint32 type = PFSC_VAL;
if (val == 0 || val == -0) // <=
type = PFSC_0;
else if (val == 1)
type = PFSC_1;
else if (val == -1)
type = PFSC_N1;
....
}
The developers planned to compare the real-valued val variable with a positive zero and with a negative zero, but did it incorrectly. Since the constants were written as integers, both zeros denote the same value.
Most likely, the code should be corrected in the following way, by declaring real-type constants:
if (val == 0.0f || val == -0.0f)
type = PFSC_0;
On the other hand, the conditional expression is redundant, as it is enough to compare the variable with an ordinary zero. This is why the code happens to work the way the programmer expected.
But, if it is necessary to identify the negative zero, then it would be more correct to do it with the std::signbit function.
V501 There are identical sub-expressions 'm_joints[i].limits[1][j]' to the left and to the right of the '-' operator. articulatedentity.cpp 1326
int CArticulatedEntity::Step(float time_interval)
{
....
for (j=0;j<3;j++) if (!(m_joints[i].flags & angle0_locked<<j)&&
isneg(m_joints[i].limits[0][j]-m_joints[i].qext[j]) +
isneg(m_joints[i].qext[j]-m_joints[i].limits[1][j]) +
isneg(m_joints[i].limits[1][j]-m_joints[i].limits[1][j]) < 2)
{
....
}
In the last part of the conditional expression, the variable m_joints[i].limits[1][j] is subtracted from itself. The code looks suspicious: there are a lot of indices in the expression, and one of them probably contains an error.
One more similar fragment:
- V501 There are identical sub-expressions 'm_joints[op[1]].limits[1][i]' to the left and to the right of the '-' operator. articulatedentity.cpp 513
V590 Consider inspecting this expression. The expression is excessive or contains a misprint. GoalOp_Crysis2.cpp 3779
void COPCrysis2FlightFireWeapons::ParseParam(....)
{
....
bool paused;
value.GetValue(paused);
if (paused && (m_State != eFP_PAUSED) &&
(m_State != eFP_PAUSED_OVERRIDE))
{
m_NextState = m_State;
m_State = eFP_PAUSED;
m_PausedTime = 0.0f;
m_PauseOverrideTime = 0.0f;
}
else if (!paused && (m_State == eFP_PAUSED) && // <=
(m_State != eFP_PAUSED_OVERRIDE)) // <=
{
m_State = m_NextState;
m_NextState = eFP_STOP;
m_PausedTime = 0.0f;
m_PauseOverrideTime = 0.0f;
}
....
}
The conditional expression is written in such a way that its result does not depend on the subexpression m_State != eFP_PAUSED_OVERRIDE. But is it really worth discussing here, if this code fragment still isn't fixed after the first article?
In case it is interesting, I have already described the same kind of errors in the article "Logical Expressions in C/C++. Mistakes Made by Professionals".
V529 Odd semicolon ';' after 'for' operator. boolean3d.cpp 1077
int CTriMesh::Slice(...)
{
....
pmd->pMesh[0]=pmd->pMesh[1] = this; AddRef();AddRef();
for(pmd0=m_pMeshUpdate; pmd0->next; pmd0=pmd0->next); // <=
pmd0->next = pmd;
....
}
One more code fragment that has remained uncorrected since the previous check of the project. It is still unclear whether this is a formatting error or a mistake in the logic.
About pointers
V522 Dereferencing of the null pointer 'pCEntity' might take place. BreakableManager.cpp 2396
int CBreakableManager::HandlePhysics_UpdateMeshEvent(....)
{
CEntity* pCEntity = 0;
....
if (pmu && pSrcStatObj && GetSurfaceType(pSrcStatObj))
{
....
if (pEffect)
{
....
if (normal.len2() > 0)
pEffect->Spawn(true, pCEntity->GetSlotWorldTM(...); // <=
}
}
....
if (iForeignData == PHYS_FOREIGN_ID_ENTITY)
{
pCEntity = (CEntity*)pForeignData;
if (!pCEntity || !pCEntity->GetPhysicalProxy())
return 1;
}
....
}
The analyzer detected a possible null pointer dereference. The function is written, or was refactored, in such a way that there is now a branch where the pCEntity pointer is dereferenced while still holding the zero it was initialized with.
Now let's have a look at the variant of a potential dereference of a null pointer.
V595 The 'pTrack' pointer was utilized before it was verified against nullptr. Check lines: 60, 61. AudioNode.cpp 60
void CAudioNode::Animate(SAnimContext& animContext)
{
....
const bool bMuted = gEnv->IsEditor() && (pTrack->GetFlags() &
IAnimTrack::eAnimTrackFlags_Muted);
if (!pTrack || pTrack->GetNumKeys() == 0 ||
pTrack->GetFlags() & IAnimTrack::eAnimTrackFlags_Disabled)
{
continue;
}
....
}
The author of this code dereferences the pTrack pointer first, and only on the next line checks whether it is valid. Most likely, this is not how the program is supposed to work.
There were a lot of V595 warnings; they won't all fit into the article. Very often, such code is a real error, but thanks to luck, the code works correctly.
V571 Recurring check. The 'if (rLightInfo.m_pDynTexture)' condition was already verified in line 69. ObjMan.cpp 70
// Safe memory helpers
#define SAFE_RELEASE(p){ if (p) { (p)->Release(); (p) = NULL; } }
void CObjManager::UnloadVegetationModels(bool bDeleteAll)
{
....
SVegetationSpriteLightInfo& rLightInfo = ....;
if (rLightInfo.m_pDynTexture)
SAFE_RELEASE(rLightInfo.m_pDynTexture);
....
}
In this fragment there is no serious error, but there is no need to write the extra check, since it is already included in the SAFE_RELEASE macro.
One more fragment with redundant code:
- V571 Recurring check. The 'if (m_pSectorGroups)' condition was already verified in line 48. PartitionGrid.cpp 50
V575 The 'memcpy' function doesn't copy the whole string. Use 'strcpy / strcpy_s' function to preserve terminal null. SystemInit.cpp 4045
class CLvlRes_finalstep : public CLvlRes_base
{
....
for (;; )
{
if (*p == '/' || *p == '\\' || *p == 0)
{
char cOldChar = *p;
*p = 0; // create zero termination
_finddata_t fd;
bool bOk = FindFile(szFilePath, szFile, fd);
if (bOk)
assert(strlen(szFile) == strlen(fd.name));
*p = cOldChar; // get back the old separator
if (!bOk)
return;
memcpy((void*)szFile, fd.name, strlen(fd.name)); // <=
if (*p == 0)
break;
++p;
szFile = p;
}
else ++p;
}
....
}
There might be an error in this code: the terminating null is lost during the copying of the last string. In this case it is necessary to copy strlen() + 1 bytes, or to use special string-copying functions: strcpy or strcpy_s.
Problems with a comma
V521 Such expressions using the ',' operator are dangerous. Make sure the expression '!sWords[iWord].empty(), iWord ++' is correct. TacticalPointSystem.cpp 3243
bool CTacticalPointSystem::Parse(....) const
{
string sInput(sSpec);
const int MAXWORDS = 8;
string sWords[MAXWORDS];
int iC = 0, iWord = 0;
for (; iWord < MAXWORDS; !sWords[iWord].empty(), iWord++) // <=
{
sWords[iWord] = sInput.Tokenize("_", iC);
}
....
}
Note the counter section of the for loop. What is a logical expression doing there? Most likely, it should be moved to the loop condition, giving the following code:
for (; iWord < MAXWORDS && !sWords[iWord].empty(); iWord++) {...}
V521 Such expressions using the ',' operator are dangerous. Make sure the expression is correct. HommingSwarmProjectile.cpp 187
void CHommingSwarmProjectile::HandleEvent(....)
{
....
explodeDesc.normal = -pCollision->n,pCollision->vloc[0];
....
}
One more strange code fragment with the ',' operator.
Suspicious conditions
V571 Recurring check. The 'if (pos == npos)' condition was already verified in line 1530. CryString.h 1539
//! Find last single character.
// \return -1 if not found, distance from beginning otherwise.
template<class T>
inline typename CryStringT<T>::....::rfind(....) const
{
const_str str;
if (pos == npos)
{
// find last single character
str = _strrchr(m_str, ch);
// return -1 if not found, distance from beginning otherwise
return (str == NULL) ?
(size_type) - 1 : (size_type)(str - m_str);
}
else
{
if (pos == npos)
{
pos = length();
}
if (pos > length())
{
return npos;
}
value_type tmp = m_str[pos + 1];
m_str[pos + 1] = 0;
str = _strrchr(m_str, ch);
m_str[pos + 1] = tmp;
}
return (str == NULL) ?
(size_type) - 1 : (size_type)(str - m_str);
}
The analyzer detected a repeated check of the pos variable. Because of this error, part of the code will never be executed. There is also duplicated code in the function, which is why it is worth rewriting.
This code was successfully duplicated in another place:
- V571 Recurring check. The 'if (pos == npos)' condition was already verified in line 1262. CryFixedString.h 1271
V523 The 'then' statement is equivalent to the 'else' statement. ScriptTable.cpp 789
bool CScriptTable::AddFunction(const SUserFunctionDesc& fd)
{
....
char sFuncSignature[256];
if (fd.sGlobalName[0] != 0)
cry_sprintf(sFuncSignature, "%s.%s(%s)", fd.sGlobalName,
fd.sFunctionName, fd.sFunctionParams);
else
cry_sprintf(sFuncSignature, "%s.%s(%s)", fd.sGlobalName,
fd.sFunctionName, fd.sFunctionParams);
....
}
Regardless of the condition, the same string is printed: the then and else branches are identical. There are many such fragments in the code; here are some of them:
- V523 The 'then' statement is equivalent to the 'else' statement. BudgetingSystem.cpp 718
- V523 The 'then' statement is equivalent to the 'else' statement. D3DShadows.cpp 627
- V523 The 'then' statement is equivalent to the 'else' statement. livingentity.cpp 967
Undefined behavior
V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. physicalplaceholder.h 25
class CPhysicalEntity;
const int NO_GRID_REG = -1<<14;
const int GRID_REG_PENDING = NO_GRID_REG+1;
const int GRID_REG_LAST = NO_GRID_REG+2;
The analyzer can find several types of errors that lead to undefined behavior. According to the current language standard, shifting a negative number to the left results in undefined behavior.
Here are some more dubious places:
- V610 Undefined behavior. Check the shift operator '<<'. The left operand '~(TFragSeqStorage(0))' is negative. UDPDatagramSocket.cpp 757
- V610 Undefined behavior. Check the shift operator '<<'. The right operand ('cpu' = [0..1023]) is greater than or equal to the length in bits of the promoted left operand. CryThreadUtil_posix.h 115
- V610 Undefined behavior. Check the shift operator '>>'. The right operand is negative ('comp' = [-1..3]). ShaderComponents.cpp 399
- V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. trimesh.cpp 4126
- V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. trimesh.cpp 4559
- V610 Unspecified behavior. Check the shift operator '>>'. The left operand '-NRAYS' is negative. trimesh.cpp 4618
- V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. tetrlattice.cpp 324
- V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. tetrlattice.cpp 350
- V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. tetrlattice.cpp 617
- V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. tetrlattice.cpp 622
Another type of undefined behavior is related to the repeated changes of a variable between two sequence points:
V567 Undefined behavior. The 'm_current' variable is modified while being used twice between sequence points. OperatorQueue.cpp 101
bool COperatorQueue::Prepare(....)
{
++m_current &= 1;
m_ops[m_current].clear();
return true;
}
Unfortunately, this fragment is not the only one.
- V567 Undefined behavior. The 'm_commandBufferIndex' variable is modified while being used twice between sequence points. XConsole.cpp 180
- V567 Undefined behavior. The 'itail' variable is modified while being used twice between sequence points. trimesh.cpp 3119
- V567 Undefined behavior. The 'ihead' variable is modified while being used twice between sequence points. trimesh.cpp 3126
- V567 Undefined behavior. The 'ivtx' variable is modified while being used twice between sequence points. boolean3d.cpp 957
- V567 Undefined behavior. The 'ivtx' variable is modified while being used twice between sequence points. boolean3d.cpp 965
- V567 Undefined behavior. The 'ivtx' variable is modified while being used twice between sequence points. boolean3d.cpp 983
- V567 Undefined behavior. The 'm_iNextAnimIndex' variable is modified while being used twice between sequence points. HitDeathReactionsDefs.cpp 192
Questions for the developers
In the CryEngine V code, I saw quite an amusing way for the developers to communicate with each other through comments.
Here is the most hilarious comment that I found with the help of the warning:
V763 Parameter 'enable' is always rewritten in function body before being used.
void CNetContext::EnableBackgroundPassthrough(bool enable)
{
SCOPED_GLOBAL_LOCK;
// THIS IS A TEMPORARY HACK TO MAKE THE GAME PLAY NICELY,
// ASK peter@crytek WHY IT'S STILL HERE
enable = false;
....
}
Further on, I decided to search for similar comments and noted down a couple of them:
....
// please ask me when you want to change [tetsuji]
....
// please ask me when you want to change [dejan]
....
//if there are problems with this function, ask Ivo
uint32 numAnims =
pCharacter->GetISkeletonAnim()->GetNumAnimsInFIFO(layer);
if (numAnims)
return pH->EndFunction(true);
....
//ask Ivo for details
//if (pCharacter->GetCurAnimation() &&
// pCharacter->GetCurAnimation()[0] != '\0')
// return pH->EndFunction(pCharacter->GetCurAnimation());
....
/////////////////////////////////////////////////////////////////
// Strange, !do not remove... ask Timur for the meaning of this.
/////////////////////////////////////////////////////////////////
if (m_nStrangeRatio > 32767)
{
gEnv->pScriptSystem->SetGCFrequency(-1); // lets get nasty.
}
/////////////////////////////////////////////////////////////////
// Strange, !do not remove... ask Timur for the meaning of this.
/////////////////////////////////////////////////////////////////
if (m_nStrangeRatio > 1000)
{
if (m_pProcess && (m_pProcess->GetFlags() & PROC_3DENGINE))
m_nStrangeRatio += cry_random(1, 11);
}
/////////////////////////////////////////////////////////////////
....
// tank specific:
// avoid steering input around 0.5 (ask Anton)
....
CryWarning(VALIDATOR_MODULE_EDITOR, VALIDATOR_WARNING,
"....: Wrong edited item. Ask AlexL to fix this.");
....
// If this renders black ask McJohn what's wrong.
glGenerateMipmap(GL_TEXTURE_2D);
....
The most important question to the developers: why don't they use specialized tools for the improvement of their code? Of course, I mean PVS-Studio. :)
I should note once again that this article describes only some of the errors we found. I didn't even get to the end of the High level warnings. So the project is still waiting for those who may come and check it more thoroughly. Unfortunately, I cannot spend that much time, because dozens of other projects are waiting for me.
Conclusion
Having worked on the development of an analyzer, I came to the conclusion that it is simply impossible to avoid errors when a team grows or shrinks. I am really not against code review, but it's easy to estimate how much time a team lead would have to spend reviewing the code of ten people. And what about the next day? What if there are more than ten developers? In that case, code review would only be feasible for edits to key components of the product. This approach becomes extremely ineffective as the amount of code, and the number of people on the team, grows. Automated checking of the code with static analyzers greatly helps the situation. It is not a substitute for existing tests, but a completely different approach to code quality (by the way, static analyzers find errors in tests too). Fixing bugs at the earliest stages of development costs almost nothing, unlike those found during the testing phase, while errors in a released product may come at enormous cost.
You may download and try PVS-Studio via this link.
In case you want to discuss licensing options, prices, and discounts, contact us at support.

Don't make the unicorn sad by writing bad code...
Labels:
article,
bugs,
codeanalysis,
coding,
cplusplus,
cpp,
developer,
github,
linux,
programming,
tutorial
Sunday, February 12, 2017
Incremental analysis in PVS-Studio: now on the build server
Introduction
When integrating a static analyzer into an existing development process, a team may encounter certain difficulties. For example, it is very useful to check modified code before it gets into the version control system; however, static analysis may take quite a long time in this case, especially for projects with a large code base. In this article, we look at the incremental analysis mode of the PVS-Studio analyzer, which checks only modified files and thus significantly reduces the analysis time. As a result, developers can run static analysis as often as needed and minimize the risk of erroneous code getting into the version control system. There were several reasons for writing this article: firstly, a wish to tell once more about this useful feature of our analyzer; secondly, the fact that we have completely rewritten the incremental analysis mechanism and added support for this mode to the command line version of our analyzer.
Scenarios of using a static analyzer
Any approach to improving the quality of software assumes that the defects must be detected as early as possible. Ideally, we should write the code without errors from the very beginning, but this practice works only in one corporation:
Why is it so important to detect and fix bugs as soon as possible? I will not dwell on such obvious things as the reputational risks that inevitably arise when your users start reporting defects in your software en masse. Let's focus specifically on the economic component of fixing errors in code. We have no statistics on the average cost of an error: errors can be very different, they may be detected at various stages of the software life cycle, and the software may be used in subject areas where errors may or may not be critical. Although we do not know the average cost of eliminating a defect at each stage of the software life cycle across the industry as a whole, we can estimate the dynamics of this cost using the widely known "1-10-100" rule. As applied to software development, this rule states that if eliminating a defect during the development phase costs $1, it grows to $10 after the code is passed to the testers, and rises to $100 once the defective code is in production.
There are a lot of reasons for such rapid growth in the cost of defect correction, for example:
- Changes in one fragment of the code can affect a lot of other parts of the application.
- Completing once more the tasks that were already closed: changes in the design, coding, editing the documentation and so on.
- The delivery of the fixed version to the users and the necessity to persuade them to update.
Understanding the importance of fixing bugs at the earliest stages of the software life cycle, we suggest that our clients use a two-level scheme of checking the code with a static analyzer. The first level is to check the code on the developer's computer before it gets into the version control system. That is, the developer writes some code and immediately inspects it with a static analyzer. For this purpose, we have a plugin for Microsoft Visual Studio (all versions from 2010 to 2015 are supported). The plugin allows you to check one or more source code files, projects, or the entire solution.
The second level of protection is to run the static analyzer during the overnight project build. This helps to make sure that new errors were not added to the version control system, or to take the necessary actions to correct these errors. In order to minimize the risk of having errors in the later stages of the software life cycle, we suggest using both levels of static analysis execution: locally on the machines of developers and on a centralized server of continuous integration.
This approach cannot be called completely flawless and doesn't guarantee that errors detected by a static analyzer won't get into the version control system, or at worst will be fixed before the build goes to the testing phase. The requirement to run static analysis manually before committing code will probably face strong resistance. Firstly, if the project has a large code base, no one wants to sit and wait until the whole project is checked. And if a developer decides to save time and check only the files he has modified, he has to keep track of the edited files, which, of course, no one will do. As for the build server, where builds are triggered not only overnight but also whenever modifications are detected in the version control system, running static analysis of the whole codebase during numerous daily builds is also infeasible, because it would take too long.
The incremental analysis mode, which scans only recently changed files, helps to solve these problems. Let's consider in more detail what advantages incremental analysis brings when used on developers' computers and on the build server.
Incremental analysis on the computer of a developer - a barrier against bugs before they get into a version control system
If a development team decides to use static code analysis and runs it only on the build server, during overnight builds for example, then sooner or later the developers will start treating this tool as an enemy. No wonder, because all team members will see what mistakes their colleagues make. We strive to ensure that all stakeholders perceive the static analyzer as a friend and a useful tool that helps improve the quality of the code. If you want the errors detected by the analyzer not to get into the version control system and become publicly visible, static analysis should be performed on the developers' machines, so that possible issues are detected as early as possible.
As I have already mentioned, the manual start of static analysis on the whole codebase may require quite a long time. If the developer has to memorize which files he has worked on, it can become quite annoying too.
Incremental analysis both decreases the time needed for static analysis and removes the need to start the analysis manually. It starts automatically after the build of a project or a solution; we consider this trigger the most appropriate for the analysis, as it's logical to check whether the project builds before committing changes to the version control system. Thus, the incremental analysis mode removes the annoying manual steps on the developer's computer. It works in the background, so the developer can continue working on the code without waiting for the analysis to finish.
You can enable the after build incremental analysis mode in the menu PVS-Studio/Incremental Analysis After Build (Modified Files Only), this option is enabled in PVS-Studio by default:
Once this mode is enabled, PVS-Studio keeps track of files modified since the project was built. When it detects such modifications, incremental analysis is launched automatically in the background, and an animated PVS-Studio icon appears in the notification area:
You may find more details related to the use of incremental analysis in the article PVS-Studio's incremental analysis mode.
Incremental analysis on the continuous integration server - an additional barrier against bugs
Continuous integration presupposes that the project is built on the build server after every commit to the version control system. As a rule, the existing set of unit tests is run along with the build. Besides unit and integration tests, many teams that practice continuous integration also use the build server for continuous quality control: static and dynamic analysis, performance measurements, and so on.
One important requirement for the tasks performed on a continuous integration server is that the project build and any additional steps must be fast, so that the team can react quickly to detected problems. Running static analysis on a large code base after each commit contradicts this requirement, because it may take a very long time. We couldn't put up with such a limitation, so, remembering that our Visual Studio plugin had had an incremental analysis mode for quite a long time, we asked ourselves: why not implement the same mode in the command line module PVS-Studio_Cmd.exe?
No sooner said than done: our command line module, which is designed to integrate the static code analyzer into various build systems, now has an incremental analysis mode. It works just like the incremental analysis in the plugin.
Thus, with incremental analysis supported in PVS-Studio_Cmd.exe, our static analyzer can be used in a continuous integration system during numerous daily builds. Since only the files modified since the last update of the code from the version control system are checked, static analysis finishes quickly, and the duration of the project build stays practically the same.
To activate the incremental analysis mode of the command line module PVS-Studio_Cmd.exe, specify the --incremental flag with one of the following modes:
- Scan - analyze all dependencies to determine which files should be analyzed incrementally; no analysis is performed at this step.
- Analyze - perform incremental analysis. This step should follow Scan and can be performed either before or after the build of the solution or the project. Static analysis is performed only for the files that have been modified since the previous build.
- ScanAndAnalyze - analyze all the dependencies to determine which files should be analyzed incrementally, and immediately perform incremental analysis of the modified source files.
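On a CI server, the two-step variant might look like this (a sketch only; the solution name and report path are hypothetical, and the build step in between is whatever your pipeline already runs):

```shell
rem Step 1: determine which files changed since the last build (no analysis yet).
PVS-Studio_Cmd.exe --target "MySolution.sln" --incremental Scan

rem ... build the solution here with MSBuild as usual ...

rem Step 2: analyze only the files that were modified.
PVS-Studio_Cmd.exe --target "MySolution.sln" --incremental Analyze --output "PVS-Studio.plog"
```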
To get more detailed information about the incremental analysis mode of the command line module PVS-Studio_Cmd.exe, read the articles PVS-Studio's incremental analysis mode and Analyzing Visual C++ (.vcxproj) and Visual C# (.csproj) projects from the command line.
I should also note that the BlameNotifier utility, included in the PVS-Studio distribution, greatly complements the functionality of incremental analysis.
This utility interacts with popular version control systems (currently Git, Svn, and Mercurial are supported) to find out which developers committed the erroneous code, and sends notifications to them.
Thus, we recommend the following scenario of using the analyzer on a continuous integration server:
- perform incremental analysis on the numerous daily builds to control the quality of the newly modified files;
- perform analysis of the whole code base during the overnight builds to get the full picture of defects in the code.
Peculiarities of the incremental analysis implementation in PVS-Studio
As I have already noted, the incremental analysis mode has existed in the PVS-Studio plugin for Visual Studio for a long time. In the plugin, detection of the modified files to be analyzed incrementally is implemented with the help of Visual Studio COM wrappers. That approach cannot be used in the command line version of our analyzer, which is completely independent of Visual Studio's internal infrastructure. Since maintaining different implementations of the same functionality in different components is a bad idea, we decided right away that the plugin for Visual Studio and the command line utility PVS-Studio_Cmd.exe would share a common code base.
In theory, detecting the files modified since the last build of a project is not hard: get the modification time of the target binary file and of all the files involved in building it; the source files modified later than the target file should be added to the list of files for incremental analysis. In the real world, however, things are more complicated. In particular, for projects written in C or C++ it is very difficult to identify all the files involved in the build of the target file, for example header files that are included directly in the code and are absent from the project file. Note that under Windows both our Visual Studio plugin (which is obvious) and the command line version PVS-Studio_Cmd.exe support only the analysis of MSBuild projects; this fact greatly simplified our task. It is also worth mentioning that the Linux version of PVS-Studio supports incremental analysis out of the box: with compile monitoring, only the files that were actually built are analyzed, so incremental analysis happens automatically during an incremental build. The same holds for direct integration into the build system (for example, into makefiles).
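The "theoretical" timestamp comparison described above can be sketched as follows (a simplified illustration in Python, not the analyzer's actual code; it deliberately ignores the hard part, which is discovering the header dependencies):

```python
import os

# Naive change detection: a source file needs re-analysis if it was
# modified after the target binary was last built.
def files_needing_analysis(target_path, source_paths):
    target_mtime = os.path.getmtime(target_path)
    return [src for src in source_paths
            if os.path.getmtime(src) > target_mtime]
```

The sketch breaks down exactly where the paragraph above says it does: source_paths must list every file the compiler actually read, including transitively included headers, and that list is hard to obtain by hand.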
MSBuild provides a mechanism for tracking accesses to the file system (File Tracking). For the incremental build of projects written in C and C++, the correspondence between the source files (cpp files, header files, and so on) and the target files is written to *.tlog files. For example, for the CL task, the paths to all the source files read by the compiler are written to the CL.read.{ID}.tlog file, and the paths to the target files are saved in the CL.write.{ID}.tlog file.
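The tracking logs themselves are plain text: to the best of my knowledge, a line prefixed with '^' names the compiled root source file, and the lines that follow list the files recorded for it. A minimal Python sketch of reading such a log (an illustration of the format, not PVS-Studio's code):

```python
def parse_tracking_log(lines):
    """Map each '^'-prefixed root file to the set of file paths
    recorded for it until the next '^' line."""
    deps, current = {}, None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):  # skip header/comment lines
            continue
        if line.startswith("^"):
            current = line[1:]
            deps.setdefault(current, set())
        elif current is not None:
            deps[current].add(line)
    return deps
```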
So the CL.*.tlog files already contain all the information about the compiled source files and the target files, and the task gets simpler. However, we would still have to traverse all the source and target files and compare their modification dates. Can we simplify it even more? Of course! The Microsoft.Build.Utilities namespace contains the classes CanonicalTrackedInputFiles and CanonicalTrackedOutputFiles, which work with the CL.read.*.tlog and CL.write.*.tlog files respectively. Having created instances of these classes, we can call the method CanonicalTrackedInputFiles.ComputeSourcesNeedingCompilation to get a list of source files for compilation, based on the analysis of the target files and the dependency graph of the source files.
Let's have a look at an example of the code that gets the list of files for incremental analysis using this approach. In this example, sourceFiles is a collection of full normalized paths to all the source files of the project, and tlogDirectoryPath is the path to the directory where the *.tlog files are located.
// Wrap the source file paths into MSBuild task items.
var sourceFileTaskItems = new ITaskItem[sourceFiles.Count];
for (var index = 0; index < sourceFiles.Count; index++)
    sourceFileTaskItems[index] = new TaskItem(sourceFiles[index]);

// Load the tracking logs produced by the CL task.
var tLogWriteFiles = GetTaskItemsFromPath("CL.write.*", tlogDirectoryPath);
var tLogReadFiles = GetTaskItemsFromPath("CL.read.*", tlogDirectoryPath);

var trackedOutputFiles = new CanonicalTrackedOutputFiles(tLogWriteFiles);
var trackedInputFiles = new CanonicalTrackedInputFiles(
    tLogReadFiles, sourceFileTaskItems, trackedOutputFiles, false, false);

// Source files that changed since the target files were last built.
ITaskItem[] sourcesForIncrementalBuild =
    trackedInputFiles.ComputeSourcesNeedingCompilation(true);
Thus, using the standard tools of MSBuild, we made the mechanism that identifies files for incremental analysis the same as MSBuild's own mechanism for incremental build, which guarantees the quality of this approach.
Conclusion
In this article we looked at the advantages of using incremental analysis both on the developers' machines and on the build server. We also looked "under the hood" and learned how to detect files for incremental analysis using the facilities of MSBuild. If you are interested, I suggest downloading the trial version of our PVS-Studio analyzer and seeing what it can detect in your projects. Bugless code to you!