Tuesday, April 4, 2017

Critical errors in CryEngine V code

In May 2016, German game-development company Crytek made the, decision to upload the source code of their game engine, 'CryEngine V' to GitHub. The project is in active development, which leads to a large number of errors in the code. We have already checked the project with PVS-Studio for Windows, and now we can also analyze it using PVS-Studio for Linux. There was enough material for an article with the description of only crucial errors.
Picture 2

Introduction

CryEngine is a game engine created by the German company Crytek in the year 2002, and originally used in the first-person shooter Far Cry. There are a lot of great games made on the basis of different versions of CryEngine, by many studios who have licensed the engine: Far Cry, Crysis, Entropia Universe, Blue Mars, Warface, Homefront: The Revolution, Sniper: Ghost Warrior, Armored Warfare, Evolve and many others. In March 2016 the Crytek company announced the release of the new CryEngine V, and soon after, posted the source code on GitHub.


To perform the source code analysis, we used the PVS-Studio for Linux. Now, it has become even more convenient for the developers of cross-platform projects to track the quality of their code, with one static analysis tool. The Linux version can be downloaded as an archive, or a package for a package manager. You can set up the installation and update for the majority of distributions, using our repository.
This article only covers the general analysis warnings, and only the "High" certainty level (there are also "Medium" and "Low"). To be honest, I didn't even examine all of the "High" level warnings, because there was already enough material for an article after even a quick look. I started working on the article several times over a period of a few months, so I can say with certainty that the bugs described here have living in the code for some months already. Some of the bugs that had been found during the previous check of the project, also weren't fixed.
It was very easy to download and check the source code in Linux. Here is a list of all necessary commands:
mkdir ~/projects && cd ~/projects
git clone https://github.com/CRYTEK/CRYENGINE.git
cd CRYENGINE/
git checkout main
chmod +x ./download_sdks.py
./download_sdks.py
pvs-studio-analyzer trace -- \
  sh ./cry_waf.sh build_linux_x64_clang_profile -p gamesdk
pvs-studio-analyzer analyze \
  -l /path/to/PVS-Studio.lic \
  -o ~/projects/CRYENGINE/cryengine.log \
  -r ~/projects/CRYENGINE/ \
  -C clang++-3.8 -C clang-3.8 \
  -e ~/projects/CRYENGINE/Code/SDKs \
  -j4

plog-converter -a GA:1,2 -t tasklist \
  -o ~/projects/CRYENGINE/cryengine_ga.tasks \
  ~/projects/CRYENGINE/cryengine.log
The report file cryengine_ga.tasks can be opened and viewed in QtCreator. What did we manage to find in the source code of CryEngine V?

A strange Active() function

V501 There are identical sub-expressions to the left and to the right of the '==' operator: bActive == bActive LightEntity.h 124
void SetActive(bool bActive)
{
  if (bActive == bActive)
    return;

  m_bActive = bActive;
  OnResetState();
}
The function does nothing because of a typo. It seems to me that if there was a contest, "Super Typo", this code fragment would definitely take first place. I think this error has every chance to get into the section, "C/C++ bugs of the month".
But that's not all, here is a function from another class:
V501 There are identical sub-expressions 'm_staticObjects' to the left and to the right of the '||' operator. FeatureCollision.h 66
class CFeatureCollision : public CParticleFeature
{
public:
  CRY_PFX2_DECLARE_FEATURE

public:
  CFeatureCollision();
  ....

  bool  IsActive() const  { return m_terrain ||
m_staticObjects ||
m_staticObjects; }
  ....
  bool m_terrain;
  bool m_staticObjects;
  bool m_dynamicObjects;
};
The variable m_staticObjects is used twice in the function IsActive(), although there is an unused variable m_dynamicObjects. Perhaps, it was this variable that was meant to be used.

Code above has no bugs

V547 Expression 'outArrIndices[i] < 0' is always false. Unsigned type value is never < 0. CGFLoader.cpp 881
static bool CompactBoneVertices(....,
  DynArray<uint16>& outArrIndices, ....)           // <= uint16
{
  ....
  outArrIndices.resize(3 * inFaceCount, -1);

  int outVertexCount = 0;
  for (int i = 0; i < verts.size(); ++i)
  {
    ....
    outArrIndices[....] = outVertexCount - 1;
  }

  // Making sure that the code above has no bugs   // <= LOL
  for (int i = 0; i < outArrIndices.size(); ++i)
  {
    if (outArrIndices[i] < 0)                      // <= LOL
    {
      return false;
    }
  }
  
  return true;
}
This error is worthy of a separate section. In general, in the CryEngine code, there are a lot of fragments where unsigned variables are pointlessly compared with zero. There are hundreds of such places, but this fragment deserves special attention, because the code was written deliberately.
So, there is an array of unsigned numbers - outArrIndices. Then the array is filled according to some algorithm. After that we see a brilliant check of every array element, so that none of them has a negative number. The array elements have the uint16 type.

Memory handling errors

V512 A call of the 'memcpy' function will lead to underflow of the buffer 'hashableData'. GeomCacheRenderNode.cpp 285
void CGeomCacheRenderNode::Render(....)
{
  ....
  CREGeomCache* pCREGeomCache = iter->second.m_pRenderElement;
  ....
  uint8 hashableData[] =
  {
    0, 0, 0, 0, 0, 0, 0, 0,
    (uint8)std::distance(pCREGeomCache->....->begin(), &meshData),
    (uint8)std::distance(meshData....->....begin(), &chunk),
    (uint8)std::distance(meshData.m_instances.begin(), &instance)
  };

  memcpy(hashableData, pCREGeomCache, sizeof(pCREGeomCache));
  ....
}
Pay attention to the arguments of the memcpy() function. The programmer plans to copy the object pCREGeomCache to the array hashableData, but he accidentally copies not the size of the object, but the size of the pointer using the sizeof operator. Due to the error, the object is not copied completely, only 4 or 8 bytes.
V568 It's odd that 'sizeof()' operator evaluates the size of a pointer to a class, but not the size of the 'this' class object. ClipVolumeManager.cpp 145
void
CClipVolumeManager::GetMemoryUsage(class ICrySizer* pSizer) const
{
  pSizer->AddObject(this, sizeof(this));
  for (size_t i = 0; i < m_ClipVolumes.size(); ++i)
    pSizer->AddObject(m_ClipVolumes[i].m_pVolume);
}
A similar mistake was made when the programmer evaluated the size of this pointer instead of the size of a class. Correct variant: sizeof(*this).
V530 The return value of function 'release' is required to be utilized. ClipVolumes.cpp 492
vector<unique_ptr<CFullscreenPass>> m_jitteredDepthPassArray;

void CClipVolumesStage::PrepareVolumetricFog()
{
  ....
  for (int32 i = 0; i < m_jitteredDepthPassArray.size(); ++i)
  {
    m_jitteredDepthPassArray[i].release();
  }

  m_jitteredDepthPassArray.resize(depth);

  for (int32 i = 0; i < depth; ++i)
  {
    m_jitteredDepthPassArray[i] = CryMakeUnique<....>();
    m_jitteredDepthPassArray[i]->SetViewport(viewport);
    m_jitteredDepthPassArray[i]->SetFlags(....);
  }
  ....
}
If we look at the documentation for the class std::unique_ptr, the release() function should be used as follows:
std::unique_ptr<Foo> up(new Foo());
Foo* fp = up.release();
delete fp;
Most likely, it was meant to use the reset() function instead of the release() one.
V549 The first argument of 'memcpy' function is equal to the second argument. ObjectsTree_Serialize.cpp 1135
void COctreeNode::LoadSingleObject(....)
{
  ....
  float* pAuxDataDst = pObj->GetAuxSerializationDataPtr(....);
  const float* pAuxDataSrc = StepData<float>(....);
  memcpy(pAuxDataDst, pAuxDataDst, min(....) * sizeof(float));
  ....
}
It was forgotten, to pass pAuxDataSrc to the memcpy() function. Instead of this, the same variable pAuxDataDst is used as both source and destination. No one is immune to errors.
By the way, those who are willing, may test their programming skills and attentiveness, by doing a quiz on the detection of similar bugs: q.viva64.com.

Strange code

V501 There are identical sub-expressions to the left and to the right of the '||' operator: val == 0 || val == - 0 XMLCPB_AttrWriter.cpp 363
void CAttrWriter::PackFloatInSemiConstType(float val, ....)
{
  uint32 type = PFSC_VAL;

  if (val == 0 || val == -0)  // <=
    type = PFSC_0;
  else if (val == 1)
    type = PFSC_1;
  else if (val == -1)
    type = PFSC_N1;

  ....
}
The developers planned to compare a real val variable with a positive zero and with a negative zero, but did this incorrectly. The values of zeros became the same after the integer constants were declared.
Most likely, the code should be corrected in the following way, by declaring real-type constants:
if (val == 0.0f || val == -0.0f)
    type = PFSC_0;
On the other hand, the conditional expression is redundant, as it is enough to compare the variable with a usual zero. This is why the code is executed in the way the programmer expected.
But, if it is necessary to identify the negative zero, then it would be more correct to do it with the std::signbit function.
V501 There are identical sub-expressions 'm_joints[i].limits[1][j]' to the left and to the right of the '-' operator. articulatedentity.cpp 1326
int CArticulatedEntity::Step(float time_interval)
{
  ....
  for (j=0;j<3;j++) if (!(m_joints[i].flags & angle0_locked<<j)&&
    isneg(m_joints[i].limits[0][j]-m_joints[i].qext[j]) +
    isneg(m_joints[i].qext[j]-m_joints[i].limits[1][j]) + 
    isneg(m_joints[i].limits[1][j]-m_joints[i].limits[1][j]) < 2)
  {
    ....
}
In the last part of the conditional expression there is subtraction of the variable m_joints[i].limits[1][j] from itself. The code looks suspicious. There are a lot of indexes in the expression, one of them probably has an error.
One more similar fragment:
  • V501 There are identical sub-expressions 'm_joints[op[1]].limits[1][i]' to the left and to the right of the '-' operator. articulatedentity.cpp 513
V590 Consider inspecting this expression. The expression is excessive or contains a misprint. GoalOp_Crysis2.cpp 3779
void COPCrysis2FlightFireWeapons::ParseParam(....)
{
  ....
  bool paused;
  value.GetValue(paused);

  if (paused && (m_State != eFP_PAUSED) &&
(m_State != eFP_PAUSED_OVERRIDE))
  {
    m_NextState = m_State;
    m_State = eFP_PAUSED;
    m_PausedTime = 0.0f;
    m_PauseOverrideTime = 0.0f;
  }
  else if (!paused && (m_State == eFP_PAUSED) &&        // <=
(m_State != eFP_PAUSED_OVERRIDE)) // <=
  {
    m_State = m_NextState;
    m_NextState = eFP_STOP;

    m_PausedTime = 0.0f;
    m_PauseOverrideTime = 0.0f;
  }
  ....
}
A conditional expression is written in such a way that the result does not depend on the subexpression m_State != eFP_PAUSED_OVERRIDE. But is it really worth speaking about here if this code fragment is still not fixed after the first article?
In case it is interesting, I have already described the same kind of errors in the article "Logical Expressions in C/C++. Mistakes Made by Professionals".
V529 Odd semicolon ';' after 'for' operator. boolean3d.cpp 1077
int CTriMesh::Slice(...)
{
  ....
  pmd->pMesh[0]=pmd->pMesh[1] = this;  AddRef();AddRef();
  for(pmd0=m_pMeshUpdate; pmd0->next; pmd0=pmd0->next); // <=
    pmd0->next = pmd;
  ....
}
One more code fragment that remained uncorrected since the last project check. But it is still unclear if this is a formatting error, or a mistake in logic.

About pointers

V522 Dereferencing of the null pointer 'pCEntity' might take place. BreakableManager.cpp 2396
int CBreakableManager::HandlePhysics_UpdateMeshEvent(....)
{
  CEntity* pCEntity = 0;
  ....
  if (pmu && pSrcStatObj && GetSurfaceType(pSrcStatObj))
  {
    ....
    if (pEffect)
    {
      ....
      if (normal.len2() > 0)
        pEffect->Spawn(true, pCEntity->GetSlotWorldTM(...); // <=
    }
  }

  ....

  if (iForeignData == PHYS_FOREIGN_ID_ENTITY)
  {
    pCEntity = (CEntity*)pForeignData;
    if (!pCEntity || !pCEntity->GetPhysicalProxy())
      return 1;
  }
  ....
}
The analyzer detected null pointer dereference. The code of the function is written or refactored in such a way that there is now a branch of code, where the pointer pCEntity will beinitialized by a zero.
Now let's have a look at the variant of a potential dereference of a null pointer.
V595 The 'pTrack' pointer was utilized before it was verified against nullptr. Check lines: 60, 61. AudioNode.cpp 60
void CAudioNode::Animate(SAnimContext& animContext)
{
  ....
  const bool bMuted = gEnv->IsEditor() && (pTrack->GetFlags() &
    IAnimTrack::eAnimTrackFlags_Muted);
  if (!pTrack || pTrack->GetNumKeys() == 0 ||
       pTrack->GetFlags() & IAnimTrack::eAnimTrackFlags_Disabled)
  {
    continue;
  }
  ....
}
The author of this code first used the pointer pTrack, but its validity is checked on the next line of code before the dereference. Most likely, this is not how the program should work.
There were a lot of V595 warnings, they won't really fit into the article. Very often, such code is a real error, but thanks to luck, the code works correctly.
V571 Recurring check. The 'if (rLightInfo.m_pDynTexture)' condition was already verified in line 69. ObjMan.cpp 70
// Safe memory helpers
#define SAFE_RELEASE(p){ if (p) { (p)->Release(); (p) = NULL; } }

void CObjManager::UnloadVegetationModels(bool bDeleteAll)
{
  ....
  SVegetationSpriteLightInfo& rLightInfo = ....;
  if (rLightInfo.m_pDynTexture)
    SAFE_RELEASE(rLightInfo.m_pDynTexture);
  ....
}
In this fragment there is no serious error, but it is not necessary to write extra code, if the corresponding checks are already included in the special macro.
One more fragment with redundant code:
  • V571 Recurring check. The 'if (m_pSectorGroups)' condition was already verified in line 48. PartitionGrid.cpp 50
V575 The 'memcpy' function doesn't copy the whole string. Use 'strcpy / strcpy_s' function to preserve terminal null. SystemInit.cpp 4045
class CLvlRes_finalstep : public CLvlRes_base
{
  ....
  for (;; )
  {
    if (*p == '/' || *p == '\\' || *p == 0)
    {
      char cOldChar = *p;
      *p = 0; // create zero termination
      _finddata_t fd;

      bool bOk = FindFile(szFilePath, szFile, fd);

      if (bOk)
        assert(strlen(szFile) == strlen(fd.name));

      *p = cOldChar; // get back the old separator

      if (!bOk)
        return;

      memcpy((void*)szFile, fd.name, strlen(fd.name)); // <=

      if (*p == 0)
        break;

      ++p;
      szFile = p;
    }
    else ++p;
  }
  ....
}
There might be an error in this code. The last terminal null is lost during the copying of the last string. In this case it is necessary to copy the strlen() + 1 symbol or use special functions for copying the strings: strcpy or strcpy_s.

Problems with a comma

V521 Such expressions using the ',' operator are dangerous. Make sure the expression '!sWords[iWord].empty(), iWord ++' is correct. TacticalPointSystem.cpp 3243
bool CTacticalPointSystem::Parse(....) const
{
  string sInput(sSpec);
  const int MAXWORDS = 8;
  string sWords[MAXWORDS];

  int iC = 0, iWord = 0;
  for (; iWord < MAXWORDS; !sWords[iWord].empty(), iWord++) // <=
  {
    sWords[iWord] = sInput.Tokenize("_", iC);
  }
  ....
}
Note the section of the for loop with the counters. What is a logic expression doing there? Most likely, it should be moved to the loop condition; thus we'll have the following code:
for (; iWord < MAXWORDS && !sWords[iWord].empty(); iWord++) {...}
V521 Such expressions using the ',' operator are dangerous. Make sure the expression is correct. HommingSwarmProjectile.cpp 187
void CHommingSwarmProjectile::HandleEvent(....)
{
  ....
  explodeDesc.normal = -pCollision->n,pCollision->vloc[0];
  ....
}
One more strange code fragment with the ',' operator.

Suspicious conditions

V571 Recurring check. The 'if (pos == npos)' condition was already verified in line 1530. CryString.h 1539
//! Find last single character.
// \return -1 if not found, distance from beginning otherwise.
template<class T>
inline typename CryStringT<T>::....::rfind(....) const
{
  const_str str;
  if (pos == npos)
  {
    // find last single character
    str = _strrchr(m_str, ch);
    // return -1 if not found, distance from beginning otherwise
    return (str == NULL) ?
      (size_type) - 1 : (size_type)(str - m_str);
  }
  else
  {
    if (pos == npos)
    {
      pos = length();
    }
    if (pos > length())
    {
      return npos;
    }

    value_type tmp = m_str[pos + 1];
    m_str[pos + 1] = 0;
    str = _strrchr(m_str, ch);
    m_str[pos + 1] = tmp;
  }
  return (str == NULL) ?
   (size_type) - 1 : (size_type)(str - m_str);
}
The analyzer detected a repeated check of the pos variable. A part of the code will never be executed because of this error. There is also duplicate code in the function, that's why this function is worth rewriting.
This code was successfully duplicated in another place:
  • V571 Recurring check. The 'if (pos == npos)' condition was already verified in line 1262. CryFixedString.h 1271
V523 The 'then' statement is equivalent to the 'else' statement. ScriptTable.cpp 789
bool CScriptTable::AddFunction(const SUserFunctionDesc& fd)
{
  ....
  char sFuncSignature[256];
  if (fd.sGlobalName[0] != 0)
    cry_sprintf(sFuncSignature, "%s.%s(%s)", fd.sGlobalName,
      fd.sFunctionName, fd.sFunctionParams);
  else
    cry_sprintf(sFuncSignature, "%s.%s(%s)", fd.sGlobalName,
      fd.sFunctionName, fd.sFunctionParams);
  ....
}
There is an attempt to print the string regardless of its content. There are many such fragments in the code, here are some of them:
  • V523 The 'then' statement is equivalent to the 'else' statement. BudgetingSystem.cpp 718
  • V523 The 'then' statement is equivalent to the 'else' statement. D3DShadows.cpp 627
  • V523 The 'then' statement is equivalent to the 'else' statement. livingentity.cpp 967

Undefined behavior

V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. physicalplaceholder.h 25
class CPhysicalEntity;
const int NO_GRID_REG = -1<<14;
const int GRID_REG_PENDING = NO_GRID_REG+1;
const int GRID_REG_LAST = NO_GRID_REG+2;
The analyzer can find several types of error which lead to undefined behavior. According to the latest standard of the language, the shift of a negative number to the left results in undefined behavior.
Here are some more dubious places:
  • V610 Undefined behavior. Check the shift operator '<<'. The left operand '~(TFragSeqStorage(0))' is negative. UDPDatagramSocket.cpp 757
  • V610 Undefined behavior. Check the shift operator '<<'. The right operand ('cpu' = [0..1023]) is greater than or equal to the length in bits of the promoted left operand. CryThreadUtil_posix.h 115
  • V610 Undefined behavior. Check the shift operator '>>'. The right operand is negative ('comp' = [-1..3]). ShaderComponents.cpp 399
  • V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. trimesh.cpp 4126
  • V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. trimesh.cpp 4559
  • V610 Unspecified behavior. Check the shift operator '>>'. The left operand '-NRAYS' is negative. trimesh.cpp 4618
  • V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. tetrlattice.cpp 324
  • V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. tetrlattice.cpp 350
  • V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. tetrlattice.cpp 617
  • V610 Undefined behavior. Check the shift operator '<<'. The left operand '-1' is negative. tetrlattice.cpp 622
Another type of undefined behavior is related to the repeated changes of a variable between two sequence points:
V567 Undefined behavior. The 'm_current' variable is modified while being used twice between sequence points. OperatorQueue.cpp 101
boolCOperatorQueue::Prepare(....)
{
  ++m_current &= 1;
  m_ops[m_current].clear();
  return true;
}
Unfortunately, this fragment is not the only one.
  • V567 Undefined behavior. The 'm_commandBufferIndex' variable is modified while being used twice between sequence points. XConsole.cpp 180
  • V567 Undefined behavior. The 'itail' variable is modified while being used twice between sequence points. trimesh.cpp 3119
  • V567 Undefined behavior. The 'ihead' variable is modified while being used twice between sequence points. trimesh.cpp 3126
  • V567 Undefined behavior. The 'ivtx' variable is modified while being used twice between sequence points. boolean3d.cpp 957
  • V567 Undefined behavior. The 'ivtx' variable is modified while being used twice between sequence points. boolean3d.cpp 965
  • V567 Undefined behavior. The 'ivtx' variable is modified while being used twice between sequence points. boolean3d.cpp 983
  • V567 Undefined behavior. The 'm_iNextAnimIndex' variable is modified while being used twice between sequence points. HitDeathReactionsDefs.cpp 192

Questions for the developers

In the CryEngine V code I saw quite an amusing way of communication between the developers with the help of comments.
Here is the most hilarious comment that I found with the help of the warning:
V763 Parameter 'enable' is always rewritten in function body before being used.
void CNetContext::EnableBackgroundPassthrough(bool enable)
{
  SCOPED_GLOBAL_LOCK;
  // THIS IS A TEMPORARY HACK TO MAKE THE GAME PLAY NICELY,
  // ASK peter@crytek WHY IT'S STILL HERE
  enable = false;
  ....
}
Further on, I decided to look for similar texts and note down a couple of them:
....
// please ask me when you want to change [tetsuji]
....
// please ask me when you want to change [dejan]
....
//if there are problems with this function, ask Ivo
uint32 numAnims = 
  pCharacter->GetISkeletonAnim()->GetNumAnimsInFIFO(layer);
if (numAnims)
  return pH->EndFunction(true);
....
//ask Ivo for details
//if (pCharacter->GetCurAnimation() &&
//    pCharacter->GetCurAnimation()[0] != '\0')
//  return pH->EndFunction(pCharacter->GetCurAnimation());
....
/////////////////////////////////////////////////////////////////
// Strange, !do not remove... ask Timur for the meaning of this.
/////////////////////////////////////////////////////////////////
if (m_nStrangeRatio > 32767)
{
  gEnv->pScriptSystem->SetGCFrequency(-1); // lets get nasty.
}
/////////////////////////////////////////////////////////////////
// Strange, !do not remove... ask Timur for the meaning of this.
/////////////////////////////////////////////////////////////////
if (m_nStrangeRatio > 1000)
{
  if (m_pProcess && (m_pProcess->GetFlags() & PROC_3DENGINE))
    m_nStrangeRatio += cry_random(1, 11);
}
/////////////////////////////////////////////////////////////////
....
// tank specific:
// avoid steering input around 0.5 (ask Anton)
....
CryWarning(VALIDATOR_MODULE_EDITOR, VALIDATOR_WARNING,
  "....: Wrong edited item. Ask AlexL to fix this.");
....
// If this renders black ask McJohn what's wrong.
glGenerateMipmap(GL_TEXTURE_2D);
....
The most important question to the developers: why don't they use specialized tools for the improvement of their code? Of course, I mean PVS-Studio. :)
I should note once again that this article provides only some of the errors we found. I didn't even get to the end of the High level warnings. So, the project is still waiting for those who may come and check it more thoroughly. Unfortunately, I cannot spend that much time, because dozens of other projects are waiting for me.

Conclusion

Having worked on the development of an analyzer, I came to the conclusion that it is just impossible to avoid errors, if the team increases or decreases in size. I am really not against Code Review, but it's not hard to count the amount of time that a team lead will have to spend reviewing the code of ten people. What about the next day? What if the number of developers is more than 10? In this case, the Code Review would only be necessary when editing key components of the product. This approach would be extremely ineffective if there is more code, and more people, in a team. The automated check of the code with the help of static analyzers will greatly help the situation. It is not a substitute for the existing tests, but a completely different approach to the code quality (by the way, static analyzers find errors in the tests too). Fixing bugs at the earliest stages of development doesn't really cost anything, unlike those that are found during the testing phase; the errors in the released product may have enormous cost.
You may download and try PVS-Studio by this link.
In case you want to discuss the licensing options, prices, and discounts, contact us at support.
Picture 3
Don't make the unicorn sad by writing bad code...

No comments:

Post a Comment