The most beautiful bug I found with PVS-Studio in 2024

The most beautiful bug I found with PVS-Studio in 2024

I warn you right away, my tastes are very specific. The beauty of a mistake is that it is very difficult for a person to find it. I don’t believe it can be seen during code review. If you know in advance that it exists, and look for it purposefully.

I found errors in the DPDK project. It has other errors, but more about them later. They pale in comparison to this diamond. Just don’t expect anything like that. The mistake is ridiculously simple. That’s how difficult it is to find it by looking at the code. Actually, try it yourself.

To do this, first study the list of named constants:

enum dbg_status

enum dbg_status {
  DBG_STATUS_OK,
  DBG_STATUS_APP_VERSION_NOT_SET,
  DBG_STATUS_UNSUPPORTED_APP_VERSION,
  DBG_STATUS_DBG_BLOCK_NOT_RESET,
  DBG_STATUS_INVALID_ARGS,
  DBG_STATUS_OUTPUT_ALREADY_SET,
  DBG_STATUS_INVALID_PCI_BUF_SIZE,
  DBG_STATUS_PCI_BUF_ALLOC_FAILED,
  DBG_STATUS_PCI_BUF_NOT_ALLOCATED,
  DBG_STATUS_INVALID_FILTER_TRIGGER_DWORDS,
  DBG_STATUS_NO_MATCHING_FRAMING_MODE,
  DBG_STATUS_VFC_READ_ERROR,
  DBG_STATUS_STORM_ALREADY_ENABLED,
  DBG_STATUS_STORM_NOT_ENABLED,
  DBG_STATUS_BLOCK_ALREADY_ENABLED,
  DBG_STATUS_BLOCK_NOT_ENABLED,
  DBG_STATUS_NO_INPUT_ENABLED,
  DBG_STATUS_NO_FILTER_TRIGGER_256B,
  DBG_STATUS_FILTER_ALREADY_ENABLED,
  DBG_STATUS_TRIGGER_ALREADY_ENABLED,
  DBG_STATUS_TRIGGER_NOT_ENABLED,
  DBG_STATUS_CANT_ADD_CONSTRAINT,
  DBG_STATUS_TOO_MANY_TRIGGER_STATES,
  DBG_STATUS_TOO_MANY_CONSTRAINTS,
  DBG_STATUS_RECORDING_NOT_STARTED,
  DBG_STATUS_DATA_DIDNT_TRIGGER,
  DBG_STATUS_NO_DATA_RECORDED,
  DBG_STATUS_DUMP_BUF_TOO_SMALL,
  DBG_STATUS_DUMP_NOT_CHUNK_ALIGNED,
  DBG_STATUS_UNKNOWN_CHIP,
  DBG_STATUS_VIRT_MEM_ALLOC_FAILED,
  DBG_STATUS_BLOCK_IN_RESET,
  DBG_STATUS_INVALID_TRACE_SIGNATURE,
  DBG_STATUS_INVALID_NVRAM_BUNDLE,
  DBG_STATUS_NVRAM_GET_IMAGE_FAILED,
  DBG_STATUS_NON_ALIGNED_NVRAM_IMAGE,
  DBG_STATUS_NVRAM_READ_FAILED,
  DBG_STATUS_IDLE_CHK_PARSE_FAILED,
  DBG_STATUS_MCP_TRACE_BAD_DATA,
  DBG_STATUS_MCP_TRACE_NO_META,
  DBG_STATUS_MCP_COULD_NOT_HALT,
  DBG_STATUS_MCP_COULD_NOT_RESUME,
  DBG_STATUS_RESERVED0,
  DBG_STATUS_SEMI_FIFO_NOT_EMPTY,
  DBG_STATUS_IGU_FIFO_BAD_DATA,
  DBG_STATUS_MCP_COULD_NOT_MASK_PRTY,
  DBG_STATUS_FW_ASSERTS_PARSE_FAILED,
  DBG_STATUS_REG_FIFO_BAD_DATA,
  DBG_STATUS_PROTECTION_OVERRIDE_BAD_DATA,
  DBG_STATUS_DBG_ARRAY_NOT_SET,
  DBG_STATUS_RESERVED1,
  DBG_STATUS_NON_MATCHING_LINES,
  DBG_STATUS_INSUFFICIENT_HW_IDS,
  DBG_STATUS_DBG_BUS_IN_USE,
  DBG_STATUS_INVALID_STORM_DBG_MODE,
  DBG_STATUS_OTHER_ENGINE_BB_ONLY,
  DBG_STATUS_FILTER_SINGLE_HW_ID,
  DBG_STATUS_TRIGGER_SINGLE_HW_ID,
  DBG_STATUS_MISSING_TRIGGER_STATE_STORM,
  MAX_DBG_STATUS
};

Now examine the string array:

static const char * const s_status_str[] = ….

static const char * const s_status_str[] = {
  /* DBG_STATUS_OK */
  "Operation completed successfully",

  /* DBG_STATUS_APP_VERSION_NOT_SET */
  "Debug application version wasn't set",

  /* DBG_STATUS_UNSUPPORTED_APP_VERSION */
  "Unsupported debug application version",

  /* DBG_STATUS_DBG_BLOCK_NOT_RESET */
  "The debug block wasn't reset since the last recording",

  /* DBG_STATUS_INVALID_ARGS */
  "Invalid arguments",

  /* DBG_STATUS_OUTPUT_ALREADY_SET */
  "The debug output was already set",

  /* DBG_STATUS_INVALID_PCI_BUF_SIZE */
  "Invalid PCI buffer size",

  /* DBG_STATUS_PCI_BUF_ALLOC_FAILED */
  "PCI buffer allocation failed",

  /* DBG_STATUS_PCI_BUF_NOT_ALLOCATED */
  "A PCI buffer wasn't allocated",

  /* DBG_STATUS_INVALID_FILTER_TRIGGER_DWORDS */
  "The filter/trigger constraint dword offsets are not "
  "enabled for recording",

  /* DBG_STATUS_VFC_READ_ERROR */
  "Error reading from VFC",

  /* DBG_STATUS_STORM_ALREADY_ENABLED */
  "The Storm was already enabled",

  /* DBG_STATUS_STORM_NOT_ENABLED */
  "The specified Storm wasn't enabled",

  /* DBG_STATUS_BLOCK_ALREADY_ENABLED */
  "The block was already enabled",

  /* DBG_STATUS_BLOCK_NOT_ENABLED */
  "The specified block wasn't enabled",

  /* DBG_STATUS_NO_INPUT_ENABLED */
  "No input was enabled for recording",

  /* DBG_STATUS_NO_FILTER_TRIGGER_256B */
  "Filters and triggers are not allowed in E4 256-bit mode",

  /* DBG_STATUS_FILTER_ALREADY_ENABLED */
  "The filter was already enabled",

  /* DBG_STATUS_TRIGGER_ALREADY_ENABLED */
  "The trigger was already enabled",

  /* DBG_STATUS_TRIGGER_NOT_ENABLED */
  "The trigger wasn't enabled",

  /* DBG_STATUS_CANT_ADD_CONSTRAINT */
  "A constraint can be added only after a filter was "
  "enabled or a trigger state was added",

  /* DBG_STATUS_TOO_MANY_TRIGGER_STATES */
  "Cannot add more than 3 trigger states",

  /* DBG_STATUS_TOO_MANY_CONSTRAINTS */
  "Cannot add more than 4 constraints per filter or trigger state",

  /* DBG_STATUS_RECORDING_NOT_STARTED */
  "The recording wasn't started",

  /* DBG_STATUS_DATA_DID_NOT_TRIGGER */
  "A trigger was configured, but it didn't trigger",

  /* DBG_STATUS_NO_DATA_RECORDED */
  "No data was recorded",

  /* DBG_STATUS_DUMP_BUF_TOO_SMALL */
  "Dump buffer is too small",

  /* DBG_STATUS_DUMP_NOT_CHUNK_ALIGNED */
  "Dumped data is not aligned to chunks",

  /* DBG_STATUS_UNKNOWN_CHIP */
  "Unknown chip",

  /* DBG_STATUS_VIRT_MEM_ALLOC_FAILED */
  "Failed allocating virtual memory",

  /* DBG_STATUS_BLOCK_IN_RESET */
  "The input block is in reset",

  /* DBG_STATUS_INVALID_TRACE_SIGNATURE */
  "Invalid MCP trace signature found in NVRAM",

  /* DBG_STATUS_INVALID_NVRAM_BUNDLE */
  "Invalid bundle ID found in NVRAM",

  /* DBG_STATUS_NVRAM_GET_IMAGE_FAILED */
  "Failed getting NVRAM image",

  /* DBG_STATUS_NON_ALIGNED_NVRAM_IMAGE */
  "NVRAM image is not dword-aligned",

  /* DBG_STATUS_NVRAM_READ_FAILED */
  "Failed reading from NVRAM",

  /* DBG_STATUS_IDLE_CHK_PARSE_FAILED */
  "Idle check parsing failed",

  /* DBG_STATUS_MCP_TRACE_BAD_DATA */
  "MCP Trace data is corrupt",

  /* DBG_STATUS_MCP_TRACE_NO_META */
  "Dump doesn't contain meta data - it must be provided in image file",

  /* DBG_STATUS_MCP_COULD_NOT_HALT */
  "Failed to halt MCP",

  /* DBG_STATUS_MCP_COULD_NOT_RESUME */
  "Failed to resume MCP after halt",

  /* DBG_STATUS_RESERVED0 */
  "",

  /* DBG_STATUS_SEMI_FIFO_NOT_EMPTY */
  "Failed to empty SEMI sync FIFO",

  /* DBG_STATUS_IGU_FIFO_BAD_DATA */
  "IGU FIFO data is corrupt",

  /* DBG_STATUS_MCP_COULD_NOT_MASK_PRTY */
  "MCP failed to mask parities",

  /* DBG_STATUS_FW_ASSERTS_PARSE_FAILED */
  "FW Asserts parsing failed",

  /* DBG_STATUS_REG_FIFO_BAD_DATA */
  "GRC FIFO data is corrupt",

  /* DBG_STATUS_PROTECTION_OVERRIDE_BAD_DATA */
  "Protection Override data is corrupt",

  /* DBG_STATUS_DBG_ARRAY_NOT_SET */
  "Debug arrays were not set "
  "(when using binary files, dbg_set_bin_ptr must be called)",

  /* DBG_STATUS_RESERVED1 */
  "",

  /* DBG_STATUS_NON_MATCHING_LINES */
  "Non-matching debug lines - in E4, all lines must be of "
  "the same type (either 128b or 256b)",

  /* DBG_STATUS_INSUFFICIENT_HW_IDS */
  "Insufficient HW IDs. Try to record less Storms/blocks",

  /* DBG_STATUS_DBG_BUS_IN_USE */
  "The debug bus is in use",

  /* DBG_STATUS_INVALID_STORM_DBG_MODE */
  "The storm debug mode is not supported in the current chip",

  /* DBG_STATUS_OTHER_ENGINE_BB_ONLY */
  "Other engine is supported only in BB",

  /* DBG_STATUS_FILTER_SINGLE_HW_ID */
  "The configured filter mode requires a single Storm/block input",

  /* DBG_STATUS_TRIGGER_SINGLE_HW_ID */
  "The configured filter mode requires that all the constraints of a "
  "single trigger state will be defined on a single Storm/block input",

  /* DBG_STATUS_MISSING_TRIGGER_STATE_STORM */
  "When triggering on Storm data, the Storm to trigger on must be specified"
};

The last is the code that converts a named constant to a string:

const char *qed_dbg_get_status_str(enum dbg_status status)
{
  return (status 

Where is the error?

Did you find it?

In fact, there is a hint that you can theoretically latch on to. In an array, all lines are separated by one blank line (see the first comment to the article), except for one space. But let’s start with a warning from the PVS-Studio code analyzer:

V557 Array overrun is possible. The value of ‘status’ index could reach 58. qede_debug.c 7149

It points to the place where the row pointer is removed from the array by index:

return (status 

At first I made the wrong boring diagnosis. Due to inattention, I thought that I had an off-by-one error. This is very similar to a typical incorrect validation that I have encountered many times in various projects.

The essence of the classical antipattern. THERE ARE enumthe last element of which is used as the number of elements in it.

enum Efoo {
  A,
  B,
  C,
  COUNT
};

Constants in enum are assigned values ​​starting from 0. Accordingly, the auxiliary constant COUNT will be equal to 3, which corresponds to the number of main named constants.

There is an array where something corresponds to each named constant:

char Cfoo[] = { 'A', 'B', 'C' };

A bug is often allowed when checking that the index does not go beyond the array boundary:

char Convert(unsigned id)
{
  return (id > COUNT) ? 0 : Cfoo[id];
}

Validation does not work properly: if id will be equal COUNTThen there will be an exit to the border of the array. For such errors, the PVS-Studio analyzer issues a warning similar to the one discussed earlier.

One of the following options will be correct:

return (id >= COUNT) ? 0 : Cfoo[id];  // OK
return (id 

Nothing interesting. We part ways. In the article, you can write only a warning along with other errors of going abroad, but the bug itself cannot be analyzed in the article about checking DPDK.

And here at the last moment – STOP!

THE CORRECT CHECK IS WRITTEN HERE TOO!

return (status 

Now it is unclear and interesting! Why did the analyzer issue a warning? Misfire? It is unlikely, there is nowhere for him to get confused.

Rolling up my sleeves, I begin to study the code carefully. Here it is! Missing line for constant DBG_STATUS_NO_MATCHING_FRAMING_MODE.

In enum:

DBG_STATUS_INVALID_FILTER_TRIGGER_DWORDS,
DBG_STATUS_NO_MATCHING_FRAMING_MODE,
DBG_STATUS_VFC_READ_ERROR,

In the array:

/* DBG_STATUS_INVALID_FILTER_TRIGGER_DWORDS */
"The filter/trigger constraint dword offsets are not "
"enabled for recording",

/* DBG_STATUS_VFC_READ_ERROR */
"Error reading from VFC",

Note the delimiter consisting of two empty lines (see the first comment on the article). I mentioned this artifact before. Something went wrong here. Maybe there was some kind of failure while populating the array, maybe a row was accidentally deleted, maybe a bad branch merge, or something else.

However, now that the bug is found, the two empty lines look suspicious. No one will pay attention to them during the inspection. And even accidentally noticing that they will simply delete one of them for beauty. No one will say, “we must have missed something here, let’s go and check” 🙂

As a result of this error:

  • possible exit to the border of the massif;
  • function returns invalid strings for all constants starting with DBG_STATUS_NO_MATCHING_FRAMING_MODE.

It’s beautiful in my opinion. Thank you for your attention. Check your code regularly with PVS-Studio.

PS

A colleague suggested adding to the article that this kind of error could be avoided by using compile-time checking. Code examples from static_assert for C and C++. C++ is smarter with calculating array sizes, but that’s another story. Apparently, such checks should be added everywhere. But for such large arrays, why not.

If you would like to share this article with an English-speaking audience, please use the translation link: Andrey Karpov. Most striking error I found with PVS-Studio in 2024.

Related posts