I found a fragment similar to this in some (C++) code I'm preparing for a 64-bit port.
int n;
size_t pos, npos;
/* ... initialization ... */
while((pos = find(ch, start)) != npos)
{
/* ... advance start position ... */
n++; // this will overflow if the loop iterates too many times
}
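While I seriously doubt this would actually cause a problem in even memory-intensive applications, it's worth looking at from a theoretical standpoint, because similar errors could surface that *will* cause problems. (Change `n` to a `short` in the above example and even small files could overflow the counter.)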
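To make the small-counter case concrete, here is a minimal, self-contained sketch of the loop above. The name `count_char` is hypothetical, `std::string::find` stands in for the original `find`, and `std::uint16_t` stands in for `short` so that the wraparound is well-defined rather than implementation-dependent.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: counts occurrences of ch in text, using a counter
// of type Counter. The loop condition never involves the counter.
template <typename Counter>
Counter count_char(const std::string& text, char ch) {
    Counter n = 0;
    std::size_t start = 0;
    std::size_t pos;
    while ((pos = text.find(ch, start)) != std::string::npos) {
        start = pos + 1;  // advance past the match just found
        ++n;              // wraps silently if Counter is too narrow
    }
    return n;
}
```

With a 70,000-character input consisting only of the searched character, `count_char<std::uint16_t>` wraps past 65,535 while `count_char<std::size_t>` returns the full count, and nothing in the loop condition hints at the difference.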
Static analysis tools are useful, but they can't detect this kind of error directly. (Not yet, anyway.) The counter `n` doesn't participate in the `while` expression at all, so this isn't as simple as other loops (where typecasting errors give the error away). Any tool would need to determine that the loop would execute more than 2^31 times, but that means it needs to be able to estimate how many times the expression `(pos = find(ch, start)) != npos` will evaluate as true, which is no small feat! Even if a tool could determine that the loop could execute more than 2^31 times (say, because it recognizes the `find` function is working on a string), how could it know that the loop won't execute more than 2^64 times, overflowing a `size_t` value, too?
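As a rough sketch of the range comparison involved, the check below asks whether a counter type can represent every value a `size_t` can. The name `can_count_all_positions` is hypothetical. It rests on an assumption about the loop: if each iteration advances to a distinct position in a string, the iteration count cannot exceed the string's length, which itself fits in a `size_t`.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical compile-time check: true if Counter can hold any value
// that std::size_t can, so it cannot overflow while counting distinct
// positions in a size_t-indexed sequence.
template <typename Counter>
constexpr bool can_count_all_positions() {
    return static_cast<std::uintmax_t>(std::numeric_limits<Counter>::max()) >=
           static_cast<std::uintmax_t>(std::numeric_limits<std::size_t>::max());
}

static_assert(can_count_all_positions<std::size_t>(),
              "size_t can always count size_t-indexed positions");
static_assert(!can_count_all_positions<std::int16_t>(),
              "a 16-bit signed counter is narrower than size_t");
```

This only compares type ranges; it says nothing about how many times a particular loop really runs, which is the hard part described above.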
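Clearly, it takes a human eye to conclusively identify and fix this kind of error, but are there patterns that give this kind of error away so it can be manually inspected? What similar errors exist that I should be watchful for?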
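EDIT 1: Since `short`, `int` and `long` types are inherently problematic, this kind of error could be found by examining every instance of those types. However, given their ubiquity in legacy C++ code, I'm not sure this is practical for a large piece of software. What else gives away this error? Is each `while` loop likely to exhibit some kind of error like this? (`for` loops certainly aren't immune to it!) How bad is this kind of error if we're not dealing with 16-bit types like `short`?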
EDIT 2: Here's another example, showing how this error appears in a `for` loop.
int i = 0;
for (iter = c.begin(); iter != c.end(); iter++, i++)
{
/* ... */
}
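Fundamentally, it's the same problem: the loop relies on some variable that never directly interacts with a wider type. The variable can still overflow, but no compiler or tool detects a casting error. (Strictly speaking, there is none.)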
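For illustration, here is one way such a side counter could be made to fail loudly during testing instead of wrapping. This is only a sketch: `checked_increment` and `count_elements` are hypothetical names, not part of the code base in question, and the guard costs one comparison per iteration.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper: increments value, throwing instead of overflowing.
template <typename T>
void checked_increment(T& value) {
    if (value == std::numeric_limits<T>::max()) {
        throw std::overflow_error("counter would overflow");
    }
    ++value;
}

// Same shape as the for loop above: the iterator drives the loop and the
// counter i rides along, but its increment is now guarded.
template <typename Counter, typename Container>
Counter count_elements(const Container& c) {
    Counter i = 0;
    for (auto iter = c.begin(); iter != c.end(); ++iter) {
        checked_increment(i);
    }
    return i;
}
```

Calling `count_elements<std::uint8_t>` on a 300-element `std::vector<int>` throws `std::overflow_error`, while `count_elements<int>` on the same vector returns 300.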
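EDIT 3: The code I'm working with is *very* large. (10-15 million lines of code for C++ alone.) It's infeasible to inspect all of it, so I'm specifically interested in ways to identify this sort of problem automatically, even if that results in a high false-positive rate.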