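While benchmarking some code, I found that its execution time varies with even the most innocuous code changes.
I have tried to boil the code below down to a minimal test case, but it is still rather lengthy (for which I apologize). Changing virtually anything significantly affects the benchmark results.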
#include <string>
#include <vector>
#include <iostream>
#include <random>
#include <chrono>
#include <functional>

constexpr double usec_to_sec = 1000000.0;

// Simple convenience timer
class Timer
{
    std::chrono::high_resolution_clock::time_point start_time;
public:
    Timer() : start_time(std::chrono::high_resolution_clock::now()) { }
    int64_t operator()() const {
        return static_cast<int64_t>(
            std::chrono::duration_cast<std::chrono::microseconds>(
                std::chrono::high_resolution_clock::now()-start_time).count()
        );
    }
};

// Convenience random number generator
template <typename T>
class RandGen
{
    mutable std::default_random_engine generator;
    std::uniform_int_distribution<T> distribution;
    constexpr unsigned make_seed() const {
        return static_cast<unsigned>(std::chrono::system_clock::now().time_since_epoch().count());
    }
public:
    RandGen(T min, T max) : generator(make_seed()), distribution(min, max) { }
    T operator ()() { return distribution(generator); }
};

// Printer class
class Printer
{
    std::string filename;
    template <class S>
    friend Printer &operator<<(Printer &, S &&s);
public:
    Printer(const char *filename) : filename(filename) {}
};

template <class S>
Printer &operator<<(Printer &pm, S &&s) {
    std::cout << s;
    return pm;
}

// +------------+
// | Main Stuff |
// +------------+
void runtest(size_t run_length)
{
    static RandGen<size_t> word_sz_generator(10, 20);
    static RandGen<int> rand_char_generator(0, 25);
    size_t total_char_count = 0;
    std::vector<std::string> word_list;
    word_list.reserve(run_length);
    Printer printer("benchmark.dat");
    printer << "Running test... ";
    Timer timer; // start timer
    for (auto i = 0; i < run_length; i++) {
        size_t word_sz = word_sz_generator();
        std::string word;
        for (auto sz = 0; sz < word_sz; sz++) {
            word.push_back(static_cast<char>(rand_char_generator())+'a');
        }
        word_list.emplace_back(std::move(word));
        total_char_count += word_sz;
    }
    int64_t execution_time_usec = timer(); // stop timer
    printer << /*run_length*/ word_list.size() << " words, and "
            << total_char_count << " total characters, were built in "
            << execution_time_usec/usec_to_sec << " seconds.\n";
}

int main(int argc, char **argv)
{
    constexpr size_t iterations = 30;
    constexpr size_t run_length = 50000000;
    for (auto i = 0; i < iterations; i++)
        runtest(run_length);
    return EXIT_SUCCESS;
}
The 1st class, Timer, is just a small convenience class (intentionally not well-featured, for brevity) for timing the code.
I tried to do without the 2nd class RandGen (which just generates random values), but any attempt to exclude this from the test code made the problem auto-magically disappear. So, I suspect the issue has something to do with it. But I can't figure out how.
The 3rd class Printer seems entirely unnecessary for this question, but again, including it seems to exacerbate the issue.
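So, that leaves us with main() (which just runs the test) and runtest().
runtest() is hideous, so please don't look at it from a "clean code" point of view. Changing it in any way (e.g. moving the inner for loop into its own function) changes the benchmark results. The simplest, and most perplexing, example is the last line: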
printer << /*run_length*/ word_list.size() << " words, and "
        << total_char_count << " total characters, were built in "
        << execution_time_usec/usec_to_sec << " seconds.\n";
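In the line above, run_length and word_list.size() are the same: the size of the vector word_list is determined by run_length. But if I run the code as-is, I get an average execution time of 9.8 seconds, whereas if I uncomment run_length and comment out word_list.size(), the execution time actually increases to an average of 10.6 seconds. I can't fathom how such an insignificant code change could affect the timing of the whole program to such an extent.
In other words...
9.8 seconds: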
printer << /*run_length*/ word_list.size() << " words, and "
        << total_char_count << " total characters, were built in "
        << execution_time_usec/usec_to_sec << " seconds.\n";
10.6 seconds:
printer << run_length /*word_list.size()*/ << " words, and "
        << total_char_count << " total characters, were built in "
        << execution_time_usec/usec_to_sec << " seconds.\n";
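I have repeated this exercise of commenting and uncommenting the variables mentioned above, and re-running the benchmarks, many times. The benchmarks are repeatable and consistent, i.e. they come out at 9.8 seconds and 10.6 seconds, respectively, every time.
The output for the two cases looks like this: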
Running test... 50000000 words, and 750000798 total characters, were built in 9.83379 seconds.
Running test... 50000000 words, and 749978210 total characters, were built in 9.84541 seconds.
Running test... 50000000 words, and 749996688 total characters, were built in 9.87418 seconds.
Running test... 50000000 words, and 749995415 total characters, were built in 9.85704 seconds.
Running test... 50000000 words, and 750017699 total characters, were built in 9.86186 seconds.
Running test... 50000000 words, and 749998680 total characters, were built in 9.83395 seconds.
...
Running test... 50000000 words, and 749988517 total characters, were built in 10.604 seconds.
Running test... 50000000 words, and 749958011 total characters, were built in 10.6283 seconds.
Running test... 50000000 words, and 749994387 total characters, were built in 10.6374 seconds.
Running test... 50000000 words, and 749995242 total characters, were built in 10.6445 seconds.
Running test... 50000000 words, and 749988379 total characters, were built in 10.6543 seconds.
Running test... 50000000 words, and 749969532 total characters, were built in 10.6722 seconds.
...
Any information on what could be causing this discrepancy would be greatly appreciated.
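Notes:
- Even removing the unused std::string filename member object from the Printer class yields different benchmark results: doing so eliminates (or reduces to insignificant proportions) the difference between the two benchmarks shown above.
- This does not seem to be an issue when compiling with g++ (on Ubuntu). However, I can't say this definitively; my Ubuntu tests ran in a VM on the same Windows machine, and the VM may not have had access to all of the resources and processor enhancements.
- I am using Visual Studio Community 2017 (version 15.7.4)
- Compiler version: 19.14.26431
- All tests and reported results are from a Release build, 64-bit
- System: Win10, i7-6700K @ 4.00 GHz, 32 GB RAM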