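I am trying to optimize a loop that accounts for a large share of my program's computation time.
However, when I enable auto-vectorization with -O3 -ffast-math -ftree-vectorizer-verbose=6, GCC reports that it cannot vectorize the loop.
I am using GCC 4.4.5.
Code: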
/// Find the point in the path with the largest v parameter
void prediction::find_knife_edge(
const float * __restrict__ const elevation_path,
float * __restrict__ const diff_path,
const float path_res,
const unsigned a,
const unsigned b,
const float h_a,
const float h_b,
const float f,
const float r_e
) const
{
float wavelength = (speed_of_light * 1e-6f) / f;
float d_ab = path_res * static_cast<float>(b - a);
for (unsigned n = a + 1; n <= b - 1; n++)
{
float d_an = path_res * static_cast<float>(n - a);
float d_nb = path_res * static_cast<float>(b - n);
float h = elevation_path[n] + (d_an * d_nb) / (2.0f * r_e) - (h_a * d_nb + h_b * d_an) / d_ab;
float v = h * std::sqrt((2.0f * d_ab) / (wavelength * d_an * d_nb));
diff_path[n] = v;
}
}
Messages from GCC:
note: not vectorized: number of iterations cannot be computed.
note: not vectorized: unhandled data-ref
The auto-vectorization page (http://gcc.gnu.org/projects/tree-ssa/vectorization.html) states that loops with unknown bounds are supported.
If I replace the for with
for (unsigned n = 0; n <= 100; n++)
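then the loop is vectorized.
What am I doing wrong?
The lack of detailed documentation on exactly what these messages mean, and on the ins and outs of GCC auto-vectorization, is rather annoying.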
EDIT:
Thanks to David, I changed the loop to:
for (unsigned n = a + 1; n < b; n++)
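My understanding of why this form helps (a sketch, not something I found in the GCC documentation): with unsigned arithmetic, `b - 1` wraps around to `UINT_MAX` when `b` is 0, so the condition `n <= b - 1` could stay true forever and the compiler cannot derive an iteration count. With `n < b` the count is always finite. The helper names below are hypothetical and only illustrate the arithmetic.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: the value the original condition `n <= b - 1`
// compares against.
unsigned inclusive_bound(unsigned b)
{
    return b - 1; // unsigned arithmetic wraps: b == 0 yields UINT_MAX
}

// Hypothetical helper: trip count of
//   for (unsigned n = a + 1; n < b; n++)
// which is finite for every b.
unsigned trip_count_exclusive(unsigned a, unsigned b)
{
    assert(a != UINT_MAX); // keep a + 1 from wrapping in this sketch
    return (b > a + 1) ? b - (a + 1) : 0u;
}
```

For `b == 0` the inclusive bound is `UINT_MAX`, so `n <= b - 1` holds for every possible `n`, whereas the exclusive form simply runs zero times.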
Now GCC attempts to vectorize the loop, but prints these messages:
note: not vectorized: unhandled data-ref
note: Alignment of access forced using peeling.
note: Vectorizing an unaligned access.
note: vect_model_induction_cost: inside_cost = 1, outside_cost = 2 .
note: not vectorized: relevant stmt not supported: D.76777_65 = (float) n_34;
What does "D.76777_65 = (float) n_34;" mean?
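As far as I can tell, that statement is GCC's internal form of converting the unsigned loop counter `n` to float. One thing that may be worth trying is a signed index, on the assumption that the target is SSE2-class x86, which has a packed conversion from signed 32-bit integers to float but none from unsigned. The sketch below is a free function rather than the original member function; the name `find_knife_edge_signed` and the `wavelength` parameter (passed in instead of being derived from `speed_of_light` and `f`) are hypothetical. I have not confirmed that GCC 4.4.5 vectorizes it.

```cpp
#include <cassert>
#include <cmath>

// Sketch only: same arithmetic as the original loop, but with signed
// indices so every int -> float conversion is a signed conversion.
void find_knife_edge_signed(
    const float * __restrict__ const elevation_path,
    float * __restrict__ const diff_path,
    const float path_res,
    const int a,
    const int b,
    const float h_a,
    const float h_b,
    const float wavelength, // hypothetical: precomputed by the caller
    const float r_e)
{
    assert(a >= 0 && a < b); // precondition assumed by this sketch

    const float d_ab = path_res * static_cast<float>(b - a);
    for (int n = a + 1; n < b; n++)
    {
        const float d_an = path_res * static_cast<float>(n - a);
        const float d_nb = path_res * static_cast<float>(b - n);
        const float h = elevation_path[n] + (d_an * d_nb) / (2.0f * r_e)
                      - (h_a * d_nb + h_b * d_an) / d_ab;
        diff_path[n] = h * std::sqrt((2.0f * d_ab) / (wavelength * d_an * d_nb));
    }
}
```

Only the interior points `a + 1` through `b - 1` are written, as in the original loop; the endpoints of `diff_path` are left untouched.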