The secret ingredient behind the solution that includes matrix division is the Vandermonde matrix https://en.wikipedia.org/wiki/Vandermonde_matrix. The question discusses a linear problem (linear regression), and such problems can always be formulated as a matrix equation, which \ (mldivide) can solve in a mean-square-error sense‡ https://chat.stackoverflow.com/transcript/message/27426179#27426179. Such an algorithm, solving a similar problem, is demonstrated and explained in this answer https://math.stackexchange.com/a/260896.
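To make the connection concrete: the model DL = R*IT.^P becomes linear after taking logarithms, log(DL) = log(R) + P*log(IT), which a degree-1 Vandermonde matrix and mldivide can fit directly. A minimal sketch (not the benchmark code below; the values are illustrative):

```matlab
IT = (600:600:3000).';            % sample points
R  = 5;  P = 0.97;                % "true" parameters
DL = R*IT.^P;                     % noiseless "measurements"
VM = [ones(numel(IT),1) log(IT)]; % Vandermonde matrix for a linear fit
c  = VM\log(DL);                  % least-squares coefficients: [log(R); P]
fittedR = exp(c(1));              % recovers R
fittedP = c(2);                   % recovers P
```

With noiseless data the recovery is exact (up to floating point); with noisy data mldivide returns the least-squares estimate of the same coefficients.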
Below is benchmarking code that compares the original solution with two alternatives suggested in chat (1 https://chat.stackoverflow.com/transcript/message/27425670#27425670, 2 https://chat.stackoverflow.com/transcript/message/27426276#27426276):
function regressionBenchmark(numEl)
clc
if nargin<1, numEl=10; end
%// Define some nominal values:
R = 5;
IT = 600:600:3000;
P = 0.97;
%// Impose some believable spatial variations:
pMAT = 0.01*randn(numEl)+P;
rMAT = 0.1*randn(numEl)+R;
%// Generate "fake" measurement data using the relation "DL = R*IT.^P"
dlMAT = bsxfun(@times,rMAT,bsxfun(@power,permute(IT,[3,1,2]),pMAT));
%% // Method1: loops + polyval
disp('-------------------------------Method 1: loops + polyval')
tic; [fR,fP] = method1(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
%% // Method2: loops + Vandermonde
disp('-------------------------------Method 2: loops + Vandermonde')
tic; [fR,fP] = method2(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
%% // Method3: vectorized Vandermonde
disp('-------------------------------Method 3: vectorized Vandermonde')
tic; [fR,fP] = method3(IT,dlMAT); toc;
fprintf(1,'Regression performance:\nR: %d\nP: %d\n',norm(fR-rMAT,1),norm(fP-pMAT,1));
function [fittedR,fittedP] = method1(IT,dlMAT)
sol = cell(size(dlMAT,1),size(dlMAT,2));
for ind1 = 1:size(dlMAT,1)
  for ind2 = 1:size(dlMAT,2)
    sol{ind1,ind2} = polyfit(log(IT(:)),log(squeeze(dlMAT(ind1,ind2,:))),1);
  end
end
fittedR = cellfun(@(x)exp(x(2)),sol);
fittedP = cellfun(@(x)x(1),sol);
function [fittedR,fittedP] = method2(IT,dlMAT)
sol = cell(size(dlMAT,1),size(dlMAT,2));
for ind1 = 1:size(dlMAT,1)
  for ind2 = 1:size(dlMAT,2)
    sol{ind1,ind2} = flipud([ones(numel(IT),1) log(IT(:))]\log(squeeze(dlMAT(ind1,ind2,:)))).'; %'
  end
end
fittedR = cellfun(@(x)exp(x(2)),sol);
fittedP = cellfun(@(x)x(1),sol);
function [fittedR,fittedP] = method3(IT,dlMAT)
N = 1; %// Degree of polynomial
VM = bsxfun(@power, log(IT(:)), 0:N); %// Vandermonde matrix
result = fliplr((VM\log(reshape(dlMAT,[],size(dlMAT,3)).')).');
%// Compressed version:
%// result = fliplr(([ones(numel(IT),1) log(IT(:))]\log(reshape(dlMAT,[],size(dlMAT,3)).')).');
fittedR = exp(real(reshape(result(:,2),size(dlMAT,1),size(dlMAT,2))));
fittedP = real(reshape(result(:,1),size(dlMAT,1),size(dlMAT,2)));
The reason Method 2 could be vectorized into Method 3 is essentially that matrix multiplication can be separated by the columns of the second matrix: if A*B produces matrix X, then by definition A*B(:,n) gives X(:,n) for any n. Moving A to the other side with mldivide, this means that the divisions A\X(:,n) can be done for all n at once with A\X. The same applies to an overdetermined system https://en.wikipedia.org/wiki/Overdetermined_system (the linear-regression case), where in general there is no exact solution and mldivide finds the matrix that minimizes the mean-square error https://en.wikipedia.org/wiki/Overdetermined_system#Approximate_solutions. In this case too, the operations A\X(:,n) (Method 2) can be done all at once for all n with A\X (Method 3).
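The column-separability argument can be verified directly; a minimal sketch with arbitrary values (not part of the benchmark):

```matlab
% mldivide on a matrix of right-hand sides equals applying it to each
% column separately - also for overdetermined A (least-squares case).
A = [ones(5,1) (1:5).'];      % 5x2 overdetermined system matrix
X = rand(5,3);                % three right-hand-side columns
allAtOnce = A\X;              % one call for all columns (Method 3 idea)
oneByOne = zeros(2,3);
for n = 1:3
  oneByOne(:,n) = A\X(:,n);   % column-by-column calls (Method 2 idea)
end
% max(abs(allAtOnce(:)-oneByOne(:))) is zero to within numerical precision
```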
The effect of the algorithmic improvement as the size of dlMAT grows is shown below:

For the case of 500*500 (or 2.5E5) elements, the speedup from Method 1 to Method 3 is about x3500!

It is also interesting to observe the output http://www.mathworks.com/help/matlab/matlab_prog/profiling-for-improving-performance.html#f9-17206 of profile http://www.mathworks.com/help/matlab/ref/profile.html (shown here for the 500*500 case):
Method 1
Method 2
Method 3
From the above it can be seen that rearranging elements via squeeze http://www.mathworks.com/help/matlab/ref/squeeze.html and flipud http://www.mathworks.com/help/matlab/ref/flipud.html takes up about half (!) the runtime of Method 2. It can also be seen that some time is lost on converting the solution from a cell array to a matrix.
Since the third solution avoids all of these pitfalls, as well as loops altogether (which mostly means re-evaluating the script on every iteration), it unsurprisingly results in a considerable speedup.
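For reference, the profiler reports above can be reproduced roughly as follows (a sketch; regressionBenchmark is the function defined earlier):

```matlab
profile on                  % start collecting timing data
regressionBenchmark(500);   % run the 500x500 case
profile viewer              % open the graphical profile report
% Alternatively, p = profile('info'); gives programmatic access to the results.
```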
Notes:
- There was practically no difference between the "compressed" and the "explicit" versions of Method 3, slightly in favor of the "explicit" version; hence the compressed one was not included in the comparison.
- A solution where the inputs to Method 3 were gpuArray-ed was also attempted. This did not provide improved performance (and even degraded it somewhat), possibly due to a wrong implementation, or the overhead associated with copying matrices back and forth between RAM and VRAM.