The accepted answer by Ahmed unfortunately has the PCA math wrong, leading to the a result quite different to the manuscript. Here are the images screen captured from the manuscript.
均值居中和 SVD 应沿另一个维度进行,并将通道视为不同的样本。平均居中旨在获得零的平均像素响应,而不是零的平均通道响应。
链接的算法还清楚地表明,PCA 模型的投影首先涉及将图像乘以分数,然后将该乘积乘以特征值,而不是像其他答案中那样反过来。
有关数学的更多信息,请参阅我的PCA数学答案在这里 https://stats.stackexchange.com/questions/404731/eigenvalues-as-weighting-factors-for-projection-results-on-corresponding-eigenve/405999#405999
代码中的差异可以在输出中看到。由于手稿没有提供示例输出(我发现),因此结果之间可能存在细微差别,因为手稿是捕获的屏幕截图。
For comparison, the downloaded colour file, which is a little more contrasted than the screenshot, so one would expect the same from the output greyscale.
First the result from Ahmed's code:
Then the result from the updated code:
更正后的代码(基于艾哈迈德的代码,以便于比较)是
import numpy as np
import cv2
from numpy.linalg import svd, norm
# Read input image
Ibgr = cv2.imread('path/peppers.jpg')
#Convert to YCrCb
Iycc = cv2.cvtColor(Ibgr, cv2.COLOR_BGR2YCR_CB)
# Reshape the H by W by 3 array to a 3 by N array (N = W * H)
Izycc = Iycc.reshape([-1, 3]).T
# Remove mean along Y, Cr, and Cb *separately*!
Izycc = Izycc - Izycc.mean(0) #(1)[:, np.newaxis]
# Mean across channels is required (separate means for each channel is not a
# mathematically sensible idea) - each pixel's variation should centre around 0
# Make sure we're dealing with zero-mean data here: the mean for Y, Cr, and Cb
# should separately be zero. Recall: Izycc is 3 by N array.
# Original assertion was based on a false presmise. Mean value for each pixel should be 0
assert(np.allclose(np.mean(Izycc, 0), 0.0))
# Compute data array's SVD. Ignore the 3rd return value: unimportant in this context.
(U, S, L) = svd(Izycc, full_matrices=False)
# Square the data's singular vectors to get the eigenvalues. Then, normalize
# the three eigenvalues to unit norm and finally, make a diagonal matrix out of
# them.
eigvals = np.diag(S**2 / norm(S**2))
# Eigenvectors are just the right-singular vectors.
eigvecs = U;
# Project the YCrCb data onto the principal components and reshape to W by H
# array.
# This was performed incorrectly, the published algorithm shows that the eigenvectors
# are multiplied by the flattened image then scaled by eigenvalues
Igray = np.dot(eigvecs.T, np.dot(eigvals, Izycc)).sum(0).reshape(Iycc.shape[:2])
Igray2 = np.dot(eigvals, np.dot(eigvecs, Izycc)).sum(0).reshape(Iycc.shape[:2])
eigvals3 = eigvals*[1,-1,1]
Igray3 = np.dot(eigvals3, np.dot(eigvecs, Izycc)).sum(0).reshape(Iycc.shape[:2])
eigvals4 = eigvals*[1,-1,-1]
Igray4 = np.dot(eigvals4, np.dot(eigvecs, Izycc)).sum(0).reshape(Iycc.shape[:2])
# Rescale Igray to [0, 255]. This is a fancy way to do this.
from scipy.interpolate import interp1d
Igray = np.floor((interp1d([Igray.min(), Igray.max()],
[0.0, 256.0 - 1e-4]))(Igray))
Igray2 = np.floor((interp1d([Igray2.min(), Igray2.max()],
[0.0, 256.0 - 1e-4]))(Igray2))
Igray3 = np.floor((interp1d([Igray3.min(), Igray3.max()],
[0.0, 256.0 - 1e-4]))(Igray3))
Igray4 = np.floor((interp1d([Igray4.min(), Igray4.max()],
[0.0, 256.0 - 1e-4]))(Igray4))
# Make sure we don't accidentally produce a photographic negative (flip image
# intensities). N.B.: `norm` is often expensive; in real life, try to see if
# there's a more efficient way to do this.
if norm(Iycc[:,:,0] - Igray) > norm(Iycc[:,:,0] - (255.0 - Igray)):
Igray = 255 - Igray
if norm(Iycc[:,:,0] - Igray2) > norm(Iycc[:,:,0] - (255.0 - Igray2)):
Igray2 = 255 - Igray2
if norm(Iycc[:,:,0] - Igray3) > norm(Iycc[:,:,0] - (255.0 - Igray3)):
Igray3 = 255 - Igray3
if norm(Iycc[:,:,0] - Igray4) > norm(Iycc[:,:,0] - (255.0 - Igray4)):
Igray4 = 255 - Igray4
# Display result
if True:
import pylab
pylab.ion()
fGray = pylab.imshow(Igray, cmap='gray')
# Save result
cv2.imwrite('peppers-gray.png', Igray.astype(np.uint8))
fGray2 = pylab.imshow(Igray2, cmap='gray')
# Save result
cv2.imwrite('peppers-gray2.png', Igray2.astype(np.uint8))
fGray3 =pylab.imshow(Igray3, cmap='gray')
# Save result
cv2.imwrite('peppers-gray3.png', Igray3.astype(np.uint8))
fGray4 =pylab.imshow(Igray4, cmap='gray')
# Save result
cv2.imwrite('peppers-gray4.png', Igray4.astype(np.uint8))
****编辑*****
根据 Nazlok 关于特征向量方向不稳定性的查询(任何一个特征向量的方向是任意的,因此不能保证不同的算法(或没有可重复的方向标准化步骤的单一算法)会给出相同的结果。我现在有添加了两个额外的示例,其中我只是交换了特征向量的符号(数字 2 以及数字 2 和 3)。结果又不同,仅 PC2 的切换给出了更轻的音调,而切换 2 和 3 则为类似(这并不奇怪,因为指数缩放将 PC3 的影响降到了很小)。我将把最后一个留给那些费心运行代码的人。
结论
如果没有采取明确的额外步骤来提供可重复和可再现的 PC 方向,该算法是不稳定的,我个人不愿意按原样使用它。纳兹洛克关于使用正负强度平衡的建议可以提供一条规则,但需要验证,因此超出了本答案的范围。然而,这样的规则并不能保证“最佳”解决方案,而只能保证稳定的解决方案。特征向量是单位向量,因此方差(强度的平方)是平衡的。零的哪一侧具有最大的幅值总和只是告诉我们哪一侧的各个像素贡献更大的方差,我怀疑这通常不是很有信息。