如何在opencv中仅提取nxn图像的3个特征向量?

2024-02-24

我正在尝试使用以下方法将 RGB 图像转换为灰度图像paper https://www.semanticscholar.org/paper/Novel-PCA-based-color-to-gray-image-conversion-Seo-Kim/7c23493d4cbe69c6eed697f6394d0f31c660e6ca/pdf.

论文中使用的主要算法是这样的:基于 PCA 的新颖算法将图像转换为灰度 https://i.stack.imgur.com/dTJIR.png

然而,当我尝试从图像中提取特征向量时,我得到了 500 个特征值,而不是所需的 3 个。据我所知,一个 NxN 矩阵通常给出 N 个特征向量,但我不太确定我应该在这里做什么才能只得到 3 个特征向量。

我应该做什么有什么帮助吗?到目前为止,这是我的代码:

import numpy as np
import cv2

def pca_rgb2gray(img):
    """
    NOVEL PCA-BASED COLOR-TO-GRAY IMAGE CONVERSION
    Authors:
    -Ja-Won Seo
    -Seong Dae Kim
    2013 IEEE International Conference on Image Processing
    """
    I_re = cv2.resize(img, (500,500))
    Iycc = cv2.cvtColor(I_re, cv2.COLOR_BGR2YCrCb)
    Izycc = Iycc - Iycc.mean()
    eigvals = []
    eigvecs = []
    final_im = []
    for i in range(3):
        res = np.linalg.eig(Izycc[:,:,i])
        eigvals.append(res[0])
        eigvecs.append(res[1])
    eignorm = np.linalg.norm(eigvals)
    for i in range(3):
        eigvals[i]/=eignorm
        eigvecs[i]/=np.linalg.norm(eigvecs[i])
        temp = eigvals[i] * np.dot(eigvecs[i], Izycc[:,:,i])
        final_im.append(temp)
    final_im = final_im[0] + final_im[1] + final_im[2]
    return final_im
if __name__ == '__main__':
    img = cv2.imread('image.png')
    gray = pca_rgb2gray(img)

The accepted answer by Ahmed unfortunately has the PCA math wrong, leading to the a result quite different to the manuscript. Here are the images screen captured from the manuscript. original images in manuscript

均值居中和 SVD 应沿另一个维度进行,并将通道视为不同的样本。平均居中旨在获得零的平均像素响应,而不是零的平均通道响应。

链接的算法还清楚地表明,PCA 模型的投影首先涉及将图像乘以分数,然后将该乘积乘以特征值,而不是像其他答案中那样反过来。

有关数学的更多信息,请参阅我的PCA数学答案在这里 https://stats.stackexchange.com/questions/404731/eigenvalues-as-weighting-factors-for-projection-results-on-corresponding-eigenve/405999#405999

代码中的差异可以在输出中看到。由于手稿没有提供示例输出(我发现),因此结果之间可能存在细微差别,因为手稿是捕获的屏幕截图。

For comparison, the downloaded colour file, which is a little more contrasted than the screenshot, so one would expect the same from the output greyscale. downloaded colour file

First the result from Ahmed's code: Ahmeds result

Then the result from the updated code: new result

更正后的代码(基于艾哈迈德的代码,以便于比较)是

import numpy as np
import cv2
from numpy.linalg import svd, norm

# Read input image
Ibgr = cv2.imread('path/peppers.jpg')
#Convert to YCrCb
Iycc = cv2.cvtColor(Ibgr, cv2.COLOR_BGR2YCR_CB)

# Reshape the H by W by 3 array to a 3 by N array (N = W * H)
Izycc = Iycc.reshape([-1, 3]).T

# Remove mean along Y, Cr, and Cb *separately*!
Izycc = Izycc - Izycc.mean(0)  #(1)[:, np.newaxis]
# Mean across channels is required (separate means for each channel is not a 
# mathematically sensible idea) - each pixel's variation should centre around 0

# Make sure we're dealing with zero-mean data here: the mean for Y, Cr, and Cb
# should separately be zero. Recall: Izycc is 3 by N array. 
# Original assertion was based on a false presmise. Mean value for each pixel should be 0
assert(np.allclose(np.mean(Izycc, 0), 0.0))

# Compute data array's SVD. Ignore the 3rd return value: unimportant in this context. 
(U, S, L) = svd(Izycc, full_matrices=False)

# Square the data's singular vectors to get the eigenvalues. Then, normalize
# the three eigenvalues to unit norm and finally, make a diagonal matrix out of
# them. 
eigvals = np.diag(S**2 / norm(S**2))

# Eigenvectors are just the right-singular vectors.
eigvecs = U;

# Project the YCrCb data onto the principal components and reshape to W by H
# array.
# This was performed incorrectly, the published algorithm shows that the eigenvectors 
# are multiplied by the flattened image then scaled by eigenvalues
Igray = np.dot(eigvecs.T, np.dot(eigvals, Izycc)).sum(0).reshape(Iycc.shape[:2])
Igray2 = np.dot(eigvals, np.dot(eigvecs, Izycc)).sum(0).reshape(Iycc.shape[:2])
eigvals3 = eigvals*[1,-1,1]
Igray3 = np.dot(eigvals3, np.dot(eigvecs, Izycc)).sum(0).reshape(Iycc.shape[:2])
eigvals4 = eigvals*[1,-1,-1]
Igray4 = np.dot(eigvals4, np.dot(eigvecs, Izycc)).sum(0).reshape(Iycc.shape[:2])

# Rescale Igray to [0, 255]. This is a fancy way to do this.
from scipy.interpolate import interp1d
Igray = np.floor((interp1d([Igray.min(), Igray.max()],
                            [0.0, 256.0 - 1e-4]))(Igray))
Igray2 = np.floor((interp1d([Igray2.min(), Igray2.max()],
                            [0.0, 256.0 - 1e-4]))(Igray2))
Igray3 = np.floor((interp1d([Igray3.min(), Igray3.max()],
                            [0.0, 256.0 - 1e-4]))(Igray3))
Igray4 = np.floor((interp1d([Igray4.min(), Igray4.max()],
                            [0.0, 256.0 - 1e-4]))(Igray4))

# Make sure we don't accidentally produce a photographic negative (flip image
# intensities). N.B.: `norm` is often expensive; in real life, try to see if
# there's a more efficient way to do this.
if norm(Iycc[:,:,0] - Igray) > norm(Iycc[:,:,0] - (255.0 - Igray)):
    Igray = 255 - Igray
if norm(Iycc[:,:,0] - Igray2) > norm(Iycc[:,:,0] - (255.0 - Igray2)):
    Igray2 = 255 - Igray2
if norm(Iycc[:,:,0] - Igray3) > norm(Iycc[:,:,0] - (255.0 - Igray3)):
    Igray3 = 255 - Igray3
if norm(Iycc[:,:,0] - Igray4) > norm(Iycc[:,:,0] - (255.0 - Igray4)):
    Igray4 = 255 - Igray4

# Display result
if True:
    import pylab
    pylab.ion()
fGray =    pylab.imshow(Igray, cmap='gray')
# Save result
cv2.imwrite('peppers-gray.png', Igray.astype(np.uint8))

fGray2 = pylab.imshow(Igray2, cmap='gray')
# Save result
cv2.imwrite('peppers-gray2.png', Igray2.astype(np.uint8))

fGray3 =pylab.imshow(Igray3, cmap='gray')
# Save result
cv2.imwrite('peppers-gray3.png', Igray3.astype(np.uint8))

fGray4 =pylab.imshow(Igray4, cmap='gray')
# Save result
cv2.imwrite('peppers-gray4.png', Igray4.astype(np.uint8))

****编辑*****

根据 Nazlok 关于特征向量方向不稳定性的查询(任何一个特征向量的方向是任意的,因此不能保证不同的算法(或没有可重复的方向标准化步骤的单一算法)会给出相同的结果。我现在有添加了两个额外的示例,其中我只是交换了特征向量的符号(数字 2 以及数字 2 和 3)。结果又不同,仅 PC2 的切换给出了更轻的音调,而切换 2 和 3 则为类似(这并不奇怪,因为指数缩放将 PC3 的影响降到了很小)。我将把最后一个留给那些费心运行代码的人。

结论

如果没有采取明确的额外步骤来提供可重复和可再现的 PC 方向,该算法是不稳定的,我个人不愿意按原样使用它。纳兹洛克关于使用正负强度平衡的建议可以提供一条规则,但需要验证,因此超出了本答案的范围。然而,这样的规则并不能保证“最佳”解决方案,而只能保证稳定的解决方案。特征向量是单位向量,因此方差(强度的平方)是平衡的。零的哪一侧具有最大的幅值总和只是告诉我们哪一侧的各个像素贡献更大的方差,我怀疑这通常不是很有信息。

本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)

如何在opencv中仅提取nxn图像的3个特征向量? 的相关文章

随机推荐