In machine learning, what you want to do is called classification https://en.wikipedia.org/wiki/Statistical_classification, where the goal is to assign a label from one of the classes (colors) to each observation (image).
To do this, the classes must be defined in advance. Suppose these are the colors we want to assign to the images:
To determine the dominant color of an image, you must compute the distance between each of its pixels and all the colors in the table. Note that this distance is computed in the RGB color space. To compute the distance between the ij-th pixel of the image and the k-th color of the table, the following equation can be used:
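A palette of predefined classes could be sketched like this. The color names and RGB values below are purely illustrative placeholders; the actual table must come from your own class definitions:

```python
import numpy as np

# Hypothetical palette: each class is a name plus an RGB triple.
# These particular colors are made-up examples, not the table from the answer.
PALETTE = {
    "red":    (255, 0, 0),
    "orange": (255, 165, 0),
    "yellow": (255, 255, 0),
    "green":  (0, 128, 0),
    "blue":   (0, 0, 255),
    "pink":   (255, 192, 203),
}

# A (K, 3) float array of palette colors, convenient for vectorized
# distance computations later on.
palette_colors = np.array(list(PALETTE.values()), dtype=np.float64)
palette_names = list(PALETTE.keys())
```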
d_ijk = sqrt((r_ij-r_k)^2+(g_ij-g_k)^2+(b_ij-b_k)^2)
In the next step, for each pixel, the closest color in the table is selected. This is the concept of indexed color https://en.wikipedia.org/wiki/Indexed_color used for compressing images (except that here the palette is the same for all images, and no per-image palette is computed to minimize the difference between the original and the indexed image). Now, as @jairoar pointed out, we can obtain a histogram of the image (not to be confused with an RGB histogram or an intensity histogram) and determine the color that is repeated the most.
To show the result of these steps, I used random crops of this artwork of mine:
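The whole pipeline up to this point (per-pixel distance, nearest-color indexing, histogram, mode) can be sketched in a few lines of NumPy. This is a minimal sketch, assuming the image is an (H, W, 3) RGB array and the palette is a (K, 3) array; the tiny three-color palette in the example is hypothetical:

```python
import numpy as np

def dominant_color(image, palette):
    """Return the index of the most frequent palette color in `image`.

    image:   (H, W, 3) uint8 RGB array
    palette: (K, 3) array of RGB palette colors
    """
    pixels = image.reshape(-1, 3).astype(np.float64)   # (N, 3)
    pal = palette.astype(np.float64)                   # (K, 3)
    # Squared Euclidean distance from every pixel to every palette color:
    # (N, 1, 3) - (1, K, 3) broadcasts to (N, K, 3), then sum over channels.
    d2 = ((pixels[:, None, :] - pal[None, :, :]) ** 2).sum(axis=2)
    indexed = d2.argmin(axis=1)                        # nearest color per pixel
    # Histogram over palette indices; its mode is the dominant color.
    counts = np.bincount(indexed, minlength=len(pal))
    return counts.argmax()

# Example with a made-up three-color palette and a mostly-red image.
palette = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]])
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[..., 0] = 250  # near-red everywhere
print(dominant_color(img, palette))  # → 0 (red)
```

Note that the square root in the distance formula is dropped: it is monotonic, so it does not change which palette color is nearest, and skipping it saves computation.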
This is how the images look before and after indexing (left: original, right: indexed):
And these are the most repeated colors (left: indexed, right: dominant color):
But since you said the number of images is large, you should know that these computations are relatively time-consuming. The good news is that there are ways to increase performance. For example, instead of the Euclidean distance https://en.wikipedia.org/wiki/Euclidean_distance (formula above), you can use the City Block https://en.wikipedia.org/wiki/Taxicab_geometry or Chebyshev https://en.wikipedia.org/wiki/Chebyshev_distance distance. You can also compute the distance for only a fraction of the pixels instead of all the pixels in an image. For this purpose, you can first scale the image down to a much smaller size (for example, 32 by 32) and perform the calculations on the pixels of this reduced image. If you decide to resize the images, don't bother with bilinear or bicubic interpolation; it isn't worth the extra computation. Instead, go for nearest-neighbor interpolation https://en.wikipedia.org/wiki/Image_scaling#Nearest-neighbor_interpolation, which effectively performs a rectangular lattice https://en.wikipedia.org/wiki/Lattice_(group)#Lattices_in_two_dimensions:_detailed_discussion sampling of the original image.
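Both speedups are straightforward to sketch: nearest-neighbor downscaling is just sampling the image on a regular lattice of row/column indices, and the Chebyshev distance replaces the sum of squared channel differences with the maximum absolute channel difference. A minimal sketch, assuming the same (H, W, 3) / (K, 3) array conventions as above:

```python
import numpy as np

def nn_downscale(image, size=32):
    """Nearest-neighbor downscale: sample the image on a regular lattice
    of `size` x `size` pixel positions. No interpolation is performed."""
    h, w = image.shape[:2]
    rows = (np.arange(size) * h) // size
    cols = (np.arange(size) * w) // size
    return image[rows[:, None], cols]

def chebyshev_nearest(pixels, palette):
    """Index of the nearest palette color per pixel under the Chebyshev
    (L-infinity) distance: max over channels of |difference|."""
    diff = np.abs(pixels[:, None, :].astype(np.int32)
                  - palette[None, :, :].astype(np.int32))
    return diff.max(axis=2).argmin(axis=1)

# Usage sketch: shrink first, then index only the reduced image's pixels.
image = np.zeros((480, 640, 3), dtype=np.uint8)
small = nn_downscale(image, 32)            # (32, 32, 3): ~300x fewer pixels
pixels = small.reshape(-1, 3)
palette = np.array([[255, 0, 0], [0, 0, 255]])  # hypothetical two-color table
indexed = chebyshev_nearest(pixels, palette)
```

The Chebyshev distance needs no multiplications at all, which is why it tends to be cheaper than the Euclidean one per pixel.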
Although the changes mentioned will greatly speed up the computation, nothing comes for free: this is a performance-versus-accuracy trade-off. For example, in the first two images above, an image that was initially identified as orange (code 20) is identified as pink (code 26) after resizing.
To determine the parameters of the algorithm (distance measure, reduced image size, and scaling algorithm), you must first classify a large number of images with the highest possible accuracy and keep the results as ground truth. Then, through a number of experiments, find the combination of parameters that does not push the classification error beyond a maximum tolerable value.
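The tuning loop described above amounts to measuring, for each parameter combination, how often the fast pipeline disagrees with the ground truth. A minimal sketch; `classify_all`, `images`, `truth`, and `MAX_TOLERABLE_ERROR` are hypothetical names standing in for your own pipeline and data:

```python
import numpy as np

def classification_error(predicted, ground_truth):
    """Fraction of images whose predicted label differs from ground truth."""
    predicted = np.asarray(predicted)
    ground_truth = np.asarray(ground_truth)
    return float((predicted != ground_truth).mean())

# Hypothetical sweep over the three parameters discussed in the answer.
# `classify_all(images, metric, size)` is assumed to run the whole
# pipeline (downscale to `size`, index with `metric`, take the mode)
# and return one label per image — it is not implemented here.
#
# for metric in ("euclidean", "cityblock", "chebyshev"):
#     for size in (16, 32, 64):
#         err = classification_error(classify_all(images, metric, size), truth)
#         if err <= MAX_TOLERABLE_ERROR:
#             print("acceptable:", metric, size, err)
```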