Probability density:
$$\hat{P}(x) = w_1 N(x;\mu_1, \sigma_1^2) + w_2 N(x;\mu_2, \sigma_2^2) = \frac{1}{3}N(x;0.25,\, 0.25^2) + \frac{2}{3}N(x;2.5,\, 1.25)$$
Reading off the table gives:
$$\hat{P}(x_1) = 0.342,\quad \hat{P}(x_2) = 0.371,\quad \hat{P}(x_3) = 0.103$$
$$\hat{P}(x_4) = 0.215,\quad \hat{P}(x_5) = 0.215,\quad \hat{P}(x_6) = 0.097$$
$$E_{in} = -\sum_{n=1}^{6}\log\hat{P}(x_n) = 9.748$$
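As a check, the mixture density and $E_{in}$ can be evaluated with a short script. The six data points are not reproduced in this excerpt; the values $x = 0, 0.5, 1, 2, 3, 4$ reproduce all six stated densities and $E_{in} \approx 9.75$, so they are assumed below.

```python
import math

def norm_pdf(x, mu, var):
    """Density of a univariate Gaussian N(mu, var) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Assumed data points: these reproduce the densities 0.342, 0.371, ... stated
# in the text, so they are a plausible reading of the original problem.
xs = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0]

# Initial mixture from the text: weights 1/3 and 2/3,
# components N(0.25, 0.25^2) and N(2.5, 1.25).
w = [1 / 3, 2 / 3]
mu = [0.25, 2.5]
var = [0.25 ** 2, 1.25]

def mixture_pdf(x):
    return sum(w[j] * norm_pdf(x, mu[j], var[j]) for j in range(2))

densities = [mixture_pdf(x) for x in xs]
E_in = -sum(math.log(p) for p in densities)
print([round(p, 3) for p in densities])  # [0.342, 0.371, 0.103, 0.215, 0.215, 0.097]
print(round(E_in, 2))                    # 9.75 (the text's 9.748 uses pre-rounded densities)
```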
2) Iterate the EM algorithm until convergence, writing out the cluster assignments and model parameters at each step.
E-step:
Compute the responsibility $\lambda_{ij}$, the degree to which data point $i$ belongs to cluster $j$:
$$\lambda_{ij} = \frac{w_j N(x_i;\mu_j,\sigma_j^2)}{\sum_{k=1}^{2} w_k N(x_i;\mu_k,\sigma_k^2)}$$
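A minimal sketch of this responsibility computation, using the initial parameters above; the evaluation point $x = 0$ is an assumed data value:

```python
import math

def norm_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def responsibilities(x, w, mu, var):
    """E-step for one point: posterior probability of each component,
    i.e. the weighted component densities normalized to sum to 1."""
    joint = [w[j] * norm_pdf(x, mu[j], var[j]) for j in range(len(w))]
    total = sum(joint)
    return [p / total for p in joint]

# Initial parameters from the text; x = 0 is an assumed data point.
lam = responsibilities(0.0, [1 / 3, 2 / 3], [0.25, 2.5], [0.25 ** 2, 1.25])
print([round(v, 3) for v in lam])  # [0.943, 0.057]
```

Because of the normalizing denominator, each point's responsibilities always sum to 1, which is what lets the cluster weights later be recovered as averages.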
(3) Similarly, $x_2$ is assigned to cluster 1 and $x_4, x_5, x_6$ are assigned to cluster 2, so in the end we again obtain
$$B_1 = \{x_1, x_2\},\quad B_2 = \{x_3, x_4, x_5, x_6\}.$$
The assignments no longer change, which means the algorithm has converged and the solution is complete.
But what if it had not yet converged? Suppose we had instead obtained
$$B_1 = \{x_1, x_2, x_3\},\quad B_2 = \{x_4, x_5, x_6\}.$$
Then we would update the Gaussian model parameters with another M-step.
Again compute the responsibility $\lambda_{ij}$, the degree to which data point $i$ belongs to cluster $j$:
$$\lambda_{ij} = \frac{w_j N(x_i;\mu_j,\sigma_j^2)}{\sum_{k=1}^{2} w_k N(x_i;\mu_k,\sigma_k^2)}$$
The two cluster weights are the averaged responsibilities:
$$w_1 = \frac{1}{6}\sum_{i=1}^{6}\lambda_{i1} = 0.312,\quad w_2 = \frac{1}{6}\sum_{i=1}^{6}\lambda_{i2} = 0.688$$
The Gaussian distributions of the two clusters are given by:
$$\mu_1 = \frac{\sum\limits_{i=1}^{6}\lambda_{i1}x_i}{\sum\limits_{i=1}^{6}\lambda_{i1}} = \frac{\lambda_{11}x_1 + \lambda_{21}x_2 + \dots + \lambda_{61}x_6}{\lambda_{11}+\lambda_{21}+\dots+\lambda_{61}} = 0.263$$
$$\mu_2 = \frac{\sum\limits_{i=1}^{6}\lambda_{i2}x_i}{\sum\limits_{i=1}^{6}\lambda_{i2}} = 2.424$$
$$\sigma_1 = \sqrt{\frac{\sum\limits_{i=1}^{6}\lambda_{i1}(x_i-\mu_1)^2}{\sum\limits_{i=1}^{6}\lambda_{i1}}} = 0.279$$
$$\sigma_2 = \sqrt{\frac{\sum\limits_{i=1}^{6}\lambda_{i2}(x_i-\mu_2)^2}{\sum\limits_{i=1}^{6}\lambda_{i2}}} = 1.180$$
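The whole update can be sketched as one E-step followed by one M-step, again assuming the data points $x = 0, 0.5, 1, 2, 3, 4$ consistent with the stated densities. It reproduces the weights, means, and standard deviations derived above (1.177 here vs. 1.180 in the text is rounding of intermediate responsibilities):

```python
import math

def norm_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def e_step(xs, w, mu, var):
    """Responsibility matrix lam[i][j]: normalized weighted densities."""
    lam = []
    for x in xs:
        joint = [w[j] * norm_pdf(x, mu[j], var[j]) for j in range(len(w))]
        total = sum(joint)
        lam.append([p / total for p in joint])
    return lam

def m_step(xs, lam):
    """Re-estimate weight, mean, and variance of each component."""
    n, k = len(xs), len(lam[0])
    w, mu, var = [], [], []
    for j in range(k):
        col = [row[j] for row in lam]
        s = sum(col)
        w.append(s / n)                               # averaged responsibility
        m = sum(l * x for l, x in zip(col, xs)) / s   # responsibility-weighted mean
        mu.append(m)
        var.append(sum(l * (x - m) ** 2 for l, x in zip(col, xs)) / s)
    return w, mu, var

xs = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0]   # assumed data, consistent with the text
lam = e_step(xs, [1 / 3, 2 / 3], [0.25, 2.5], [0.25 ** 2, 1.25])
w, mu, var = m_step(xs, lam)
print([round(v, 3) for v in w])                     # [0.312, 0.688]
print([round(v, 3) for v in mu])                    # [0.263, 2.424]
print([round(math.sqrt(v), 3) for v in var])        # std devs: [0.279, 1.177]
```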
Probability density:
$$\hat{P}(x) = w_1 N(x;\mu_1, \sigma_1^2) + w_2 N(x;\mu_2, \sigma_2^2) = 0.312\,N(x;0.263,\, 0.279^2) + 0.688\,N(x;2.424,\, 1.180^2)$$
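Putting the two steps together gives a complete (hypothetical) EM loop that alternates E- and M-steps until $E_{in}$ stops decreasing, started from the problem's initial parameters and the same assumed data:

```python
import math

def norm_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm(xs, w, mu, var, tol=1e-8, max_iter=200):
    """Run EM until E_in = -sum_n log P(x_n) stops improving by more than tol."""
    prev = float("inf")
    e_in = prev
    for _ in range(max_iter):
        # E-step: responsibilities, normalized per data point.
        lam = []
        for x in xs:
            joint = [w[j] * norm_pdf(x, mu[j], var[j]) for j in range(len(w))]
            total = sum(joint)
            lam.append([p / total for p in joint])
        # M-step: re-estimate weights, means, and variances.
        n = len(xs)
        for j in range(len(w)):
            col = [row[j] for row in lam]
            s = sum(col)
            w[j] = s / n
            mu[j] = sum(l * x for l, x in zip(col, xs)) / s
            var[j] = sum(l * (x - mu[j]) ** 2 for l, x in zip(col, xs)) / s
        # Convergence check on the negative log-likelihood.
        e_in = -sum(math.log(sum(w[j] * norm_pdf(x, mu[j], var[j])
                                 for j in range(len(w)))) for x in xs)
        if prev - e_in < tol:
            break
        prev = e_in
    return w, mu, var, e_in

xs = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0]   # assumed data, consistent with the text
w, mu, var, e_in = em_gmm(xs, [1 / 3, 2 / 3], [0.25, 2.5], [0.25 ** 2, 1.25])
print(round(e_in, 3))  # final E_in, strictly below the initial 9.748
```

EM never increases $E_{in}$, so the loop's stopping criterion is safe: each iteration either improves the fit or triggers the break.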