How to separate two Gaussian curves with Python?

2024-04-15

I measured the fluorescence intensity of thousands of particles and made a histogram, which shows two adjacent Gaussian curves. How can I use Python or one of its packages to separate them into two Gaussian curves and make two new plots?

Thanks.


Basically, you need to infer the parameters of a Gaussian mixture. I'll generate a similar dataset for illustration.

Generate a mixture with known parameters

from itertools import starmap

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib import mlab
sns.set(color_codes=True)
# inline plots in jupyter notebook
%matplotlib inline


# generate synthetic data from a mixture of two Gaussians with equal weights
# the solution below readily generalises to more components 
nsamples = 10000
means = [30, 120]
sds = [10, 50]
weights = [0.5, 0.5]
draws = np.random.multinomial(nsamples, weights)
samples = np.concatenate(
    list(starmap(np.random.normal, zip(means, sds, draws)))
)

Plot the distribution

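# sns.distplot is deprecated since seaborn 0.11; histplot with a KDE overlay is the closest replacement
sns.histplot(samples, kde=True, stat="density")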

Infer the parameters

from sklearn.mixture import GaussianMixture

mixture = GaussianMixture(n_components=2).fit(samples.reshape(-1, 1))
means_hat = mixture.means_.flatten()
weights_hat = mixture.weights_.flatten()
sds_hat = np.sqrt(mixture.covariances_).flatten()

print(mixture.converged_)
print(means_hat)
print(sds_hat)
print(weights_hat)

We get:

True
[ 122.57524745   29.97741112]
[ 48.18013893  10.44561398]
[ 0.48559771  0.51440229]
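Note that the component order in this output is arbitrary: the component with mean near 122.6 is listed first, although the data were generated with the mean-30 component first. The following is a minimal, self-contained sketch (not part of the original answer) that sorts the fitted components by mean and then splits the samples into two groups with `predict`. The seed, the variable names `means_sorted`, `low_group` and `high_group`, and the hard assignment itself are illustrative assumptions; in the region where the two curves overlap, a hard split is only approximate.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic data similar to the example above (the seed is an arbitrary choice)
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(30, 10, 5000), rng.normal(120, 50, 5000)])
X = samples.reshape(-1, 1)

mixture = GaussianMixture(n_components=2, random_state=0).fit(X)

# Component order is arbitrary, so sort everything by increasing mean
order = np.argsort(mixture.means_.ravel())
means_sorted = mixture.means_.ravel()[order]
sds_sorted = np.sqrt(mixture.covariances_.ravel())[order]
weights_sorted = mixture.weights_[order]

# Hard assignment: each sample goes to its most probable component
labels = mixture.predict(X)
low_group = samples[labels == order[0]]   # samples attributed to the lower-mean component
high_group = samples[labels == order[1]]  # samples attributed to the higher-mean component

print(means_sorted, sds_sorted, weights_sorted)
print(len(low_group), len(high_group))
```

If soft assignments are more appropriate, `mixture.predict_proba(X)` returns the per-component membership probabilities instead of a single label.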

You can tune the GaussianMixture hyperparameters to improve the fit, but this looks good enough. Now we can plot each component (I only plot the first one):

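from scipy.stats import norm  # mlab.normpdf was removed in Matplotlib 3.1

mu1_h, sd1_h = means_hat[0], sds_hat[0]
# evaluate the density over mean +/- 3 standard deviations
x_axis = np.linspace(mu1_h-3*sd1_h, mu1_h+3*sd1_h, 1000)
plt.plot(x_axis, norm.pdf(x_axis, mu1_h, sd1_h))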
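Since the question asks for both curves and for separate plots, here is one possible self-contained sketch that extends the single-component plot. It is not from the original answer: the seed, the three-panel layout and the names `component_pdfs` and `mixture_pdf` are illustrative assumptions. Each component density is scaled by its estimated weight so that the two curves sum to the fitted mixture density drawn over the histogram.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
from sklearn.mixture import GaussianMixture

# Synthetic data similar to the example above (the seed is an arbitrary choice)
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(30, 10, 5000), rng.normal(120, 50, 5000)])

mixture = GaussianMixture(n_components=2, random_state=0).fit(samples.reshape(-1, 1))
means_hat = mixture.means_.ravel()
sds_hat = np.sqrt(mixture.covariances_.ravel())
weights_hat = mixture.weights_

x_axis = np.linspace(samples.min(), samples.max(), 1000)
# Row k holds weight_k * N(x; mean_k, sd_k); broadcasting gives shape (2, 1000)
component_pdfs = weights_hat[:, None] * norm.pdf(
    x_axis, means_hat[:, None], sds_hat[:, None]
)
mixture_pdf = component_pdfs.sum(axis=0)

fig, axes = plt.subplots(1, 3, figsize=(15, 4), sharex=True)

# Panel 1: histogram with the fitted mixture and both weighted components
axes[0].hist(samples, bins=100, density=True, alpha=0.4)
axes[0].plot(x_axis, mixture_pdf, color="k")
for pdf in component_pdfs:
    axes[0].plot(x_axis, pdf, linestyle="--")
axes[0].set_title("histogram and fitted mixture")

# Panels 2 and 3: one weighted component each
for ax, pdf, mu, sd in zip(axes[1:], component_pdfs, means_hat, sds_hat):
    ax.plot(x_axis, pdf)
    ax.set_title(f"component: mean={mu:.1f}, sd={sd:.1f}")

fig.tight_layout()
# Outside a notebook, call plt.show() or fig.savefig(...) to see the figure
```

Dropping the `weights_hat` factor plots each component as a normalised density on its own, which is what the single-component snippet above does.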

P.S.

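A side note. It looks like you are dealing with constrained data, and your observations are quite close to the left constraint (zero). While Gaussians might approximate your data well enough, you should tread carefully, because Gaussians assume unconstrained geometry.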
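One quick way to gauge how much this matters is to check how much probability each fitted Gaussian places below zero, where no intensity can occur. The sketch below is an illustration rather than part of the original answer; it uses the estimates printed earlier for the synthetic data (rounded), and the name `mass_below_zero` is made up for this example.

```python
from scipy.stats import norm

# Estimates copied (rounded) from the output printed above for the synthetic data
means_hat = [122.58, 29.98]
sds_hat = [48.18, 10.45]

# Probability that each fitted Gaussian assigns to impossible negative intensities
mass_below_zero = [
    norm.cdf(0, loc=mu, scale=sd) for mu, sd in zip(means_hat, sds_hat)
]

for mu, sd, mass in zip(means_hat, sds_hat, mass_below_zero):
    print(f"mean={mu}, sd={sd}: P(x < 0) = {mass:.4f}")
```

For the synthetic mixture both values are under one percent, so the approximation is harmless there. For real data whose components sit closer to zero the leaked mass can be larger, and that is the case where the Gaussian assumption deserves a second look.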
