好的,就在这里。让我们从整数问题开始。最简单的方法是使用统计分布,它是自然产生的一组数字总和为一个固定值。并且有这样的分布——多项分布 https://en.wikipedia.org/wiki/Multinomial_distribution。它的固定总和等于n
,它提供从 0 到n
。因为要求采样间隔是任意的,所以首先我们将间隔移动到最小值为0,采样然后将其移回。请注意,采样值可能高于所需的最大值,因此我们必须使用拒绝技术,其中任何高于最大值的样本都将被拒绝,并且将对下一次尝试进行采样。
我们使用 Python/Numpy 的多项式采样。除了拒绝之外,您还可以添加独特性测试。代码,Python 3.7
import numpy as np
def sample_sum_interval(n: int, p, maxv: int):
while True:
q = np.random.multinomial(n, p, size=1)[0]
v = np.where(q > maxv)
if len(v[0]) == 0: # if len(v) > 0, some values are outside the range, reject
# test on unique if len(np.unique(q)) == len(q)
return q
return None
k = 6
min = -8
max = 13
sum = 13
n = sum - k*min # redefined sum
maxv = max - min # redefined max, min would be 0
p = np.full((k), np.float64(1.0)/np.float64(k), dtype=np.float64) # probabilities
q = sample_sum_interval(n, p, maxv) + min # back to original interval
print(q)
print(np.sum(q))
print(np.mean(q))
q = sample_sum_interval(n, p, maxv) + min
print(q)
print(np.sum(q))
print(np.mean(q))
q = sample_sum_interval(n, p, maxv) + min
print(q)
print(np.sum(q))
print(np.mean(q))
Output
[ 5 0 -2 3 3 4]
13
2.1666666666666665
[ 3 3 -1 2 1 5]
13
2.1666666666666665
[-4 0 0 3 10 4]
13
2.1666666666666665
为了将其转换为 JavaScript,您需要多项式采样或二项式采样,从二项式很容易转换为多项式。
UPDATE
好的,这是我不添加时的输出min
结果,总和为 61 (13+6*8),
范围 [0...21]
[11 7 6 8 9 20]
61
10.166666666666666
[ 5 10 14 13 14 5]
61
10.166666666666666
[ 9 12 7 15 7 11]
61
10.166666666666666
显然,有一个具有多项式采样的 Javascript 库 https://github.com/R-js/libRmath.js,其仿照R,
谢谢你@bluelovers
应该像这样在循环中调用它:
v = rmultinom(1, n, p);
进而v
应检查是否在范围 [0...maxv] 内,如果超出则接受或拒绝。
更新二
让我快速地(抱歉,真的没有时间,明天再说)描述一下我将如何为浮动做这件事的想法。在 [0...1] 范围内生成的一组数字也有类似的分布,称为狄利克雷分布 https://en.wikipedia.org/wiki/Dirichlet_distribution,并且它的总和始终为固定值 1。在 Python/Numpy 中,它是https://docs.scipy.org/doc/numpy-1.15.1/reference/ generated/numpy.random.dirichlet.html https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.dirichlet.html.
Suppose I sampled n
numbers di from Dirichlet and then map them to [min...max] interval:
xi = min + di*(max-min)
Then I sum them all, using property that all di sums to 1:
总和 = n*min + (max - min) = (n-1)*min + max
If Sum
is fixed, then it mean we have to redefine maximum value - lets call it sampling maxs.
So sampling procedure would be as following - sample n
[0...1] numbers from Dirichlet, map them to [min...maxs] interval, and check if any of those numbers are below desired max
(original, not redefined). If it is, you accept, otherwise reject, like in integer case.
代码如下
import numpy as np
def my_dirichlet(n: int):
"""
This is equivalent to numpy.random.dirichlet when all alphas are equal to 1
"""
q = np.random.exponential(scale = np.float64(1.0), size=n)
norm = 1.0/np.sum(q)
return norm * q
def sample_sum_interval(n: int, summa: np.float64, minv: np.float64, maxv: np.float64):
maxs = summa - np.float64(n-1)*minv # redefine maximum value of the interval is sum is fixed
alpha = np.full(n, np.float64(1.0), dtype=np.float64)
while True:
q = my_dirichlet(n) # q = np.random.dirichlet(alpha)
q = minv + q*(maxs - minv) # we map it to [minv...maxs]
v = np.where(q > maxv) # but we need it in the [minv...maxv], so accept or reject test
if len(v[0]) == 0: # if len(v) > 0, some values are outside the range, reject, next sample
return q
return None
n = 5
min = np.float64(-2.0)
max = np.float64(3.0)
sum = np.float64(1.0)
q = sample_sum_interval(n, sum, min, max)
print(q)
print(np.sum(q))
q = sample_sum_interval(n, sum, min, max)
print(q)
print(np.sum(q))
q = sample_sum_interval(n, sum, min, max)
print(q)
print(np.sum(q))
我已经放置了标准 NumPy Dirichlet 采样以及自定义 Dirichlet 采样。显然,libRmath.js有指数分布采样,但没有狄利克雷,但可以用用户定义的代码和指数替换。请记住,NumPy 使用单个运算符对向量进行运算,循环是隐式的。
Output:
[-0.57390094 -1.80924001 0.47630282 0.80008638 2.10675174]
1.0000000000000013
[-1.12192892 1.18503129 0.97525135 0.69175429 -0.73010801]
0.9999999999999987
[-0.34803521 0.36499743 -1.165332 0.9433809 1.20498888]
0.9999999999999991