- Using `tf.nn.fixed_unigram_candidate_sampler`: the `unigrams` argument gives the per-class probability weight, i.e. the distribution, which can be expressed as integers:

```python
import tensorflow as tf
import numpy as np

sess = tf.Session()
V = tf.constant(np.array([[10, 30, 20, 50]]), dtype=tf.int64)
sampled_ids, true_expected_count, sampled_expected_count = tf.nn.fixed_unigram_candidate_sampler(
    true_classes=V,   # the true classes to sample against
    num_true=4,       # number of true classes per example
    num_sampled=50,   # number of samples to draw
    unique=False,
    range_max=4,
    unigrams=[20, 30, 10, 40]  # this is P, times 100
)  # returns the sampled ids, i.e. indices
# gather the actual values by index
sample = tf.gather(V[0], sampled_ids)
x = sess.run(sample)
print(x)
```
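Conceptually, the sampler above draws 50 class ids in proportion to the unnormalized weights in `unigrams`, then the values are gathered by id. A minimal NumPy sketch of that idea (illustrative only, not the TF implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
V = np.array([10, 30, 20, 50])
weights = np.array([20, 30, 10, 40])   # unnormalized integer weights, like `unigrams`
p = weights / weights.sum()            # normalize into a probability distribution
sampled_ids = rng.choice(len(V), size=50, p=p)  # draw 50 class ids
sample = V[sampled_ids]                # gather values by index, like tf.gather
print(sample)
```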
- When the distribution is expressed as floats instead, you can sample it manually through the cumulative distribution:

```python
import tensorflow as tf
import numpy as np

sess = tf.Session()
k = 50  # number of samples you want
V = tf.constant([10, 30, 20, 50], dtype=tf.float32)      # values
P = tf.constant([0.2, 0.3, 0.1, 0.4], dtype=tf.float32)  # prob dist
cum_dist = tf.cumsum(P)  # create cumulative probability distribution
# get random values between 0 and the max of cum_dist;
# we'll determine where each falls in the cumulative distribution
rand_unif = tf.random_uniform(shape=(k,), minval=0.0,
                              maxval=tf.reduce_max(cum_dist), dtype=tf.float32)
# boolean matrix signaling where the random number exceeds cum_dist
# (broadcasting builds the Cartesian product)
greater = tf.expand_dims(rand_unif, axis=-1) > tf.expand_dims(cum_dist, axis=0)
# the index is the count of exceeded thresholds in each row
idxs = tf.reduce_sum(tf.cast(greater, dtype=tf.int64), 1)
# then just gather the samples from V by index
sample = tf.gather(V, idxs)
print(sess.run(sample))
```
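The same inverse-CDF trick is easy to check in plain NumPy (a sketch, not TF code): count how many cumulative thresholds each uniform draw exceeds and use that count as the index.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 50
V = np.array([10, 30, 20, 50], dtype=np.float32)
P = np.array([0.2, 0.3, 0.1, 0.4], dtype=np.float32)
cum_dist = np.cumsum(P)  # cumulative distribution: [0.2, 0.5, 0.6, 1.0]
rand_unif = rng.uniform(0.0, cum_dist[-1], size=k)
# broadcast to a (k, 4) boolean matrix, then count True per row
greater = rand_unif[:, None] > cum_dist[None, :]
idxs = greater.sum(axis=1)  # index = number of thresholds exceeded
sample = V[idxs]
print(sample)
```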
- Using `tf.distributions.Categorical()`:

```python
import tensorflow as tf

sess = tf.InteractiveSession()  # .eval() below needs a default session

# Probability distribution
P = [0.2, 0.3, 0.1, 0.4]
# Vector of values
V = [10, 30, 20, 50]
# Define categorical distribution
dist = tf.distributions.Categorical(probs=P)
# Generate a sample from the categorical distribution - this serves as an index
index = dist.sample().eval()
# Fetch the value at V[index] as the sample
sample = V[index]
```

The code above can be collapsed into a single line:

```python
sample = V[tf.distributions.Categorical(probs=P).sample().eval()]
```

To draw K samples from the distribution P:

```python
samples = [V[tf.distributions.Categorical(probs=P).sample().eval()] for i in range(K)]
```

This looping version is inefficient. A more efficient approach is to set the `sample_shape` argument of `sample`, which draws a `sample_shape`-sized batch in one call:

```python
samples = dist.sample(sample_shape=[17, 17])  # a [17, 17] matrix of indices, each drawn from P
```
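To turn a whole batch of sampled indices back into values, gather from V once instead of looping. A NumPy sketch of that vectorized pattern (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
P = np.array([0.2, 0.3, 0.1, 0.4])
V = np.array([10, 30, 20, 50])
# one vectorized draw of a [17, 17] batch of indices, like sample_shape=[17, 17]
indices = rng.choice(len(P), size=(17, 17), p=P)
samples = V[indices]  # fancy indexing plays the role of tf.gather
print(samples.shape)
```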
- Using `tf.multinomial` to sample:

```python
elems = tf.convert_to_tensor([1, 2, 3, 5])
samples = tf.multinomial(tf.log([[0.1, 0, 0.3, 0.6]]), 1)  # note: log-probabilities
elems[tf.cast(samples[0][0], tf.int32)].eval()
# Out: 1
elems[tf.cast(samples[0][0], tf.int32)].eval()
# Out: 5
```

Its signature:

```python
tf.random.multinomial(
    logits,        # log-probabilities, shape [batch_size, num_classes]
    num_samples,   # number of samples to draw; must be an integer
    seed=None,
    name=None,
    output_dtype=None
)
```
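Because `tf.multinomial` takes log-probabilities, a zero probability becomes `-inf` and that class is never drawn. A NumPy sketch of the same behavior (illustrative, not the TF implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
elems = np.array([1, 2, 3, 5])
probs = np.array([0.1, 0.0, 0.3, 0.6])
with np.errstate(divide="ignore"):
    logits = np.log(probs)         # log(0) -> -inf, so class 1 gets zero weight
p = np.exp(logits - logits.max())  # back to unnormalized probabilities; exp(-inf) == 0
p /= p.sum()
draws = elems[rng.choice(len(elems), size=1000, p=p)]
# the value 2 never appears: its probability is exactly 0
print(sorted(set(draws.tolist())))
```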