【Alias Method for Sampling】原理
对于处理离散分布的随机变量的取样问题,Alias Method for Sampling 是一种很高效的方式。
在初始好之后,每次取样的复杂度为 O(1)。
、、、
【Python 代码】
# !/usr/bin/env python# encoding: utf-8__author__ = 'ScarlettZero'# 20180522# AliasMethod Samplingimport timeimport numpy as npimport pandas as pdimport numpy.random as nprdef alias_setup(probs): ''' :param probs: 某个概率分布 :return: Alias数组与Prob数组 ''' K =len(probs) # K为类别数目 Prob =np.zeros(K) # 对应Prob数组:落在原类型的概率 Alias =np.zeros(K,dtype=np.int) # 对应Alias数组:每一列第二层的类型 #Sort the data into the outcomes with probabilities #that are larger and smaller than 1/K smaller =[] # 存储比1小的列 larger =[] # 存储比1大的列 for kk,prob in enumerate(probs): Prob[kk] =K*prob # 概率(每个类别概率乘以K,使得总和为K) if Prob[kk] <1.0: # 然后分为两类:大于1的和小于1的 smaller.append(kk) else: larger.append(kk) # Loop though and create little binary mixtures that appropriately allocate # the larger outcomes over the overall uniform mixture. #通过拼凑,将各个类别都凑为1 while len(smaller) > 0 and len(larger) > 0: small = smaller.pop() large = larger.pop() Alias[small] = large #填充Alias数组 Prob[large] = Prob[large]-(1.0 - Prob[small]) #将大的分到小的上 if Prob[large] <1.0: smaller.append(large) else: larger.append(large) print("Prob is :", Prob) print("Alias is :", Alias) return Alias,Probdef alias_draw(Alias,Prob): ''' :param J: Alias数组 :param q: Prob数组 :return:一次采样结果 ''' K=len(Alias) # Draw from the overall uniform mixture. kk = int(np.floor(npr.rand()*K)) #随机取一列 # Draw from the binary mixture, either keeping the small one, or choosing the associated larger one. # 采样过程:随机取某一列k(即[1,4]的随机整数,再随机产生一个[0-1]的小数c,) # 如果Prob[kk]大于c, if npr.rand()
运行结果:
【Reference】
1、
2、