Let $\pi_1, \pi_2, \ldots$ be a well-defined infinite discrete probability distribution (e.g., a draw from a Dirichlet process (DP)). We are interested in evaluating expectations of the following form: $E\left[\sum_i \pi_i f(\pi_i)\right]$ for some function $f$ (we are especially interested in $f(x) = -\log x$, which gives us Shannon's entropy). Following Perman, Pitman & Yor (1992), we can re-write it as
$$E\left[\sum_i \pi_i f(\pi_i)\right] = E\left[f(\tilde\pi_1)\right]$$
where $\tilde\pi_1$ is a random variable that takes the value $\pi_i$ with probability $\pi_i$. This random variable is better known as the first size-biased sample (Perman, Pitman & Yor, 1992). It is defined by $P(\tilde\pi_1 = \pi_i \mid \pi_1, \pi_2, \ldots) = \pi_i$. In other words, it takes the value of one of the probabilities among $\pi_1, \pi_2, \ldots$, picking $\pi_i$ with probability $\pi_i$.
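As a quick sanity check, the identity $\sum_i \pi_i f(\pi_i) = E[f(\tilde\pi_1)]$ can be verified by Monte Carlo on a finite distribution; the particular distribution and sample size below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# A small finite distribution standing in for pi (illustrative values).
pi = np.array([0.5, 0.25, 0.125, 0.125])
f = lambda x: -np.log(x)  # the choice of f that yields Shannon entropy

# Exact value of sum_i pi_i * f(pi_i), i.e. the entropy here.
exact = np.sum(pi * f(pi))

# Size-biased estimate: draw index I with P(I = i) = pi_i, average f(pi_I).
idx = rng.choice(len(pi), size=200_000, p=pi)
mc = f(pi[idx]).mean()

print(exact, mc)  # the two agree up to Monte Carlo error
```

The point is that a simple average of $f$ evaluated at size-biased draws replaces the weighted sum over the (possibly infinite) support.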
For the Pitman-Yor process (PY) with discount parameter $d$ and concentration parameter $\alpha$ (the Dirichlet process is the special case $d = 0$), the size-biased samples are naturally obtained by the stick-breaking construction. Given a sequence of independent random variables $v_i$ distributed as $\mathrm{Beta}(1 - d, \alpha + i d)$, if we define $\pi_i = v_i \prod_{j=1}^{i-1} (1 - v_j)$, then the set of $\{\pi_i\}$ is invariant to size-biased permutation (Pitman & Yor, 1997), and they form a sequence of size-biased samples. In our case, we only need the first size-biased sample, $\tilde\pi_1 = \pi_1 = v_1$, which is simply distributed as $\mathrm{Beta}(1 - d, \alpha + d)$.
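The stick-breaking construction above is easy to simulate; a minimal sketch (the parameter values and truncation level are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

def py_sticks(d, alpha, n, rng):
    """First n weights of a Pitman-Yor(d, alpha) draw via stick breaking:
    v_i ~ Beta(1 - d, alpha + i*d),  pi_i = v_i * prod_{j<i} (1 - v_j)."""
    i = np.arange(1, n + 1)
    v = rng.beta(1.0 - d, alpha + i * d)
    return v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))

d, alpha = 0.3, 2.0  # hypothetical parameters for illustration
pi = py_sticks(d, alpha, 1000, rng)
print(pi.sum())  # close to 1: nearly all mass lies in the first 1000 sticks

# The first weight pi_1 = v_1 is the first size-biased sample,
# distributed as Beta(1 - d, alpha + d); check its mean empirically
# against the Beta mean (1 - d) / (1 + alpha).
first = rng.beta(1.0 - d, alpha + d, size=100_000)
print(first.mean(), (1.0 - d) / (1.0 + alpha))
```

Note that the first stick is a draw from $\mathrm{Beta}(1 - d, \alpha + d)$ directly, with no truncation error, which is exactly why the first size-biased sample is so convenient.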
Using this trick, we can compute the expected entropy of a PY draw without complicated integrals over the infinite-dimensional simplex. We used this identity and its extension for computing the PY-based entropy estimator.
- Jim Pitman, Marc Yor. The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator. The Annals of Probability, Vol. 25, No. 2. (April 1997), pp. 855-900, doi:10.1214/aop/1024404422
- Mihael Perman, Jim Pitman, Marc Yor. Size-biased sampling of Poisson point processes and excursions. Probability Theory and Related Fields, Vol. 92, No. 1. (21 March 1992), pp. 21-39, doi:10.1007/BF01205234