Let $\pi = (\pi_1, \pi_2, \ldots)$ be a well-defined infinite discrete probability distribution (e.g., a draw from a Dirichlet process (DP)). We are interested in evaluating expectations of the following form: $E\left[\sum_{i=1}^\infty f(\pi_i)\right]$ for some function $f$ (we are especially interested in $f(x) = -x \log x$, which gives Shannon's entropy). Following [1], we can rewrite it as

$$E\left[\sum_{i=1}^\infty f(\pi_i)\right] = E\left[\frac{f(\tilde\pi_1)}{\tilde\pi_1}\right]$$
where $\tilde\pi_1$ is a random variable that takes the value $\pi_i$ with probability $\pi_i$. This random variable is better known as the **first size-biased sample**. It is defined by $P(\tilde\pi_1 = \pi_i \mid \pi) = \pi_i$. In other words, it takes one of the probabilities among $(\pi_1, \pi_2, \ldots)$ with probability $\pi_i$.
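To make the identity concrete, here is a small sketch (in Python, using a finite distribution rather than an infinite one, and helper names of my own) that checks $E[f(\tilde\pi_1)/\tilde\pi_1] = \sum_i f(\pi_i)$ by Monte Carlo for the entropy case $f(x) = -x\log x$:

```python
import math
import random

def entropy_direct(p):
    """Shannon entropy via the direct sum of f(p_i) = -p_i * log(p_i)."""
    return -sum(pi * math.log(pi) for pi in p)

def entropy_size_biased(p, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E[f(pb)/pb], where pb is a size-biased
    sample: it equals p_i with probability p_i.  For f(x) = -x*log(x),
    the summand simplifies to f(pb)/pb = -log(pb)."""
    rng = random.Random(seed)
    samples = rng.choices(p, weights=p, k=n_samples)
    return sum(-math.log(pb) for pb in samples) / n_samples

p = [0.5, 0.25, 0.125, 0.125]
print(entropy_direct(p))       # exact: 1.75 * ln 2 ≈ 1.2130 nats
print(entropy_size_biased(p))  # Monte Carlo, converges to the same value
```

The two numbers agree up to Monte Carlo error, since averaging $-\log \tilde\pi_1$ over size-biased draws is exactly $\sum_i \pi_i \cdot (-\log \pi_i)$.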

For the Pitman-Yor process (PY) with discount parameter $d$ and concentration parameter $\alpha$ (the Dirichlet process is the special case where $d = 0$), the size-biased samples are naturally obtained by the **stick breaking** construction. Given a sequence of independent random variables distributed as $V_i \sim \mathrm{Beta}(1-d, \alpha + i d)$, if we define $\tilde\pi_i = V_i \prod_{j=1}^{i-1} (1 - V_j)$, then the sequence of $\tilde\pi_i$'s is invariant to size-biased permutation [2], and they form a sequence of size-biased samples. In our case, we only need the first size-biased sample, which is simply distributed as $\tilde\pi_1 = V_1 \sim \mathrm{Beta}(1-d, \alpha+d)$.
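The stick-breaking construction is easy to simulate. The following sketch (a helper of my own, standard library only) draws the size-biased sticks for a PY$(d, \alpha)$ and checks the marginal of the first stick against its stated $\mathrm{Beta}(1-d, \alpha+d)$ law via its mean, $(1-d)/(\alpha+1)$:

```python
import random

def py_stick_breaking(d, alpha, n_sticks, rng):
    """Draw the first n_sticks size-biased probabilities of a PY(d, alpha)
    draw via stick breaking: V_i ~ Beta(1 - d, alpha + i*d) and
    pi_i = V_i * prod_{j<i} (1 - V_j)."""
    remaining = 1.0  # unbroken portion of the stick
    pis = []
    for i in range(1, n_sticks + 1):
        v = rng.betavariate(1.0 - d, alpha + i * d)
        pis.append(v * remaining)
        remaining *= 1.0 - v
    return pis

rng = random.Random(1)
d, alpha = 0.5, 1.0
first_sticks = [py_stick_breaking(d, alpha, 1, rng)[0] for _ in range(100_000)]
mean_first = sum(first_sticks) / len(first_sticks)
# mean_first should approach E[Beta(1-d, alpha+d)] = (1-d)/(alpha+1) = 0.25
print(mean_first)
```

Note that the full sequence of sticks never sums exactly to one for finite `n_sticks`; only the first stick is needed for the expectation trick above.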

Using this trick, we can compute the entropy of the PY without complicated integrals over the infinite-dimensional simplex: for $f(x) = -x \log x$ we have $f(x)/x = -\log x$, so $E[H(\pi)] = E[-\log \tilde\pi_1]$ with $\tilde\pi_1 \sim \mathrm{Beta}(1-d, \alpha+d)$. We used this trick and its extension for computing the PY-based entropy estimator.
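As a sanity check on this recipe: for $X \sim \mathrm{Beta}(a, b)$ it is known that $E[-\log X] = \psi(a+b) - \psi(a)$, so with $a = 1-d$, $b = \alpha+d$ the expected entropy should be $\psi(\alpha+1) - \psi(1-d)$. The sketch below (standard library only; the digamma routine is a quick recurrence-plus-asymptotic approximation of mine, not a vetted special-function implementation) compares this closed form against the Monte Carlo average of $-\log \tilde\pi_1$:

```python
import math
import random

def digamma(x):
    """Digamma psi(x): shift x up by the recurrence psi(x+1) = psi(x) + 1/x,
    then apply the asymptotic expansion (accurate enough for this check)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    result += (math.log(x) - 1.0 / (2.0 * x)
               - 1.0 / (12.0 * x**2) + 1.0 / (120.0 * x**4)
               - 1.0 / (252.0 * x**6))
    return result

def py_expected_entropy_exact(d, alpha):
    """Closed form E[H(pi)] = psi(alpha + 1) - psi(1 - d)."""
    return digamma(alpha + 1.0) - digamma(1.0 - d)

def py_expected_entropy_mc(d, alpha, n_samples=200_000, seed=0):
    """Monte Carlo E[-log pb1] with pb1 ~ Beta(1 - d, alpha + d)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += -math.log(rng.betavariate(1.0 - d, alpha + d))
    return total / n_samples

d, alpha = 0.3, 2.0
print(py_expected_entropy_exact(d, alpha))  # ≈ 2.1428
print(py_expected_entropy_mc(d, alpha))     # agrees up to Monte Carlo error
```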

1. Jim Pitman, Marc Yor. **The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator.** The Annals of Probability, Vol. 25, No. 2 (April 1997), pp. 855-900. doi:10.1214/aop/1024404422
2. Mihael Perman, Jim Pitman, Marc Yor. **Size-biased sampling of Poisson point processes and excursions.** Probability Theory and Related Fields, Vol. 92, No. 1 (March 1992), pp. 21-39. doi:10.1007/BF01205234