Let $(p_1, p_2, \ldots)$ be a well-defined infinite discrete probability distribution (e.g., a draw from a Dirichlet process (DP)). We are interested in evaluating expectations of the following form:

$$E\left[\sum_{i=1}^{\infty} f(p_i)\right]$$

for some function $f$ (we are especially interested in $f(p) = -p \log p$, which gives us Shannon's entropy). Following [1], we can re-write it as

$$E\left[\sum_{i=1}^{\infty} f(p_i)\right] = E\left[\frac{f(\tilde{p}_1)}{\tilde{p}_1}\right],$$

where $\tilde{p}_1$ is a random variable that takes the value $p_i$ with probability $p_i$. This random variable $\tilde{p}_1$ is better known as the first size-biased sample. It is defined by $P(\tilde{p}_1 = p_i \mid p_1, p_2, \ldots) = p_i$. In other words, it takes one of the probabilities $p_i$ among $(p_1, p_2, \ldots)$ with probability $p_i$.
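The identity is easy to check numerically on a finite distribution: draw an index with probability $p_i$, then average $f(\tilde{p}_1)/\tilde{p}_1$; this recovers $\sum_i f(p_i)$. A minimal sketch (the distribution `p` and the sample size are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# A finite "truth" distribution standing in for (p_1, p_2, ...).
p = np.array([0.5, 0.25, 0.125, 0.125])

def f(x):
    # f(p) = -p log p, the summand for Shannon entropy.
    return -x * np.log(x)

# Direct evaluation of sum_i f(p_i).
direct = f(p).sum()

# Size-biased estimate: draw p_tilde with P(p_tilde = p_i) = p_i,
# then average f(p_tilde) / p_tilde.
idx = rng.choice(len(p), size=200_000, p=p)
p_tilde = p[idx]
estimate = (f(p_tilde) / p_tilde).mean()

print(direct, estimate)  # the two agree up to Monte Carlo error
```

Note that for $f(p) = -p \log p$ the ratio $f(\tilde{p}_1)/\tilde{p}_1$ simplifies to $-\log \tilde{p}_1$, so the estimator only ever needs the value of the size-biased draw itself.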
For the Pitman-Yor process (PY) with discount parameter $d$ and concentration parameter $\alpha$ (the Dirichlet process is a special case where $d = 0$), the size-biased samples are naturally obtained from the stick-breaking construction. Given a sequence of independent random variables $v_1, v_2, \ldots$ distributed as $v_i \sim \mathrm{Beta}(1 - d, \alpha + i d)$, if we define

$$\tilde{p}_i = v_i \prod_{j=1}^{i-1} (1 - v_j),$$

then the set of $\{\tilde{p}_i\}$ is invariant under size-biased permutation [2], and the $\tilde{p}_i$ form a sequence of size-biased samples. In our case, we only need the first size-biased sample, which is simply distributed as $\tilde{p}_1 = v_1 \sim \mathrm{Beta}(1 - d, \alpha + d)$.
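The stick-breaking recipe above can be sketched in a few lines; the function name and the parameter values are my own illustrative choices, while the Beta parameters follow the construction stated above:

```python
import numpy as np

def py_stick_breaking(d, alpha, n, rng):
    """Draw the first n size-biased weights of a PY(d, alpha) draw
    via stick breaking: v_i ~ Beta(1 - d, alpha + i*d) and
    p_tilde_i = v_i * prod_{j < i} (1 - v_j)."""
    i = np.arange(1, n + 1)
    v = rng.beta(1 - d, alpha + i * d)
    # Length of stick remaining before break i: prod_{j < i} (1 - v_j).
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    return v * remaining

rng = np.random.default_rng(0)
w = py_stick_breaking(d=0.3, alpha=1.0, n=20, rng=rng)
print(w.sum())  # partial sums approach 1 as n grows
```

The weights are positive and sum to less than one for any finite `n`, since a positive fraction of the stick always remains unbroken.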
Using this trick, we can compute the entropy of the PY process without the complicated integrals over the infinite-dimensional simplex. We used this identity and its extension to compute the PY-based entropy estimator.
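Concretely, for $f(p) = -p \log p$ the identity reduces the expected entropy to $E[-\log \tilde{p}_1]$ with $\tilde{p}_1 \sim \mathrm{Beta}(1 - d, \alpha + d)$, and since $E[-\log X] = \psi(a + b) - \psi(a)$ for $X \sim \mathrm{Beta}(a, b)$, this gives the closed form $\psi(\alpha + 1) - \psi(1 - d)$. A Monte Carlo sketch checking the two against each other (the parameter values are arbitrary; SciPy's `digamma` is used for $\psi$):

```python
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(1)
d, alpha = 0.3, 1.0

# First size-biased sample of a PY(d, alpha) draw: p1 ~ Beta(1-d, alpha+d).
p1 = rng.beta(1 - d, alpha + d, size=500_000)

# Entropy via the size-biased identity: E[f(p1)/p1] = E[-log p1].
mc = -np.log(p1).mean()

# Closed form: psi(alpha + 1) - psi(1 - d).
exact = digamma(alpha + 1) - digamma(1 - d)

print(mc, exact)  # should agree up to Monte Carlo error
```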
- [1] Jim Pitman, Marc Yor. The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator. The Annals of Probability, Vol. 25, No. 2 (April 1997), pp. 855-900. doi:10.1214/aop/1024404422
- [2] Mihael Perman, Jim Pitman, Marc Yor. Size-biased sampling of Poisson point processes and excursions. Probability Theory and Related Fields, Vol. 92, No. 1 (March 1992), pp. 21-39. doi:10.1007/BF01205234