An asymptotic sampling formula for the coalescent with recombination

June, 2009
Report Number: 
Paul A. Jenkins & Yun S. Song

Ewens sampling formula (ESF) is a one-parameter family of probability distributions with a number of intriguing combinatorial connections. This elegant closed-form formula first arose in biology as the stationary probability distribution of a sample configuration at one locus under the infinite-alleles model of mutation. Since its discovery in the early 1970s, the ESF has been used in various biological applications, and has sparked several interesting mathematical generalizations. In the population genetics community, extending the underlying random-mating model to include recombination has received much attention in the past, but no general closed-form sampling formula is currently known even for the simplest extension, that is, a model with two loci. In this paper, we show that it is possible to obtain useful closed-form results in the case the population-scaled recombination rate rho is large but not necessarily infinite. Specifically, we consider an asymptotic expansion of the two-locus sampling formula in inverse powers of rho and obtain closed-form expressions for the first few terms in the expansion. Our asymptotic sampling formula applies to arbitrary sample sizes and configurations.

PDF File: