Allele age is the amount of time elapsed since an allele first appeared due to mutation. Estimating the time at which a certain allele appeared allows researchers to infer patterns of human migration, disease, and natural selection. Allele age can be estimated based on the frequency of the allele in a population and the genetic variation that occurs within different copies of the allele, also known as intra-allelic variation. While either of these methods can be used to estimate allele age, the use of both increases the accuracy of the estimation and can sometimes offer additional information regarding the presence of selection.
Estimating allele age from frequency rests on the expectation that, under neutrality, alleles at high frequency tend to be older than alleles at low frequency, because genetic drift usually needs many generations to carry a new mutation to high frequency. This is an average tendency rather than a rule, and many alleles of interest are under some type of selection. Because alleles under positive selection can rise to high frequency very quickly, a frequency-based estimate is only as good as the assumptions made about the mechanisms that drive allele frequency change: natural selection, gene flow, genetic drift, and mutation.
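To make the frequency argument concrete, here is a minimal sketch of the classical neutral expectation from Kimura and Ohta (1973), E[t] = -4 N_e p ln(p) / (1 - p) generations, for an allele at frequency p in a diploid population of constant effective size N_e. The function name and the example value of N_e = 10,000 are assumptions chosen for illustration. The formula assumes strict neutrality and constant population size, so it should not be trusted for alleles under selection.

```python
import math


def expected_neutral_age(freq, n_e):
    """Expected age, in generations, of a neutral allele at frequency `freq`.

    Uses the Kimura-Ohta (1973) diffusion result for a diploid population
    of constant effective size `n_e`:
        E[t] = -4 * n_e * p * ln(p) / (1 - p)
    """
    if not 0.0 < freq < 1.0:
        raise ValueError("freq must be strictly between 0 and 1")
    return -4.0 * n_e * freq * math.log(freq) / (1.0 - freq)


# Hypothetical effective population size, for illustration only.
N_E = 10_000
for p in (0.01, 0.1, 0.5):
    print(f"frequency {p}: about {expected_neutral_age(p, N_E):.0f} generations")
```

The expected age grows with frequency, which is the pattern described above. The variance around this expectation is large, so a single allele's frequency says little on its own.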
Estimating allele age from intra-allelic variation rests on the fact that, with every generation, recombination breaks down linkage between the allele and nearby variants, and new mutations add variation to the haplotypes that carry it. The more the copies of an allele differ from one another, the older the allele is likely to be. This kind of analysis depends on coalescent theory, and there are two approaches to it. First, a phylogenetics approach extrapolates an allele’s age by reconstructing a gene tree and dating the root of the tree. This approach is best when analyzing ancient, as opposed to recent, mutations. Second, a population genetics approach estimates allele age by using models of mutation, recombination, and demography instead of a gene tree. This type of approach is best for analyzing recent mutations.
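As a rough illustration of the recombination clock behind the population genetics approach, consider a marker at recombination fraction c from the allele. If the allele arose on a single haplotype t generations ago, the expected fraction of allele copies that still carry the original marker allele is about (1 - c)^t, which can be inverted to give t = ln(P) / ln(1 - c). The sketch below assumes this simplest possible model: it ignores mutation at the marker, the chance that a recombinant picks up the same marker allele again, and population growth. The function name and input values are hypothetical.

```python
import math


def age_from_linkage_decay(frac_ancestral, recomb_fraction):
    """Moment estimate of allele age, in generations, from linkage decay.

    frac_ancestral:  fraction of allele copies still carrying the original
                     marker allele (P).
    recomb_fraction: recombination fraction per generation between the
                     allele and the marker (c).

    Solves P = (1 - c) ** t for t.
    """
    if not 0.0 < frac_ancestral <= 1.0:
        raise ValueError("frac_ancestral must be in (0, 1]")
    if not 0.0 < recomb_fraction < 0.5:
        raise ValueError("recomb_fraction must be in (0, 0.5)")
    return math.log(frac_ancestral) / math.log(1.0 - recomb_fraction)


# Hypothetical data: 80% of copies keep the original marker allele at a
# marker 1 cM away (c = 0.01).
generations = age_from_linkage_decay(0.8, 0.01)
print(f"estimated age: about {generations:.1f} generations")
```

Multiplying the result by an assumed generation time converts it to years. Published estimators refine this idea by accounting for marker allele frequencies and demography.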
Recently, Albers and McVean proposed a non-parametric method to estimate the age of an allele, using probabilistic, coalescent-based models of mutation and recombination. Specifically, their method infers the time to the most recent common ancestor between hundreds or thousands of chromosomal sequence pairs. This information is then combined using a composite likelihood approach to obtain an estimate of the time of mutation at a single locus. This methodology was applied to more than 16 million variants in the human genome, using data from the 1000 Genomes Project and the Simons Genome Diversity Project, to generate the atlas of variant age.
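The idea of combining pairwise information can be illustrated with a toy sketch. This is not the authors' implementation, only one simplified way to show how a composite likelihood might be assembled. The underlying logic is that, if the mutation happened only once, two chromosomes that both carry the allele must share an ancestor more recently than the mutation, whereas a carrier and a non-carrier must share an ancestor before it. The log-normal uncertainty around each pairwise estimate, the value of `sigma`, the grid search, and all numbers below are assumptions made for illustration.

```python
import math


def lognormal_cdf(t, median, sigma):
    """P(T <= t) for a log-normal variable with the given median and log-scale sd."""
    z = (math.log(t) - math.log(median)) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def composite_log_likelihood(age, concordant, discordant, sigma=0.5):
    """Sum of per-pair log-probabilities that a candidate age is consistent.

    concordant: TMRCA estimates for pairs that both carry the allele
                (their TMRCA should be younger than the allele).
    discordant: TMRCA estimates for carrier / non-carrier pairs
                (their TMRCA should be older than the allele).
    """
    tiny = 1e-300  # guards against log(0)
    total = 0.0
    for tmrca in concordant:
        total += math.log(max(lognormal_cdf(age, tmrca, sigma), tiny))
    for tmrca in discordant:
        total += math.log(max(1.0 - lognormal_cdf(age, tmrca, sigma), tiny))
    return total


def estimate_age(concordant, discordant, grid):
    """Return the candidate age on `grid` with the highest composite likelihood."""
    return max(grid, key=lambda a: composite_log_likelihood(a, concordant, discordant))


# Hypothetical pairwise TMRCA estimates, in generations.
concordant_tmrca = [120.0, 200.0, 310.0]
discordant_tmrca = [900.0, 1500.0, 2500.0]
candidate_ages = [10.0 * k for k in range(1, 401)]  # 10 to 4000 generations

best = estimate_age(concordant_tmrca, discordant_tmrca, candidate_ages)
print(f"composite-likelihood age estimate: {best:.0f} generations")
```

The estimate falls between the oldest concordant and the youngest discordant pair, which is where the two kinds of evidence agree. The likelihood is called composite because the pairs are treated as independent even though they share ancestry.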