SMILE

Stochastic Models for the Inference of Life Evolution

Accuracy of demographic inferences from Site Frequency Spectrum: The case of the Yoruba population

Lapierre, M., Lambert, A., Achaz, G.

2016

Demographic inferences based on the observed genetic diversity of current populations rely on summary statistics such as the Site Frequency Spectrum (SFS). Demographic models can be either model-constrained, with numerous parameters such as growth rates, timing of demographic events and migration rates, or model-flexible, with an unbounded collection of piecewise constant sizes. It is still debated whether demographic histories can be accurately inferred from the SFS. Here we illustrate this theoretical issue with an example of demographic inference for an African population. The SFS of the Yoruba population (data from the 1000 Genomes Project) fits a simple model of population growth described by a single parameter (e.g., foundation time). We infer a time to the most recent common ancestor of 1.7 million years for this population. However, we show that the Yoruba SFS is not informative enough to discriminate between several different models of growth. We also show that for such simple demographies, the fit of one-parameter models outperforms the model-flexible method recently developed by Liu and Fu. Applying this method to simulated data suggests that it tends to overfit the noise intrinsically present in the data.
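
To make the summary statistic concrete, here is a minimal sketch of how an unfolded SFS can be computed from a small haplotype matrix and set against the standard neutral expectation for a constant-size population, E[ξ_i] = θ/i. This is an illustration only, not the authors' pipeline: the function names, the toy data and the value of θ are all assumptions made for the example.

```python
from collections import Counter


def site_frequency_spectrum(haplotypes):
    """Unfolded SFS of n sequences coded 0 (ancestral) / 1 (derived).

    Entry i-1 of the result is the number of sites at which the derived
    allele is carried by exactly i sequences, for i = 1 .. n-1.
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    # Count derived alleles per site, then tally sites by that count.
    derived_counts = Counter(
        sum(seq[site] for seq in haplotypes) for site in range(n_sites)
    )
    return [derived_counts.get(i, 0) for i in range(1, n)]


def neutral_expected_sfs(n, theta):
    """Expected SFS under the standard neutral constant-size model."""
    return [theta / i for i in range(1, n)]


# Hypothetical toy data: 4 sequences (rows) by 5 sites (columns).
toy_haplotypes = [
    [1, 0, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
]

observed = site_frequency_spectrum(toy_haplotypes)
expected = neutral_expected_sfs(len(toy_haplotypes), theta=2.0)  # theta is arbitrary here

print("observed SFS:", observed)
print("neutral expectation:", expected)
```

Deviations of an observed spectrum from the θ/i shape, such as an excess of low-frequency variants, are the kind of signal that demographic models of growth try to explain.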

Bibtex

@article{Lapierre078618,
author = {Lapierre, Marguerite and Lambert, Amaury and Achaz, Guillaume},
title = {Accuracy of demographic inferences from Site Frequency Spectrum: The case of the Yoruba population},
year = {2016},
doi = {10.1101/078618},
publisher = {Cold Spring Harbor Labs Journals},
abstract = {Demographic inferences based on the observed genetic diversity of current populations rely on the use of summary statistics such as the Site Frequency Spectrum (SFS). Demographic models can be either model-constrained with numerous parameters such as growth rates, timing of demographic events and migration rates, or model-flexible, with an unbounded collection of piecewise constant sizes. It is still debated whether demographic histories can be accurately inferred based on the SFS. Here we illustrate this theoretical issue on an example of demographic inference for an African population. The SFS of the Yoruba population (data from the 1000 Genomes Project) fits to a simple model of population growth described with a single parameter (e.g., foundation time). We infer a time to the most recent common ancestor of 1.7 million years for this population. However, we show that the Yoruba SFS is not informative enough to discriminate between several different models of growth. We also show that for such simple demographies, the fit of one-parameter models outperforms the model-flexible method recently developed by Liu and Fu. The use of this method on simulated data suggests that it tends to overfit the noise intrinsically present in the data.},
URL = {http://biorxiv.org/content/early/2016/09/30/078618},
eprint = {http://biorxiv.org/content/early/2016/09/30/078618.full.pdf},
journal = {bioRxiv}
}

Link to the article

Access the article via its DOI.