SMILE is an interdisciplinary research group gathering mathematicians, bio-informaticians and biologists.
SMILE is affiliated with the Institut de Biologie de l'ENS, in Paris.
SMILE is hosted within the CIRB (Center for Interdisciplinary Research in Biology) at Collège de France.
SMILE is supported by Collège de France and CNRS.
SMILE is hosted at Collège de France in the Latin Quarter of Paris. To reach us, go to 11 place Marcelin Berthelot (stations Luxembourg or Saint-Michel on RER B).
Our working spaces are rooms 107, 121 and 122 on the first floor of building B1 (ask us for the code). Building B1 faces you as you exit the traversing hall behind Champollion's statue.


You can reach us by email (amaury.lambert - at - ; (guillaume.achaz - at - or (smile - at -

MOLD, a novel software to compile accurate and reliable DNA diagnoses for taxonomic descriptions

DNA data are increasingly being used for phylogenetic inference, and for taxon delimitation and identification, but scarcely for the formal description of taxa, despite their indisputable merits in taxonomy. The uncertainty regarding the robustness of DNA diagnoses, however, remains a major impediment to their use. We have developed a new program, mold, that identifies diagnostic nucleotide combinations (DNCs) in DNA sequence alignments for selected taxa, which can be used to provide formal diagnoses of these taxa. To test the robustness of DNA diagnoses, we carry out iterated haplotype subsampling for selected query species in published DNA data sets of varying complexity. We quantify the reliability of diagnosis by diagnosing each query subsample and then checking whether this diagnosis remains valid against the entire data set. We demonstrate that widely used types of diagnostic DNA characters are often absent for a query taxon or are not sufficiently reliable. We thus propose a new type of DNA diagnosis, termed "redundant DNC" (or rDNC), which takes into account unsampled genetic diversity and constitutes a much more reliable descriptor of a taxon. mold successfully retrieves rDNCs for all but two species in the analysed data sets, even in those comprising hundreds of species. mold shows unparalleled efficiency in large DNA data sets and is the only available software capable of compiling DNA diagnoses that suit predefined criteria of reliability.
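To illustrate what "diagnostic nucleotide combination" means in the abstract above, here is a minimal Python sketch. It is not mold's implementation: the function names `carries` and `is_diagnostic`, the toy alignment and the taxon labels are all hypothetical. It assumes a combination is diagnostic for a query taxon when every query sequence carries it and no sequence of any other taxon does.

```python
def carries(seq, combo):
    """Return True if the sequence has the given nucleotide at every site in combo."""
    return all(seq[site] == nuc for site, nuc in combo.items())


def is_diagnostic(alignment, query, combo):
    """Check whether combo (site -> nucleotide) separates the query taxon from all others.

    alignment: dict mapping taxon name -> list of aligned sequences (equal length).
    Assumption: diagnostic means shared by all query sequences and by no other sequence.
    """
    in_all_query = all(carries(s, combo) for s in alignment[query])
    in_any_other = any(
        carries(s, combo)
        for taxon, seqs in alignment.items()
        if taxon != query
        for s in seqs
    )
    return in_all_query and not in_any_other


# Toy alignment (hypothetical data, 0-based site indices).
alignment = {
    "sp_A": ["ACGT", "ACGA"],
    "sp_B": ["ATGT", "ACTT"],
}

print(is_diagnostic(alignment, "sp_A", {1: "C"}))          # single site: not diagnostic
print(is_diagnostic(alignment, "sp_A", {1: "C", 2: "G"}))  # combination: diagnostic
```

In this toy case neither site is diagnostic on its own, but the two sites taken together are, which is the situation DNCs are meant to capture.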



Cultural transmission of reproductive success impacts genomic diversity, coalescent tree topologies and demographic inferences

Cultural Transmission of Reproductive Success (CTRS) has been observed in many human populations as well as in other animals. It consists of a positive correlation of non-genetic origin between the progeny size of parents and children. This correlation can result from various factors, such as the social influence of parents on their children, the increase of children's survival through allocare from uncles and aunts, or the transmission of resources. Here, we study the evolution of genomic diversity through time under CTRS. We show that CTRS has a double impact on population genetics: (1) effective population size decreases when CTRS starts, mimicking a population contraction, and increases back to its original value when CTRS stops; (2) coalescent tree topologies are distorted under CTRS, with greater imbalance and a higher number of polytomies. Under long-lasting CTRS, effective population size stabilises but the distortion of tree topology remains, which yields U-shaped Site Frequency Spectra (SFS) under constant population size. We show that this impact of CTRS biases SFS-based demographic inference. Considering that CTRS has been detected in numerous human and animal populations worldwide, one should be aware that inferences of past population histories from genomic data can be biased by this cultural process.
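For readers unfamiliar with the Site Frequency Spectrum mentioned above, the following Python sketch shows how an unfolded SFS can be tabulated from a sample. It is only an illustration and is not taken from the paper: the function name `site_frequency_spectrum` and the toy data are hypothetical, and it assumes haplotypes coded as 0 (ancestral) and 1 (derived) at each site.

```python
from collections import Counter


def site_frequency_spectrum(haplotypes):
    """Return the unfolded SFS as a list [xi_1, ..., xi_{n-1}].

    haplotypes: list of n equal-length lists of 0/1 alleles (1 = derived).
    xi_k is the number of sites where exactly k of the n haplotypes carry
    the derived allele; sites with 0 or n derived copies are ignored.
    """
    n = len(haplotypes)
    derived_counts = Counter(sum(column) for column in zip(*haplotypes))
    return [derived_counts.get(k, 0) for k in range(1, n)]


# Toy sample of 4 haplotypes at 4 sites (hypothetical data).
haplotypes = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]

print(site_frequency_spectrum(haplotypes))
```

A spectrum is called U-shaped when both low-frequency and high-frequency derived classes are over-represented relative to intermediate ones.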

