Rosenberg lab at Stanford University

We are a mathematical, theoretical, and computational lab in genetics and evolution. Research in the lab addresses problems in evolutionary biology and human genetics through a combination of mathematical modeling, computer simulations, development of statistical methods, and inference from population-genetic data. Read more...


  • 11-14-2016 — Postdoc Lawrence Uricchio has reported an upper bound on the size of gene tree sets required before all splits of a species tree appear in a gene tree set with a specified probability. His upper bound depends on a single parameter — the shortest internal branch in the species tree. The computation extends the lab's work on methods for species tree inference from gene trees.

  • 10-14-2016 — Recent PhD graduate Doc Edge has devised a general mathematical model to understand how genotypic differences between populations contribute to phenotypic differences between populations. He uses the model to analyze the relationship of genetics to "health disparities," concluding that health disparities that all trend in the same direction are incompatible with neutral genetic explanations. The work extends a simpler model of Doc's [129], allowing for diploidy, genetic drift, and general distributions of allele frequencies.

  • 10-7-2016 — Postdoc Filippo Disanto continues the lab's work on coalescent histories with a study of the number of coalescent histories for matching gene trees in caterpillar-like families of species trees. Filippo's work solves an open problem from earlier work in the lab [111], showing that the number of coalescent histories is asymptotic to a constant multiple of the Catalan numbers. He uses clever iterative enumerations and techniques of analytic combinatorics to obtain the result. See also [41], [68], and [135] for related work.

  • 7-27-2016 — We are pleased to announce that the software MONOPHYLER is now available. MONOPHYLER computes probabilities that sets of lineages are monophyletic, both for general species trees and for trees of small size. MONOPHYLER is reported by PhD student Rohan Mehta. The software encodes formulas from Rohan's recent Proceedings of the National Academy of Sciences paper.

  • 7-22-2016 — We congratulate PhD student Doc Edge on his thesis defense, "Pick up the pieces: combining information from multiple genetic loci." Doc's thesis examines several problems in the mathematical modeling of the genotype-to-phenotype relationship in structured populations, mathematical properties of the Fst measure of genetic differentiation, and population-genetic aspects of forensic DNA testing and genetic association studies. Doc has been recognized with the Samuel Karlin Prize in Mathematical Biology, awarded by the Department of Biology. Congratulations Doc!

  • 7-19-2016 — PhD student Rohan Mehta reports a computation of the probability that a set of gene lineages is monophyletic on an arbitrary species tree. The work generalizes earlier studies from the lab that considered trees of only two or three species. Rohan illustrates the new formula with an application in maize. The study is a contribution to the Comparative Phylogeography volume of the "In the Light of Evolution" special issue series of Proceedings of the National Academy of Sciences USA.

  • 6-27-2016 — We congratulate biology MS student Brian Donovan on the completion of his PhD in science education, "An experimental exploration of how text-based instruction in school biology affects belief in genetic essentialism of race in adolescent populations." Brian defended his PhD in the Graduate School of Education on May 26. He is continuing his studies as a postdoctoral fellow at the Biological Sciences Curriculum Study in Colorado Springs.

  • 6-17-2016 — The lab reports a study examining the predicted distribution of gene tree shape under a birth-death model of species divergence. The work suggests that gene trees are expected to be more imbalanced than species trees, potentially providing part of the explanation for an excess of imbalance observed in inferred phylogenies.

  • 6-15-2016 — Congratulations to Amy Goldberg and Jaehee Kim, who have received fellowships for 2016-2017 from the Stanford Center for Computational, Evolutionary, and Human Genomics!

  • 5-12-2016 — Lab alumnus Mike DeGiorgio reports on the consistency properties of species tree inference methods in a model with ancestral population structure. By introducing a model that includes population subdivision in ancestral species, his paper introduces a new direction for studying consistency in species tree inference. The work is related to several recent papers from the lab on consistency of species tree methods ([85], [88], [89], [97], [109]).

  • 4-22-2016 — Several projects from the lab have been in the news:

  • 4-5-2016 — We congratulate PhD student Amy Goldberg on the publication of her Nature article entitled "Post-invasion demography of prehistoric humans in South America." In this work, Amy and her colleagues use the locations and dates of South American archaeological sites to estimate the time trajectory of the human population size history of the continent. Read the news story here.

  • 4-4-2016 — Lab members Bridget Algee-Hewitt, Doc Edge, and Jaehee Kim report that forensic genetic markers selected for their use in individual identification possess a surprising level of information about genetic ancestry. Moreover, their study finds that, across genetic markers, information about individual identity is generally correlated with information about ancestry. The result makes use of theory from the lab on the connection between measures of genetic diversity and genetic differentiation ([102], [121]).

  • 1-5-2016 — The lab helps celebrate the centennial of the journal Genetics!

    When PhD student Amy Goldberg develops a model for sex-biased admixture on the X-chromosome, a curious mathematical sequence leads to an unexpected connection deep in the Genetics archive.

    Read about the oscillatory functions and coupled recursions encountered in this scholarly adventure — with a surprise appearance of the Fibonacci numbers.

  • 10-7-2015 — PhD student Jonathan Kang has analyzed a new approach for prioritizing individuals for whole-genome sequencing. This approach, based on minimizing a quantity called the average distance to the closest leaf, seeks to identify a set of samples that will provide optimal templates for imputing genotypes in additional individuals. He compares the method to an earlier algorithm, also from the lab: maximizing phylogenetic diversity ([108]). Jonathan's article has been selected for Genetics issue highlights.

  • 9-30-2015 — Postdoc Filippo Disanto reports a study of the number of coalescent histories for gene trees and species trees in the lodgepole family. He uses connections with other combinatorial structures from theoretical computer science to derive exact results in the context of a new problem arising from biology. The term "lodgepole" for the tree shape he considers is based on a resemblance to the pattern in which lodgepole pine needles branch off the main twig. The work follows earlier studies from the lab on coalescent histories ([41], [68], [111]).

  • 9-24-2015 — We report an article on detecting selective sweeps using a new statistic, the haplotype allele frequency (HAF) score. This statistic tabulates the frequencies of alleles on a haplotype, and it has distinctive patterns of change during a selective sweep. The approach is related to previous articles from the lab that examined haplotype properties for detecting a deviation from null population-genetic models ([23], [127]).
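
    As a rough illustration of what "tabulating the frequencies of alleles on a haplotype" can look like, here is a minimal Python sketch. It assumes one common form of the score: for each haplotype, sum the sample counts of the derived alleles that the haplotype carries. The function name haf_scores, the 0/1 encoding, and the toy sample are hypothetical choices made for this example and are not taken from the article.

```python
from typing import List


def haf_scores(haplotypes: List[List[int]]) -> List[int]:
    """Return one score per haplotype.

    Assumed encoding (illustrative only): rows are haplotypes, columns are
    sites, 1 = derived allele, 0 = ancestral allele.
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    # Number of sampled haplotypes carrying the derived allele at each site.
    derived_counts = [sum(h[j] for h in haplotypes) for j in range(n_sites)]
    # Assumed score: sum of those counts over the sites where this
    # haplotype itself carries the derived allele.
    return [
        sum(count for allele, count in zip(h, derived_counts) if allele == 1)
        for h in haplotypes
    ]


# Toy sample of four haplotypes at three sites (made-up data).
sample = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
print(haf_scores(sample))
```

    Under this assumed definition, haplotypes that carry many common derived alleles receive high scores, which is the kind of pattern that could shift as a sweep progresses.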

  • 9-9-2015 — The lab reports two articles in this month's issue of Genetics.
    • PhD student Amy Goldberg reports a study of the effect of sex-biased admixture on the X chromosome. The study has a number of surprises: (1) The admixture level on the X chromosome is not simply a 2/3-and-1/3 linear combination of female and male parameters. (2) A difference in X chromosomal and autosomal levels of admixture need not imply male bias entering the admixed population from one source and female bias from a second source: the bias can be in the same direction in both source populations, but with different magnitudes. (3) A third surprise involves the appearance of a sequence related to the Fibonacci numbers! The paper follows two previous articles from the lab on mechanistic models of admixture ([82], [122]).

    • We report a review of three cases in which differences in levels of genetic diversity across populations contribute to population differences in societal variables — related to forensic testing, transplantation matching, and genome-wide association. The study also considers a fourth scenario, performing a reanalysis that contests a claim that within-population genetic diversity has influenced global economic development. PhD student Jonathan Kang contributed to the project. [Genes to Genomes blog post from the Genetics Society of America]

  • 8-11-2015 — Our program note reporting the software CLUMPAK is now available. What does the program do? It clumps and packages results from Structure and related programs. What does it produce? A pack of CLUMPPed Distruct plots — a clumpak! CLUMPAK is reported by PhD graduate Naama Kopelman. Former postdoc Mattias Jakobsson contributed to the project.

  • 7-24-2015 — PhD student Doc Edge uses a mathematical model to interpret the implications of two computations in population genetics (the partition of genetic variance and the genetic assignment of individual ancestry) for human phenotypic differentiation. He concludes that a typical selectively neutral quantitative phenotype is comparable to a single genetic locus in terms of its ancestry information. The study is part of a special issue of Studies in History and Philosophy of Biological and Biomedical Sciences on Genomics and Philosophy of Race.

  • 5-22-2015 — We congratulate three lab members who have recently been awarded competitive fellowships!
    • Nicolas Alcala — Swiss National Science Foundation Early Postdoc.Mobility Fellowship (2015-2017).
    • Amy Goldberg — Achievement Rewards for College Scientists Fellowship from the Northern California Chapter of the ARCS Foundation (2015-2016).
    • Lawrence Uricchio — Stanford Center for Computational, Evolutionary, and Human Genomics Postdoctoral Fellowship (2015-2016).

  • 5-15-2015 — PhD student Nandita Garud from Dmitri Petrov's lab next door reports on the mathematical properties of statistics used in detecting soft selective sweeps. Nandita provides an improvement to the use of proposed statistics H12 and H2/H1, applying the modified statistics to analyze selection in Drosophila. The work relies on the lab's mathematical analysis of homozygosity and the frequency of the most frequent allele ([52], [87]).
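
    For readers unfamiliar with these statistics, the following Python sketch computes haplotype homozygosity summaries from a list of sampled haplotypes. It assumes the commonly cited forms: H1 is the sum of squared haplotype frequencies, H12 pools the two most frequent haplotypes before squaring, and H2 is H1 without the contribution of the most frequent haplotype. It does not implement the modified statistics mentioned above, whose details are not given here. The function name homozygosity_stats and the example sample are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def homozygosity_stats(haplotypes: Sequence[Hashable]) -> Dict[str, float]:
    """Compute H1, H12, and H2/H1 under the assumed definitions above."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    counts = Counter(haplotypes)
    # Haplotype frequencies, sorted from most to least frequent.
    freqs = sorted((c / n for c in counts.values()), reverse=True)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0
    h1 = sum(p * p for p in freqs)
    # Pool the two most frequent haplotypes into a single class.
    h12 = (p1 + p2) ** 2 + sum(p * p for p in freqs[2:])
    # Homozygosity with the most frequent haplotype left out.
    h2 = h1 - p1 * p1
    return {"H1": h1, "H12": h12, "H2/H1": h2 / h1}


# Made-up sample: three distinct haplotypes with frequencies 0.5, 0.3, 0.2.
stats = homozygosity_stats(["A"] * 5 + ["B"] * 3 + ["C"] * 2)
print(stats)
```

    Both H1 and the frequency of the most frequent haplotype appear directly in these formulas, which is where mathematical results relating homozygosity to the most frequent allele become relevant.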

  • 5-14-2015 — We are pleased to congratulate the lab's Administrative Associate Elena Yujuico on receiving the Humanities & Sciences Dean's Award of Merit! This award recognizes staff members who make outstanding contributions in the School of Humanities & Sciences.

  • 2-19-2015 — A news story in the Stanford Medicine SCOPE blog features a conversation with Noah about the journal Theoretical Population Biology, for which he serves as Editor-in-Chief.

  • 2-3-2015 — A new study by Nicole Creanza et al. performs the largest joint analysis of genetic variation and phonemic variation in populations worldwide. The study uncovers a number of new coevolutionary patterns in genes and languages, including correspondences in spatial axes of genetic and linguistic diversity, and a difference for genes and languages in the effects of population isolation. Former postdoc Trevor Pemberton contributed to the project. [Ars Technica] [The Atlantic] [Quartz] [Stanford Report] [Nature Reviews Genetics] [PNAS commentary by Keith Hunley]

  • 12-10-2014 — Filippo Disanto reports a probabilistic result about anomalous ranked gene trees (ARGTs), demonstrating that as the number of species increases, the fraction of ranked species trees that produces ARGTs approaches 1. The work extends earlier existence results on ARGTs ([85], [97]).

  • 12-3-2014 — Approximate Bayesian Computation (ABC) provides a way of performing statistical inference from complex models that can be simulated but for which likelihoods are difficult to evaluate. Former postdoc Erkan Buzbas reports a new advance in ABC techniques for scenarios in which even simulating from the model is challenging — Approximate Approximate Bayesian Computation. Read about it here!

  • 11-7-2014 — PhD student Amy Goldberg reports a surprising result, that properties of admixture obtained from autosomal loci alone can be informative about sex bias in the history of admixture. The result is obtained in a new article in the November 2014 issue of Genetics. It builds on an earlier model studied by former postdoc Paul Verdu, who is also a contributor to the project.

  • 10-1-2014 — Nicolas Alcala joins us as a new postdoc. Nicolas completed his PhD in ecology and evolution at the University of Lausanne, performing several studies in the population-genetic modeling of demography and population structure. We are pleased to welcome Nicolas to the group!

  • 9-12-2014 — PhD student Doc Edge reports a new paper on the mathematical properties of the population-genetic statistic FST. Doc has refined the bounds on FST as functions of homozygosity and the frequency of the most frequent allele that were obtained in an earlier study from the lab, considering a finitely-many-alleles case instead of the less constrained infinitely-many-alleles case. The work extends the lab's line of work on mathematical properties of population-genetic statistics.

  • 8-28-2014 — We welcome Ilana Arbisser as a PhD student in the lab. Ilana completed her BA at the University of Pennsylvania, where she majored in biology with a concentration in biological mathematics. Ilana rotated through the lab during the spring quarter, working on problems in coalescent theory. Welcome Ilana!

  • 8-15-2014 — A new study in PLoS Genetics led by former postdoc Paul Verdu reports on admixture in Native American and First Nation populations of the Pacific Northwest. The study describes recent European admixture in coastal and inland populations from British Columbia and Alaska, also uncovering evidence of recent East Asian admixture in the inland groups. It is the first genomic investigation focused on the Pacific Northwest region. Former postdoc Trevor Pemberton was a contributor to the project.

  • 8-8-2014 — We congratulate PhD student Ethan Jewett on the defense of his thesis, "Models, tools, and approaches for studying genetic and cultural variation." Ethan's thesis examines a series of problems on coalescent lineage distributions, with applications to the study of population growth and migration, inference of species trees, and genotype imputation. He also conducts analyses of variation in word usage, both in the United States and in Cape Verde, posing questions about cultural evolution. Ethan's work has been recognized with the Department of Biology's Samuel Karlin Prize in Mathematical Biology. Congratulations Ethan!

  • 7-22-2014 — Former postdoc Trevor Pemberton reports a study of population-genetic factors that affect worldwide variation in the inbreeding coefficient, showing that the value of this popular population-genetic statistic increases with increased consanguinity — but also with measures that reflect decreasing genetic diversity and increasing genetic isolation. The study is part of a special issue of Human Heredity on Consanguinity and Genomics.

  • 6-25-2014 — We congratulate co-mentored graduate student Dr. Naama Kopelman on the completion of her PhD! Naama's thesis, conducted at Tel Aviv University on "The complex genealogy of Jewish populations," examines the genetic relationships of Jewish populations using both microsatellite loci [63] and genome-wide single nucleotide polymorphisms [114]. She also performs a theoretical investigation of the effect of admixture on tree-reconstruction algorithms, inspired by the placement of Jewish populations in a neighbor-joining tree [99]. Naama has begun a postdoc with Itay Mayrose, Department of Molecular Biology and Ecology of Plants, Tel Aviv University.

  • 6-22-2014 — A new special issue of Human Biology focuses on the genetics of Jewish populations. The lab contributes to two research studies in the special issue:
    • In a study of Y-chromosomal lineages in the Samaritans, Oefner et al. find that most Samaritans have a distinctive Y chromosome similar to that of Jewish Cohen lineages. Curiously, among the Samaritans, the only exception distant from the Cohen model haplotype is that of the Samaritan Cohen lineage.
    • An international team including graduate student Naama Kopelman studies genetic relationships with the Ashkenazi Jewish population in a large genome-wide data set, finding considerable shared ancestry with other Jewish populations and tracing more distant relationships to other populations of Europe and the Middle East.
    • Read the introduction to the special issue, by Noah Rosenberg and Steven Weitzman.

  • 6-5-2014 — A new paper by former postdoc Cuong Than determines the mean of the deep coalescence cost, measuring the fit of a gene tree to a species tree, under probability distributions for the shapes of gene trees and species trees. This paper extends Cuong's previous analysis focusing on the maximum deep coalescence cost rather than the mean [103]. The work advances knowledge of an important concept in estimation of species trees.

  • Past news items


    BFB Algee-Hewitt*, MD Edge*, J Kim, JZ Li, NA Rosenberg (2016) Individual identifiability predicts population identifiability in forensic microsatellite markers. Current Biology 26: 935-942. [Abstract] [PDF] [Supplement]

    F Disanto, NA Rosenberg (2015) Coalescent histories for lodgepole species trees. Journal of Computational Biology 22: 918-929. [Abstract] [PDF]

    NA Rosenberg, JTL Kang (2015) Genetic diversity and societally important disparities. Genetics 201: 1-12. [Abstract] [PDF] [Supplement]

    A Goldberg, P Verdu, NA Rosenberg (2014) Autosomal admixture levels are informative about sex bias in admixed populations. Genetics 198: 1209-1229. [Abstract] [PDF]

    M DeGiorgio, J Syring, AJ Eckert, AI Liston, R Cronn, DB Neale, NA Rosenberg (2014) An empirical evaluation of two-stage species tree inference strategies using a multilocus dataset from North American pines. BMC Evolutionary Biology 14: 67. [Abstract] [PDF] [Supplementary File 1 (.xlsx, accession numbers)] [Supplementary File 2 (.pdf, supplementary analyses)] [Supplementary File 3 (.zip, data)]

    M Jakobsson, MD Edge, NA Rosenberg (2013) The relationship between FST and the frequency of the most frequent allele. Genetics 193: 515-528. [Abstract] [PDF]

    JH Degnan, NA Rosenberg, T Stadler (2012) A characterization of the set of species trees that produce anomalous ranked gene trees. IEEE/ACM Transactions on Computational Biology and Bioinformatics 9: 1558-1568. [Abstract] [PDF]

    TJ Pemberton, D Absher, MW Feldman, RM Myers, NA Rosenberg, JZ Li (2012) Genomic patterns of homozygosity in worldwide human populations. American Journal of Human Genetics 91: 275-292. [Abstract] [PDF] [Main Supplement] [Supplementary Table 2 (.zip)] [Supplementary Table 3 (.zip)] [Supplementary Table 4 (.zip)] [Supplementary Table 5 (.zip)]

    S Ramachandran, NA Rosenberg (2011) A test of the influence of continental axes of orientation on patterns of human gene flow. American Journal of Physical Anthropology 146: 515-529. [Abstract] [PDF] [Supplementary Figure 1] [Supplementary Figure 2] [Supplementary Tables]

    ZA Szpiech, NA Rosenberg (2011) On the size distribution of private microsatellite alleles. Theoretical Population Biology 80: 100-113. [Abstract] [PDF]