We are a mathematical, theoretical, and computational lab in genetics
and evolution. Research in the lab addresses problems in evolutionary
biology and human genetics through a combination of mathematical
modeling, computer simulations, development of statistical methods, and
inference from population-genetic data.
Uricchio has reported an upper
bound on the size of gene tree sets required before all splits of a
species tree appear in a gene tree set with a specified probability. His
upper bound depends on a single parameter, the shortest internal
branch in the species tree. The computation extends the lab's work on
methods for species tree inference from gene trees.
10-14-2016 Recent PhD
graduate Doc Edge has
devised a general mathematical model to understand how genotypic
differences between populations contribute to phenotypic differences
between populations. He uses the model to analyze the relationship of
genetics to "health disparities," concluding that health disparities
that all trend in the same direction are incompatible with neutral
genetic explanations. The work extends a simpler model of Doc's
, allowing for
diploidy, genetic drift, and general distributions of allele
10-7-2016 Postdoc Filippo Disanto continues the
lab's work on coalescent histories with
a study of the number
of coalescent histories for matching gene trees in caterpillar-like
families of species trees. Filippo's work solves an open problem from
earlier work in the lab
, showing that the
number of coalescent histories is asymptotic to a constant multiple of
the Catalan numbers. He uses clever iterative enumerations and
techniques of analytic combinatorics to obtain the result. See also
 for related
7-27-2016 We are pleased to announce that the
software MONOPHYLER is now available.
MONOPHYLER computes probabilities
that sets of lineages are monophyletic, both for general species trees and
for trees of small size. MONOPHYLER
is reported by PhD student Rohan Mehta. The software encodes
formulas from Rohan's
recent Proceedings of the National
Academy of Sciences paper.
7-22-2016 We congratulate PhD
student Doc Edge on his thesis
defense, "Pick up the pieces: combining information from multiple
genetic loci." Doc's thesis examines several problems in the
mathematical modeling of the genotype-to-phenotype relationship in
structured populations, mathematical properties of
the Fst measure of genetic differentiation, and
population-genetic aspects of forensic DNA testing and genetic association
studies. Doc has been recognized with the
Samuel Karlin Prize in Mathematical Biology, awarded by the Department of
Biology. Congratulations Doc!
7-19-2016 PhD student Rohan Mehta reports a
computation of the probability that a set of gene lineages is
monophyletic on an arbitrary species tree. The work generalizes earlier studies from the lab
that considered trees of
or three species. Rohan
illustrates the new formula with an application in maize.
The study is a contribution to
the Comparative Phylogeography volume of the "In the Light of Evolution"
special issue series of Proceedings of the National Academy of
6-27-2016 We congratulate biology MS student Brian
Donovan on the completion of his PhD in science education, "An
experimental exploration of how text-based instruction in school biology
affects belief in genetic essentialism of race in adolescent
populations." Brian defended his PhD in the Graduate School of
Education on May 26. He is continuing his studies as a postdoctoral fellow
at the Biological Sciences Curriculum Study in Colorado Springs.
6-17-2016 The lab reports
a study examining
the predicted distribution of gene tree shape under a birth-death
model of species divergence. The work suggests that gene trees are
expected to be more imbalanced than species trees, potentially
providing part of the explanation for an excess of imbalance
observed in inferred phylogenies.
Goldberg and Jaehee Kim, who have received fellowships for
2016-2017 from the Stanford Center for Computational, Evolutionary, and
DeGiorgio reports on the consistency properties of species
tree inference methods in a model with ancestral population
structure. By introducing a model that includes population
subdivision in ancestral species,
introduces a new direction for studying consistency in species
tree inference. The work is related to several recent papers
from the lab on consistency of species tree methods
4-22-2016 Several projects from the lab have been in
4-5-2016 We congratulate PhD
student Amy Goldberg
on the publication of
her Nature article
entitled "Post-invasion demography of prehistoric humans in South
America." In this work, Amy and her colleagues use the locations and
dates of South American archaeological sites to estimate the time
trajectory of the human population size history of the
continent. Read the news
Edge, and Jaehee Kim report that forensic genetic
markers selected for their use in individual identification possess
a surprising level of information about genetic ancestry. Moreover, their
study finds that a general correlation holds for genetic
markers between their information about individual identity and
ancestry information. The result makes use of theory from the lab on the
connection between measures of genetic diversity and genetic
1-5-2016 The lab helps celebrate the centennial of the
student Amy Goldberg
develops a model for sex-biased admixture on the X-chromosome, a
curious mathematical sequence leads to an unexpected connection deep
in the Genetics archive.
Read about the oscillatory functions and
coupled recursions encountered in this scholarly adventure with a
surprise appearance of the Fibonacci numbers.
10-7-2015 PhD student Jonathan Kang has
analyzed a new approach for prioritizing individuals for whole-genome
sequencing. This approach, based on minimizing a quantity,
the average distance to the closest leaf, seeks to identify a
set of samples that will provide optimal templates for imputing
genotypes in additional individuals. He compares the method to an
earlier algorithm, also from the lab: maximizing phylogenetic
Jonathan's article has been selected
for Genetics issue
9-30-2015 Postdoc Filippo Disanto reports a
study of the
number of coalescent histories for gene trees and species trees in
the lodgepole family. He uses connections with other
combinatorial structures from theoretical computer science to derive
exact results in the context of a new problem arising from
biology. The term "lodgepole" for the tree shape he considers is based
on a resemblance to the pattern in which lodgepole pine needles branch
off the main twig. The work follows earlier studies from the lab on
coalescent histories (,
9-24-2015 We report
an article on detecting
selective sweeps using a new statistic, the haplotype allele frequency
(HAF) score. This statistic tabulates the frequencies of alleles on a
haplotype, and it has distinctive patterns of change during a selective
sweep. The approach is related to previous articles from the lab that
examined haplotype properties for detecting a deviation from null
population-genetic models (
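To make the tabulation concrete, here is a minimal sketch. It assumes that the score of a haplotype is the sum, over the derived alleles it carries, of the number of sampled haplotypes carrying each of those alleles; the article's exact definition may differ. The function name `haf_scores` and the toy sample are hypothetical, chosen only for illustration.

```python
def haf_scores(haplotypes):
    """Return one score per haplotype.

    haplotypes: list of equal-length lists of 0/1, where 1 marks a
    derived allele. The scoring rule is an assumption (see lead-in).
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    # Sample count of the derived allele at each site.
    derived_counts = [sum(h[j] for h in haplotypes) for j in range(n_sites)]
    # Sum those counts over the derived alleles each haplotype carries.
    return [
        sum(derived_counts[j] for j in range(n_sites) if h[j] == 1)
        for h in haplotypes
    ]


# Hypothetical toy sample: four haplotypes, four sites.
sample = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
]
scores = haf_scores(sample)  # -> [5, 5, 4, 1]
```

Haplotypes that carry common derived alleles receive high scores, which is the kind of pattern one would expect to shift as a sweeping haplotype rises in frequency.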
The lab reports two articles in this month's issue of Genetics.
note reporting the
software CLUMPAK is now
available. What does the program do? It clumps and packages results from
Structure and related programs. What does it produce? A pack of
Distruct plots: a
clumpak! CLUMPAK is reported by PhD
graduate Naama Kopelman. Former
Jakobsson contributed to the
student Doc Edge uses a
mathematical model to interpret the implications of two computations
in population genetics---the partition of genetic variance, and the
genetic assignment of individual ancestry---for human phenotypic
differentiation. He concludes that a typical selectively neutral
quantitative phenotype is comparable to a single genetic locus in
terms of its ancestry information.
The study is part of a
special issue of Studies in History and Philosophy of Biological
and Biomedical Sciences on Genomics and Philosophy of Race.
a study of the
effect of sex-biased admixture on the X chromosome. The study has a
number of surprises: (1) The admixture level on the X chromosome is not
simply a 2/3-and-1/3 linear combination of female and male
parameters. (2) A difference in X chromosomal and autosomal levels of
admixture need not imply male bias entering the admixed population from
one source and female bias from a second source: the bias can be in the
same direction in both source populations, but with different
magnitudes. (3) A third surprise involves the appearance of a sequence
related to the Fibonacci numbers! The paper follows two previous
articles from the lab on mechanistic models of admixture
- We report a review
of three cases in which differences in levels of genetic diversity
across populations contribute to population differences in societal
variables related to forensic testing, transplantation matching,
and genome-wide association. The study also considers a fourth scenario,
performing a reanalysis that contests a claim that within-population
genetic diversity has influenced global economic development. PhD
student Jonathan Kang contributed to the project.
[Genes to Genomes blog post from the Genetics Society of America]
5-22-2015 We congratulate three lab members who have
recently been awarded competitive fellowships!
5-15-2015 PhD student Nandita Garud
from Dmitri Petrov's lab next
door reports on the
mathematical properties of statistics used in detecting soft selective
sweeps. Nandita provides an improvement to the use of proposed
and H2/H1, applying the
modified statistics to analyze selection in Drosophila. The work relies on
the lab's mathematical analysis of homozygosity and the frequency of
the most frequent allele
We are pleased to congratulate the lab's Administrative Associate Elena
Yujuico on receiving the Humanities & Sciences Dean's Award
of Merit! This award recognizes staff members who make outstanding
contributions in the School of Humanities & Sciences.
- Nicolas Alcala: Swiss National Science Foundation Early
Postdoc.Mobility Fellowship (2015-2017).
- Amy Goldberg: Achievement Rewards for College Scientists Fellowship from
the Northern California Chapter of the ARCS Foundation (2015-2016).
- Lawrence Uricchio: Stanford Center for Computational,
Evolutionary, and Human Genomics Postdoctoral Fellowship (2015-2016).
A story in the Stanford Medicine SCOPE blog discusses with
Noah the journal Theoretical Population Biology, for which he
serves as the Editor-in-Chief.
A new study by Nicole
Creanza et al. performs the largest joint analysis of genetic
variation and phonemic variation in populations worldwide. The study
uncovers a number of new coevolutionary patterns in genes and languages,
including correspondences in spatial axes of genetic and linguistic diversity,
and a difference for genes and languages in the effects of population
Pemberton contributed to the project.
[Commentary by Keith Hunley]
12-10-2014 Filippo Disanto reports a
about anomalous ranked gene trees (ARGTs), demonstrating that as the
number of species increases, the fraction of ranked species trees that
produces ARGTs approaches 1. The work extends earlier existence
results on ARGTs
12-3-2014 Approximate Bayesian Computation (ABC)
provides a way of performing statistical inference from complex
models that can be simulated but for which likelihoods are difficult
to evaluate. Former postdoc Erkan
Buzbas reports a new advance in ABC techniques for scenarios
in which even simulating from the model is challenging:
Approximate Approximate Bayesian Computation. Read
about it here!
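For readers new to ABC, the basic idea can be sketched with a plain rejection sampler. This is the standard textbook scheme, not the Approximate Approximate Bayesian Computation method reported in the paper. The toy model (inferring an allele frequency from a count in a sample of 100), the uniform prior, the tolerance, and all function names are assumptions made for illustration.

```python
import random


def abc_rejection(observed, simulate, prior_sample, distance,
                  tolerance, n_draws, rng):
    """Keep the prior draws whose simulated data land close to the observed data."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)          # draw a parameter from the prior
        simulated = simulate(theta, rng)   # simulate data; no likelihood needed
        if distance(simulated, observed) <= tolerance:
            accepted.append(theta)
    return accepted


# Hypothetical toy problem: allele frequency p, with the data summarized
# as the number of copies of the allele among 100 sampled chromosomes.
SAMPLE_SIZE = 100
observed_count = 30


def prior_sample(rng):
    return rng.random()  # uniform prior on [0, 1)


def simulate(p, rng):
    return sum(1 for _ in range(SAMPLE_SIZE) if rng.random() < p)


def distance(a, b):
    return abs(a - b)


rng = random.Random(0)  # fixed seed so the run is repeatable
posterior = abc_rejection(observed_count, simulate, prior_sample,
                          distance, tolerance=2, n_draws=5000, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
```

The accepted draws approximate the posterior distribution of `p`. Shrinking `tolerance` improves the approximation at the cost of fewer accepted draws, and every draw still requires a full simulation, which is the cost that becomes prohibitive when simulating from the model is itself hard.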
Goldberg reports a surprising result, that properties of
admixture obtained from autosomal loci alone can be
informative about sex bias in the history of admixture. The result is
obtained in a
new article in the
November 2014 issue of Genetics. It builds on an
studied by former postdoc Paul Verdu, who is also a contributor
to the project.
10-1-2014 Nicolas Alcala joins us as a new
postdoc. Nicolas completed his PhD in ecology and evolution at the
University of Lausanne, performing several studies in the
population-genetic modeling of demography and population structure. We
are pleased to welcome Nicolas to the group!
student Doc Edge reports
a new paper on the mathematical properties of the population-genetic
statistic FST. Doc has refined the bounds
on FST as functions of the frequency of the most
frequent allele and homozygosity obtained in an
earlier study from the
lab, considering a finitely-many-alleles case instead of the less
constrained infinitely-many-alleles case. The work extends the lab's
line of work on mathematical properties
of population-genetic statistics.
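As a rough illustration of the quantities involved, the sketch below computes homozygosity, the frequency of the most frequent allele, and FST at a single locus. It assumes the common definition FST = (HT - HS) / HT with equally weighted populations and known allele frequencies; it does not implement the bounds derived in the paper, and the function names and example frequencies are hypothetical.

```python
def homozygosity(freqs):
    """Sum of squared allele frequencies."""
    return sum(p * p for p in freqs)


def fst(pop_freqs):
    """FST = (HT - HS) / HT for equally weighted populations at one locus.

    pop_freqs: list of per-population allele-frequency lists, each summing to 1.
    """
    k = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    # Allele frequencies in the pooled (total) population.
    mean_freqs = [sum(pop[i] for pop in pop_freqs) / k for i in range(n_alleles)]
    h_s = sum(1 - homozygosity(pop) for pop in pop_freqs) / k  # mean within-population heterozygosity
    h_t = 1 - homozygosity(mean_freqs)                         # total heterozygosity
    if h_t == 0:
        raise ValueError("FST is undefined when the locus is monomorphic")
    return (h_t - h_s) / h_t


# Hypothetical two-population, two-allele example.
pops = [[0.9, 0.1], [0.5, 0.5]]
pooled = [sum(pop[i] for pop in pops) / len(pops) for i in range(2)]
most_frequent = max(pooled)  # frequency of the most frequent allele
value = fst(pops)            # (0.42 - 0.34) / 0.42, about 0.19
```

Evaluating `fst` across many frequency configurations that share the same `most_frequent` value is one way to see empirically that the statistic cannot range over all of [0, 1] for a given allele frequency.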
8-28-2014 We welcome Ilana Arbisser as a PhD
student in the lab. Ilana completed her BA at the University of
Pennsylvania, where she majored in biology with a concentration in
biological mathematics. Ilana rotated through the lab during the
spring quarter, working on problems in coalescent theory. Welcome Ilana!
A new study in PLoS
Genetics led by former postdoc Paul Verdu reports on
admixture in Native American and First Nation populations of the
Pacific Northwest. The study describes recent European admixture in
coastal and inland populations from British Columbia and Alaska, also
uncovering evidence of recent East Asian admixture in the inland
groups. It is the first genomic investigation focused on the Pacific
Northwest region. Former
Pemberton was a contributor to the project.
8-8-2014 We congratulate PhD student Ethan Jewett on
the defense of his thesis, "Models, tools, and approaches for studying
genetic and cultural variation." Ethan's thesis examines a series of
problems on coalescent lineage distributions, with applications to the
study of population growth and migration, inference of species trees, and
genotype imputation. He also conducts analyses of variation in word usage,
both in the United States and in Cape Verde, posing questions about
cultural evolution. Ethan's work has been recognized with the Department
of Biology's Samuel Karlin Prize in Mathematical Biology. Congratulations
Former postdoc Trevor
a study of
population-genetic factors that affect worldwide variation in the
inbreeding coefficient, showing that the value of this popular
population-genetic statistic increases with increased consanguinity
but also with measures that reflect decreasing genetic diversity
and increasing genetic isolation. The study is part of a special issue
of Human Heredity on Consanguinity and Genomics.
6-25-2014 We congratulate co-mentored graduate student
Dr. Naama Kopelman on the completion of her PhD! Naama's thesis,
conducted at Tel Aviv University on "The complex genealogy of Jewish
populations," examines the genetic relationships of Jewish
populations using both microsatellite loci
 and genome-wide single
nucleotide polymorphisms . She also
performs a theoretical investigation of the effect of admixture on
tree-reconstruction algorithms, inspired by the placement of
Jewish populations in a neighbor-joining tree
. Naama has begun a
postdoc with Itay Mayrose, Department of Molecular Biology and
Ecology of Plants, Tel Aviv University.
6-22-2014 A new special issue of Human
Biology focuses on the genetics of Jewish populations. The lab
contributes to two research studies in the special issue:
A new paper by former postdoc
determines the mean of the deep coalescence cost, measuring the fit of a
gene tree to a species tree, under probability distributions for the
shapes of gene trees and species trees. This paper extends Cuong's
previous analysis focusing on the maximum deep coalescence cost rather
than the mean . The
work advances knowledge of an important concept in estimation of species
- In a study of
Y-chromosomal lineages in the Samaritans, Oefner et al. find
that most Samaritans have a distinctive Y chromosome similar to that
of Jewish Cohen lineages. Curiously, among the Samaritans, the only
exception distant from the Cohen model haplotype is that of the
Samaritan Cohen lineage.
- An international team including graduate student Naama Kopelman
studies genetic relationships with the Ashkenazi Jewish population in a
large genome-wide data
set, finding considerable
shared ancestry with other Jewish populations and tracing more distant
relationships to other populations of Europe and the Middle East.
to the special issue, by Noah
Rosenberg and Steven Weitzman.
SELECTED RECENT PUBLICATIONS
BFB Algee-Hewitt*, MD Edge*, J Kim, JZ Li, NA
Rosenberg (2016) Individual identifiability predicts population
identifiability in forensic microsatellite markers. Current
Biology 26: 935-942.
F Disanto, NA Rosenberg (2015) Coalescent histories for
lodgepole species trees. Journal of Computational Biology 22:
NA Rosenberg, JTL Kang (2015) Genetic diversity and
societally important disparities. Genetics 201: 1-12.
A Goldberg, P Verdu, NA Rosenberg (2014) Autosomal
admixture levels are informative about sex bias in admixed populations.
Genetics 198: 1209-1229.
M DeGiorgio, J Syring, AJ Eckert, AI Liston, R Cronn, DB
Neale, NA Rosenberg (2014) An empirical evaluation of two-stage
species tree inference strategies using a multilocus dataset from North
American pines. BMC Evolutionary Biology 14: 67.
M Jakobsson, MD Edge, NA Rosenberg (2013) The
relationship between FST and the frequency of the
most frequent allele.
Genetics 193: 515-528.
JH Degnan, NA Rosenberg, T Stadler (2012) A
of the set of species trees that produce anomalous ranked gene trees.
IEEE/ACM Transactions on Computational Biology and Bioinformatics
TJ Pemberton, D Absher, MW Feldman, RM Myers, NA
Rosenberg, JZ Li (2012) Genomic patterns of homozygosity in worldwide
human populations. American Journal of Human Genetics 91:
S Ramachandran, NA Rosenberg (2011) A test of the influence
of continental axes of orientation on patterns of human gene flow.
American Journal of Physical Anthropology 146: 515-529.
ZA Szpiech, NA Rosenberg (2011) On the size
distribution of private microsatellite alleles. Theoretical
Population Biology 80: 100-113.