What sort of knowledge is hidden in the genomes of African hunter-gatherers? I'm part of a team of researchers that answered this question, and our findings were recently published in the journal Cell. We examined the genomes of 15 African hunter-gatherers, discovering millions of never-before-seen genetic variants, evidence of sex between highly divergent ancient populations, and many parts of the human genome that have been shaped by local adaptation. In this blog, I share my perspective as the first author of the paper and reveal what it's like to study population genetics.
The story of this research actually began a decade ago, when Sarah Tishkoff (my postdoctoral advisor) and others collected DNA from hunter-gatherers living in Cameroon and Tanzania. This fieldwork involved a mix of bureaucratic red tape, hard work, some National Geographic-style exploring, and a little bit of adventure à la Indiana Jones. The Tishkoff Lab continues to send teams to Africa, and it's with a little bit of envy that I listen to lab mates recount tales of their travels. However, this envy is tempered with the knowledge that my skill set is better suited for the lab, and living in Philadelphia offers its own set of adventures.
My involvement in this project began in December 2010, when I joined the Tishkoff Lab at the University of Pennsylvania with a freshly minted Ph.D. in genetics. Upon arriving, I was handed six 2-TB hard drives containing the genomes of 15 African hunter-gatherers and told to have some fun. There's a split second when all you can think is, "Oh crap!" Then you collect yourself and let your curiosity take over. Nowhere is human genetic diversity as great as in Africa, and hunter-gatherers belong to some of the most interesting populations on Earth (including Pygmies from Cameroon and Hadza and Sandawe from Tanzania). To a population geneticist like me, this project was a dream come true. That doesn't mean it was all fun and games (computer crashes always seem to happen at the least convenient time, and more than one Friday night was spent responding to a flurry of emails bouncing between our lab and the labs of collaborators).
Before getting to the interesting part of the analysis, we needed to assess the quality of our data. Thankfully, we used high-coverage (>60x) whole-genome sequencing. This means that we sequenced each position in the human genome more than 60 times, on average. Because of this redundancy, we were able to have high confidence that the genetic variants we observed were real and not just the result of sequencing error. That said, there are parts of each genome that are difficult to sequence. This so-called "dark matter of the human genome" includes highly repetitive DNA, often located in the tips and centers of chromosomes.
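To make the idea of coverage concrete, here is a small Python sketch. The read count, read length, genome size, and 1% per-base error rate are round numbers assumed purely for illustration, not values from the study, and the model treats sequencing errors as independent across reads, which real data only approximates.

```python
from math import comb


def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of reads covering each position in the genome."""
    return n_reads * read_length / genome_size


def prob_at_least_k_errors(depth: int, k: int, error_rate: float) -> float:
    """Binomial probability that k or more of `depth` reads are wrong at a site.

    Assumes errors are independent across reads (a simplification).
    """
    return sum(
        comb(depth, i) * error_rate**i * (1 - error_rate) ** (depth - i)
        for i in range(k, depth + 1)
    )


# Hypothetical run: 1.8 billion reads of 100 bases over a 3-billion-base genome.
coverage = mean_coverage(1_800_000_000, 100, 3_000_000_000)

# Chance that errors alone affect about half the reads at a site (which is
# roughly what a real heterozygous variant looks like), under an assumed
# 1% per-base error rate, at low versus high depth.
low_depth_risk = prob_at_least_k_errors(depth=4, k=2, error_rate=0.01)
high_depth_risk = prob_at_least_k_errors(depth=60, k=30, error_rate=0.01)

print(f"mean coverage: {coverage:.0f}x")
print(f"false-variant risk at 4x:  {low_depth_risk:.2e}")
print(f"false-variant risk at 60x: {high_depth_risk:.2e}")
```

At low depth, a couple of erroneous reads can pass for a real variant. At 60x, the same coincidence becomes vanishingly unlikely under these assumptions, which is why the redundancy matters.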
One of the great things about population genetics is that it allows us to infer human history by looking at DNA sequences. We do this by constructing mathematical models of evolution and using comparisons. These comparisons can involve different populations, individuals, or parts of the genome. When I compared different hunter-gatherer populations, I observed that Pygmies and the Sandawe have an excess of rare variants, and that the Hadza have a relative lack of rare variants. This is exactly what you would expect to see if Pygmy and Sandawe populations have grown in number and the Hadza have shrunk in number during recent history. When I compared different parts of hunter-gatherer genomes, I found something that was pretty cool: There is less genetic variation near genes. This pattern arises from natural selection acting on beneficial and/or harmful mutations, which also removes variation at nearby sites that are inherited along with them.
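One standard way to quantify an excess or lack of rare variants is Tajima's D, which compares two estimates of genetic diversity: one driven by variants at intermediate frequency and one that counts every variable site equally. Negative values point to an excess of rare variants (as expected after population growth), and positive values point to a lack of them (as expected after a decline). The Python sketch below is a textbook illustration run on made-up toy data; it is not necessarily the statistic used in the paper, and the sample size and allele counts are assumptions.

```python
from math import sqrt


def tajimas_d(derived_counts: list[int], n: int) -> float:
    """Tajima's D from per-site derived-allele counts in n sampled chromosomes.

    Each count must be between 1 and n - 1 (the site is variable in the sample).
    """
    s = len(derived_counts)  # number of segregating (variable) sites
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Mean pairwise differences: weighted toward intermediate-frequency variants.
    pi = sum(2 * k * (n - k) / (n * (n - 1)) for k in derived_counts)
    # Watterson's estimator: counts every variable site equally.
    theta_w = s / a1
    return (pi - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy data (hypothetical): 20 variable sites in a sample of 10 chromosomes.
n_chromosomes = 10
mostly_rare = [1] * 20    # every variant seen on just one chromosome
mostly_common = [5] * 20  # every variant seen on half the chromosomes

d_rare = tajimas_d(mostly_rare, n_chromosomes)
d_common = tajimas_d(mostly_common, n_chromosomes)

print(f"D with an excess of rare variants: {d_rare:.2f}")
print(f"D with a lack of rare variants:    {d_common:.2f}")
```

The sign of D is the part to focus on here. Real analyses also have to account for other forces, such as natural selection, that can shift the statistic in the same directions.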
When the Neanderthal genome was sequenced, one surprising finding was that all non-African populations appear to contain Neanderthal DNA. Could similar instances of interbreeding have occurred in Africa? Analyzing the genomes of African hunter-gatherers, we found evidence of ancient interbreeding with an unknown hominin population (where "hominin" refers to modern humans, extinct human species, and our immediate ancestors). Using a complicated set of comparisons, my collaborator Benjamin Vernot looked for regions of the genome where DNA from one parent is really different from what was received from the other parent. I wouldn't go so far as to say that we found evidence of a new species. However, there definitely is evidence that our history involves multiple cases where populations diverged only to exchange DNA many thousands of years later.
Comparisons can also be used to see if the strength of natural selection varies for different types of populations. This involves looking to see whether damaging variants are more common in one population than others. Surveying the entire genome, I found that the overall strength of natural selection appears to be similar for African hunter-gatherers and African populations that have agricultural or pastoral subsistence patterns. Evolution still occurs regardless of whether populations have adopted a modern lifestyle. However, when you zoom in on specific genes, numerous signatures of local adaptation can be found.
One particularly interesting signal of local adaptation in the Hadza involves the cannabinoid receptor gene CNR2. A quick Google search of the terms "Hadza" and "marijuana" shows why this happens to be such an intriguing find. The CNR2 gene is active in immune cells, and it helps modulate responses to inflammation. It's also worth noting that just because the genomic region near CNR2 is very different in the Hadza doesn't automatically mean that these genetic changes have any functional effect. Regardless, it's pretty interesting!
The Human Genome Project had a budget of $3 billion, whereas current sequencing costs are lower than $10,000 per genome. As sequencing costs plummet, the deluge of sequence data will only increase. Although it will take decades before our knowledge of functional biology catches up to all this data, it's an exciting time to study genetics.