Thursday 23 January 2020

Unravelling arthropod genomic diversity over 500 million years of evolution

An international team of scientists report in the journal Genome Biology results from a pilot project, co-led by Robert Waterhouse, Group Leader at the SIB Swiss Institute of Bioinformatics and University of Lausanne, to kick-start the global sequencing initiative of thousands of arthropods. Comparative analyses across 76 species spanning 500 million years of evolution reveal dynamic genomic changes that point to key factors behind their success and open up many new areas of research.

The i5k pilot project sequenced, assembled, and annotated the genomes of 28 diverse arthropod species,
substantially increasing the current species sampling to explore arthropod genomic diversity over
500 million years of evolution [Credit: CC BY 4.0; Milkweed Bug by Chiaki Ueda; Long-Horned Beetle by Robert Mitchell]
Friends and foes, arthropods rule the world

Arthropods make up the most species-rich and diverse group of animals on Earth, with numerous adaptations over 500 million years of evolution that have allowed them to exploit all major ecosystems. They play vital roles in the healthy ecology of our planet as well as being both beneficial and detrimental to the success of humankind through pollination and biowaste recycling, or destroying crops and spreading disease.

"By sequencing and comparing their genomes we can begin to identify some of the key genetic factors behind their evolutionary success," explains Waterhouse, "but will the impact of human activities in modern times bring an end to their rule, or will their ability to adapt and innovate ensure their survival?"

The i5k pilot project: kick-starting arthropod genome sequencing

The i5k initiative to sequence and annotate the genomes of 5000 species of insects and other arthropods was launched in a letter to Science in 2011. From the outset, the initiative aimed to support the development of new genomic resources for understanding the molecular biology and evolution of arthropods.

Since then, the i5k has grown into a broad community of scientists using genomics to study insects and other arthropods in many different contexts from fundamental animal biology, to effects on ecology and the environment, and impacts on human health and agriculture.

To kick-start the i5k, a pilot project was launched at the Baylor College of Medicine led by Stephen Richards to sequence, assemble, and annotate the genomes of 28 diverse arthropod species carefully selected from 787 community nominations.

Large-scale multi-species genome comparisons

"The identification and annotation of thousands of genes from the i5k pilot project substantially increases our current genomic sampling of arthropods," says Waterhouse.

The evolutionary innovations of insects and other arthropods are as numerous as they are wondrous, from terrifying fangs 
and stingers to exquisitely coloured wings and ingenious feats of engineering. DNA sequencing allows us to chart the 
genomic blueprints underlying this incredible diversity that characterises the arthropods and makes them the most 
successful group of animals on Earth. An international team of scientists report in the journal Genome Biology results 
from a pilot project, co-led by SIB Group Leader Robert Waterhouse at the University of Lausanne, to kick-start the
global sequencing initiative of thousands of arthropods. Comparative analyses across 76 species spanning
500 million years of evolution reveal dynamic genomic changes that point to key factors behind 
their success and open up many new areas of research [Credit: Robert Waterhouse]

Combining these with previously sequenced genomes enabled the researchers to perform a large-scale comparative analysis across 76 diverse species including flies, butterflies, moths, beetles, bees, ants, wasps, true bugs, thrips, lice, cockroaches, termites, mayflies, dragonflies, damselflies, bristletails, crustaceans, centipedes, spiders, ticks, mites, and scorpions.

PhD students Gregg Thomas from Indiana University, USA, and Elias Dohmen from the University of Münster, Germany, used the annotated genomes to perform the computational evolutionary analyses of more than one million arthropod genes.

Dynamic gene family evolution - a key to success?

The team's analyses focused on tracing gene evolutionary histories to estimate changes in gene content and gene structure over 500 million years. This enabled identification of families of genes that have substantially increased or decreased in size, or newly emerged or disappeared, or rearranged their protein domains, between and within each of the major arthropod subgroups.
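As a rough illustration of what these categories mean, the sketch below labels a gene family by comparing its gene count in an ancestor with its count in a descendant. It is a toy example, not the method used in the study: the function name, the family names, and the counts are all hypothetical, and a real analysis would use statistical models over a species tree rather than a simple comparison of two numbers.

```python
def classify_family(ancestral: int, descendant: int) -> str:
    """Label how a gene family's size changed between two points on a tree.

    Both arguments are gene counts (hypothetical values in this sketch).
    """
    if ancestral < 0 or descendant < 0:
        raise ValueError("gene counts cannot be negative")
    if ancestral == 0 and descendant == 0:
        return "absent"
    if ancestral == 0:
        return "emerged"      # family newly appeared in the descendant
    if descendant == 0:
        return "lost"         # family disappeared in the descendant
    if descendant > ancestral:
        return "expanded"
    if descendant < ancestral:
        return "contracted"
    return "stable"


# Hypothetical (ancestral, descendant) gene counts for illustration only.
toy_counts = {
    "chitin_remodelling": (4, 11),
    "digestive_enzyme": (9, 3),
    "odorant_receptor": (0, 6),
    "housekeeping": (2, 2),
}

for family, (before, after) in toy_counts.items():
    print(f"{family}: {classify_family(before, after)}")
```

Applied across many families and many branches of the arthropod tree, labels like these are what make it possible to ask which functions changed most often.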

The gene families found to be most dynamically changing encode proteins involved in functions linked to digestion, chemical defence, and the building and remodelling of chitin - a major part of arthropod exoskeletons.

Adaptability of digestive processes and mechanisms to neutralise harmful chemicals undoubtedly served arthropods well as they conquered a wide variety of ecological niches. Perhaps even more importantly, the flexibility that comes with a segmented body plan and a dynamically remodellable exoskeleton allowed them to thrive by physically adapting to new ecosystems.

Innovation through invention and repurposing

Newly evolved gene families also reflect functions known to be important in different arthropod groups, such as visual learning and behaviour, pheromone and odorant detection, neuronal activity, and wing development. These may enhance food location abilities or fine-tune species self-recognition and communication.

In contrast, few changes were identified in the ancestor of insects that undergo complete metamorphosis: the dramatic change from the juvenile form to the fully developed adult (like a caterpillar transforming into a butterfly). This has traditionally been thought of as a major step in the evolution of insects from the original state of developing through gradual nymph stages until finally reaching the adult stage.

"These findings support the idea that this key transition is more likely to have occurred through the rewiring of existing gene networks or building new networks using existing genes, a scenario of new-tricks-for-old-genes," explains Waterhouse.

Genomic insights into arthropod biology and evolution

Several detailed genomic studies of individual i5k species have focused on their fascinating biological traits, such as the feeding ecology and developmental biology of the milkweed bug; insecticide resistance, blood feeding, and traumatic sex of the bed bug; horizontal gene transfer from bacteria and fungi and digestion of plant materials by the Asian long-horned beetle; and parasite-host interactions and potential vaccines for the sheep blowfly. The combined analyses reveal dynamically changing and newly emerged gene families that will stimulate new areas of research.

"We can take these hypotheses into the lab and use them to directly study how the genome is translated into visible morphology at a resolution that cannot be achieved with any other animal group," says co-lead author, Ariel Chipman, from the Hebrew University of Jerusalem, Israel.

The new resources substantially advance progress towards building a comprehensive genomic catalogue of life on our planet, and with more than a million described arthropod species and estimates of seven times as many, there clearly remains a great deal to discover!

Next steps in arthropod genomics and beyond

More effective and cost-efficient DNA sequencing technologies mean that ambitious new initiatives are already underway to sequence the genomes of additional arthropods. These include the Global Ant Genomics Alliance and the Global Invertebrate Genomics Alliance, as well as the Darwin Tree of Life Project, which is targeting all known species of animals in the British Isles, and the global network of communities coordinated by the Earth BioGenome Project (EBP), which aims to sequence all of Earth's eukaryotic biodiversity.

The EBP's goals also include benefitting human welfare, where the roles of arthropods are clear and the hidden benefits are likely to be substantial, as well as protecting biodiversity and understanding ecosystems, where alarming reports of declining numbers make arthropods a priority.

"The completion of the i5k pilot project therefore represents an important milestone in the progress towards intensifying efforts to develop a comprehensive genomic catalogue of life on our planet," concludes Richards.

Source: Swiss Institute of Bioinformatics [January 23, 2020]

Wednesday 22 January 2020

First ancient DNA from West/Central Africa illuminates deep human past

An international team led by Harvard Medical School scientists has produced the first genome-wide ancient human DNA sequences from west and central Africa.

General view of the excavation of Shum Laka's rockshelter (Grassfields region of Cameroon).
The site was home to a human population that lived in the region for at least five millennia and bore little genetic relatedness to the people
who live in the region today. Analysis of whole genome ancient DNA data from the people who lived at this site
provided insights into the existence of several never-before-appreciated, early-branching
African human lineages [Credit: Pierre de Maret, January 1994]
The data, recovered from four individuals buried at an iconic archaeological site in Cameroon between 3,000 and 8,000 years ago, enhance our understanding of the deep ancestral relationships among populations in sub-Saharan Africa, which remains the region of greatest human diversity today.

The findings, published in Nature, provide new clues in the search to identify the populations that first spoke and spread Bantu languages. The work also illuminates previously unknown "ghost" populations that contributed small portions of DNA to present-day African groups.

Map of Africa with Cameroon in dark blue and approximate location of Shum Laka marked with star. Image adapted from Alvaro1984 18/Wikimedia Commons

Research highlights:

- DNA came from the remains of two pairs of children who lived around 3,000 years ago and 8,000 years ago, respectively, during the transition from the Stone Age to the Iron Age.

- The children were buried at Shum Laka, a rock shelter in the Grassfields region of northwestern Cameroon where ancient people lived for tens of thousands of years. The site has yielded prolific artifacts along with 18 human skeletons and lies in the region where researchers suspect Bantu languages and cultures originated. The spread of Bantu languages--and the groups that spoke them--over the past 4,000 years is thought to explain why the majority of people from central, eastern and southern Africa are closely related to one another and to west/central Africans.

- Surprisingly, all four individuals are most closely related to present-day central African hunter-gatherers, who have very different ancestry from most Bantu speakers. This suggests that present-day Bantu speakers in western Cameroon and across Africa did not descend from the sequenced children's population.

Excavation of a double burial at the Shum Laka rock shelter (Grassfields region of Cameroon) containing the remains
of two boys who lived ~8,000 years ago and who were genetically from the same family. Ancient DNA reveals that
these two individuals and another pair of children buried five millennia later at Shum Laka were from a stable
population that was then almost completely displaced by the very different populations living
in Cameroon today [Credit: Isabelle Ribot, January 1994]

- One individual's genome includes the earliest-diverging Y chromosome type, found almost nowhere outside western Cameroon today. The findings show that this oldest lineage of modern human males has been present in that region for more than 8,000 years, and perhaps much longer.

- Genetic analyses indicate that there were at least four major lineages deep in human history, between 200,000 and 300,000 years ago. This radiation hadn't been identified previously from genetic data.

- Contrary to common models, the data suggest that central African hunter-gatherers diverged from other African populations around the same time as southern African hunter-gatherers did.

- Analyses reveal another set of four branching human lineages between 60,000 and 80,000 years ago, including the lineage known to have given rise to all present-day non-Africans.

- The Shum Laka individuals themselves harbor ancestry from multiple deep lineages, including a previously unknown, early-diverging ancestry source in West Africa.

Source: Harvard Medical School [January 22, 2020]

Study reveals pre-Hispanic history, genetic changes among indigenous Mexican populations

As more and more large-scale human genome sequencing projects get completed, scientists have been able to trace with increasing confidence both the geographical movements and underlying genetic variation of human populations. Most of these projects have favoured the study of European populations, and thus, have been lacking in representing the true ethnic diversity across the globe.

To better understand the broad demographic history of pre-Hispanic Mexico and to search for signatures of adaptive
evolution, an international team led by Mexican scientists have sequenced the complete protein-coding regions
of the genome, or exomes, of 78 individuals from different indigenous groups from Mexico. The genomic study
is the largest of its kind for indigenous populations from the Americas [Credit: Ruben Mendoza,
National Laboratory of Genomics for Biodiversity (LANGEBIO) - UGA, CINVESTAV]
To better understand the broad demographic history of pre-Hispanic Mexico and to search for signatures of adaptive evolution, an international team led by Mexican scientists have sequenced the complete protein-coding regions of the genome, or exomes, of 78 individuals from five different indigenous groups from Northern (Rarámuri or Tarahumara, and Huichol), Central (Nahua), South (Triqui, or TRQ) and Southeast (Maya, or MYA) Mexico. The genomic study, the largest of its kind for indigenous populations from the Americas, appeared recently in the advanced online edition of Molecular Biology and Evolution.

"We modeled the demographic history of indigenous populations from Mexico with northern and southern ethnic groups (Tarahumara and Huichol) splitting 7.2 kya and subsequently diverging locally 6.5 kya (Huichol groups) and 5.7 kya (Triqui and Maya), respectively," said lead author Maria Avila-Arcos, of the National Autonomous University of Mexico. The Nahua were excluded from the final analysis because of the noise they introduced into the overall results.

Overall, they identified 120,735 single nucleotide variants (SNVs) among the individuals studied, which were used to trace back the population history. Furthermore, they were able to reconcile their data with the demographic history and fossil records of ancestral Native Americans.

"The split times we found are also coherent with previous estimates of ancestral Native Americans diverging ~17.5-14.6 KYA into Southern Native Americans or 'Ancestral A' (comprising Central and Southern Native Americans) and Northern Native Americans or 'Ancestral B,' and with an initial settlement of Mexico occurring at least 12,000 years ago, as suggested by the earliest skeletal remains dated to approximately this age found in Central Mexico and the Yucatan peninsula," said Avila-Arcos. "Studies on genome-wide data from ancient remains from Central and South America reveal genetic continuity between ancient and modern populations in some parts of the Americas over the last 8,500 years."

"This suggests that, by that time, the ancestral population of MYA was not yet genetically differentiated from others, so our estimates of northern/southern split at 7.2 KYA and Mayan/Triqui divergence at 5.7 KYA fit with this scenario."

Next, they scanned the data to identify candidate genes most important for adaptation.

"Interestingly, some of these genes had previously been identified as targets of selection in other populations," said co-corresponding author Andres Moreno Estrada, principal investigator at the National Laboratory of Genomics for Biodiversity (LANGEBIO) - UGA, CINVESTAV.

These genes include SLC24A5, involved in skin pigmentation, and FAP, which was previously suggested to be under adaptive archaic introgression in Peruvians and Melanesians. Three genes were involved in the immune response. These include SYT5, implicated in innate immune response, and interleukins IL17A and IL13. The remaining candidate genes were involved in signal transduction (MPZL1), protein localization and transport (GRASP and ARFRP1), cell differentiation and spermatogenesis (GMCL), Golgi apparatus organization (UBXN2B), neuron differentiation (MANF), signaling and cardiac muscle contraction (ADRBK1), cell cycle (CDK5), microtubule organization and stabilization (NCKAP5L), and stress fiber formation (NCKIPSD).

A couple of genes stood out for the team. These included BCL2L13, which is highly expressed in skeletal muscle and could be related to physical endurance, including high-endurance long-distance running, a well-known trait of the Rarámuri of northern Mexico. The KBTBD8 gene has been associated with idiopathic short stature (also found in Koreans), and the team found it to be highly differentiated in the Triqui, a southern indigenous group from Oaxaca whose height is extremely low compared to other Native populations.

"We carried out the most comprehensive characterization of potentially adaptive functional variation in Indigenous peoples from the Americas to date," said Moreno Estrada. "We identified in these populations over four thousand new variants, most of them singletons, with neutral, regulatory, as well as protein-truncating and missense annotations. The average number of singletons per individual was higher in Nahua (NAH) and Maya (MYA), which is expected given these two Indigenous groups embody the descendants of the largest civilizations in Mesoamerica, and that today Nahua and Maya languages are the most spoken Indigenous languages in Mexico. Furthermore, the generated data also allowed us to propose a demographic model inferred from genomic data in Native Mexicans and to identify possible events of adaptive evolution in pre-Columbian Mexico."

Source: Oxford University Press [January 22, 2020]

Life's Frankenstein beginnings

When the Earth was born, it was a mess. Meteors and lightning storms likely bombarded the planet's surface where nothing except lifeless chemicals could survive. How life formed in this chemical mayhem is a mystery billions of years old. Now, a new study offers evidence that the first building blocks may have matched their environment, starting out messier than previously thought.

Szostak believes the earliest cells developed on land in ponds or pools, potentially in volcanically active regions.
Ultraviolet light, lightning strikes, and volcanic eruptions all could have helped spark the chemical
reactions necessary for life formation [Credit: Don Kawahigashi/Unsplash]
Life is built with three major components: RNA and DNA--the genetic code that, like construction managers, program how to run and reproduce cells--and proteins, the workers that carry out their instructions. Most likely, the first cells had all three pieces. Over time, they grew and replicated, competing in Darwin's game to create the diversity of life today: bacteria, fungi, wolves, whales and humans.

But first, RNA, DNA or proteins had to form without their partners. One common theory, known as the "RNA World" hypothesis, proposes that because RNA, unlike DNA, can self-replicate, that molecule may have come first. While recent studies discovered how the molecule's nucleotides--the A, C, G and U that form its backbone--could have formed from chemicals available on early Earth, some scientists believe the process may not have been such a straightforward path.

"Years ago, the naive idea that pools of pure concentrated ribonucleotides might be present on the primitive Earth was mocked by Leslie Orgel as 'the Molecular Biologist's Dream,'" said Jack Szostak, a Nobel Prize Laureate, professor of chemistry and chemical biology and genetics at Harvard University, and an investigator at the Howard Hughes Medical Institute. "But how relatively modern homogeneous RNA could emerge from a heterogeneous mixture of different starting materials was unknown."

In a paper published in the Journal of the American Chemical Society, Szostak and colleagues present a new model for how RNA could have emerged. Instead of a clean path, he and his team propose a Frankenstein-like beginning, with RNA growing out of a mixture of nucleotides with similar chemical structures: arabino-, deoxy-, and ribonucleotides (ANA, DNA, and RNA).

In the Earth's chemical melting pot, it's unlikely that a perfect version of RNA formed automatically. It's far more likely that many versions of nucleotides merged to form patchwork molecules with bits of both modern RNA and DNA, as well as largely defunct genetic molecules, such as ANA. These chimeras, like the monstrous hybrid lion, eagle and serpent creatures of Greek mythology, may have been the first steps toward today's RNA and DNA.

"Modern biology relies on relatively homogeneous building blocks to encode genetic information," said Seohyun Kim, a postdoctoral researcher in chemistry and first author on the paper. So, if Szostak and Kim are right and Frankenstein molecules came first, why did they evolve to homogeneous RNA?

Kim put them to the test: He pitted potential primordial hybrids against modern RNA, manually copying the chimeras to imitate the process of RNA replication. Pure RNA, he found, is just better--more efficient, more precise, and faster--than its heterogeneous counterparts. In another surprising discovery, Kim found that the chimeric oligonucleotides--like ANA and DNA--could have helped RNA evolve the ability to copy itself. "Intriguingly," he said, "some of these variant ribonucleotides have been shown to be compatible with or even beneficial for the copying of RNA templates."

If the more efficient early version of RNA reproduced faster than its hybrid counterparts then, over time, it would out-populate its competitors. That's what the Szostak team theorizes happened in the primordial soup: Hybrids grew into modern RNA and DNA, which then outpaced their ancestors and, eventually, took over.
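The out-competing argument can be made concrete with a small numerical sketch. It is an illustration only, not the team's model: the function name and the copying rates (1.2 for pure RNA, 1.0 for hybrids) and the starting share of 1% are assumed values chosen to show the effect. Each generation both populations multiply by their own rate, and the shares are then rescaled to sum to one.

```python
def rna_fraction(generations: int,
                 rna_rate: float = 1.2,      # assumed copies per generation for pure RNA
                 hybrid_rate: float = 1.0,   # assumed copies per generation for hybrids
                 start_fraction: float = 0.01) -> float:
    """Return the share of pure RNA in the pool after a number of generations."""
    rna = start_fraction
    hybrid = 1.0 - start_fraction
    for _ in range(generations):
        rna *= rna_rate
        hybrid *= hybrid_rate
        total = rna + hybrid
        # Rescale so the two shares always sum to 1.
        rna, hybrid = rna / total, hybrid / total
    return rna


for g in (0, 10, 25, 50, 100):
    print(f"generation {g:3d}: pure RNA share = {rna_fraction(g):.4f}")
```

Even when pure RNA starts as a tiny minority, a modest and consistent copying advantage is enough for it to dominate the pool, which is the intuition behind the takeover described above.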

"No primordial pool of pure building blocks was needed," Szostak said. "The intrinsic chemistry of RNA copying chemistry would result, over time, in the synthesis of increasingly homogeneous bits of RNA. The reason for this, as Seohyun has so clearly shown, is that when different kinds of nucleotides compete for the copying of a template strand, it is the RNA nucleotides that always win, and it is RNA that gets synthesized, not any of the related kinds of nucleic acids."

So far, the team has tested only a fraction of the possible variant nucleotides available on early Earth. So, like those first bits of messy RNA, their work has only just begun.

Source: Harvard University [January 22, 2020]

Domesticated wheat has complex parentage

Certain types of domesticated wheat have complicated origins, with genetic contributions from wild and cultivated wheat populations on opposite sides of the Fertile Crescent. Terence Brown and colleagues at the University of Manchester report these findings in a new paper published in the open-access journal PLOS ONE.

Credit: WikiCommons
A wild form of wheat called emmer wheat was one of the first plant species that humans domesticated. Emmer is not grown widely today, but gave rise to the durum wheat used for pasta and hybridized with another grass to make bread wheat, so its domestication was an important step in the transition from hunting and gathering to agriculture.

While the archaeological record suggests that cultivation began in the southern Levant region bordering the eastern edge of the Mediterranean Sea around 9,500 years ago, genetic studies point to an origin in the northern region of the Fertile Crescent, in what is now Turkey. To clarify emmer's origins, researchers screened 189 types of wild and domesticated wheats and used the more than 1 million genetic variations that they identified to piece together the genetic relationships between different kinds of wheat.

Based on the analysis, the researchers propose that an emmer crop, which humans cultivated but had not yet domesticated, spread from the southern Levant to southeast Turkey, where it mixed with a wild emmer population and ultimately yielded the first domesticated variety. The results of this hybridization can be detected in wild emmer plants in Turkey today.

The complex evolutionary relationships between wild emmer and cultivated wheat varieties uncovered by the analysis are similar to the interbreeding that occurred between wild and cultivated populations of other grain crops, such as barley and rice.

The authors add: "We used next-generation DNA sequencing technologies to detect hundreds of thousands of variants in the genomes of wild and cultivated emmer wheat, giving us an unprecedented insight into the complexity of its domestication process. The patterns we observed do not fit well with a simplistic model of a fast and localized domestication event but suggest instead a long process of cultivation of wild wheat by hunter-gatherer communities connected throughout the Fertile Crescent, prior to the emergence of a fully domesticated wheat form."

Source: Public Library of Science [January 22, 2020]