Monday, May 30, 2016

Metabarcoding Phytophthora

The increasing international trade in rooted plants and the continual introduction of new varieties and/or species expose nurseries, particularly those growing potted ornamentals, to new host−pathogen combinations and create new disease threats. Invasive pathogens have frequently been found on ornamental plants, and the trade in these plants is considered a primary driver of new disease outbreaks because it distributes pathogens on a large scale beyond their natural endemic ranges, with severe socio-economic impact.

The water mold genus Phytophthora contains 140 species that are pathogens of dicotyledons. They are of considerable economic importance, capable of causing enormous losses on crops worldwide as well as environmental damage in natural ecosystems.

Both the identification of known Phytophthora species and the discovery of new ones are difficult because of the limits of conventional culturing and baiting methods. As a consequence, several invasive and previously unknown species have been identified only when it was too late and they had already caused severe damage in non-native environments.

In a recent study, Phytophthora diversity was analyzed in potted ornamental plant nurseries using genus-specific primers combined with Sanger sequencing of cloned amplicons. The analyses highlighted a complex assemblage of Phytophthora species with new host−pathogen combinations, evidence of species previously unreported in the investigated area, and phylotypes representative of species that remain to be taxonomically defined.

A new study has now investigated the power and reliability of a metabarcoding approach based on genus-specific primers and the 454 platform, which is considered more powerful and less costly, both of which matter for broad testing. The colleagues used soil and root samples from potted ornamental nurseries that had already been analyzed as part of the former study.

The study indicates that the use of genus-specific primers combined with 454 is a very useful approach to investigate Phytophthora diversity in different ecosystems. Compared to the Sanger approach, it enabled deeper sequencing at a fraction of the time and cost. According to the authors several aspects confirmed the reliability of the method: 

  • many identical sequence types were identified independently in different nurseries, 
  • most sequence types identified with 454 pyrosequencing were identical to those from the cloning/Sanger sequencing approach and/or perfectly matched GenBank deposited sequences, and 
  • the divergence noted between sequence types of putative new Phytophthora species and all other detected sequences was sufficient to rule out sequencing errors. 

Friday, May 27, 2016

Authentication of herbal supplements - how it is done right.

Some of you might remember reading about the case in which the New York State attorney general’s office accused four major retailers of selling fraudulent and potentially dangerous herbal supplements and demanded, through an official cease and desist notification, that they remove the products from their shelves. The office had used DNA barcoding to make this claim and was heavily criticized by the industry for doing so. That was a little over a year ago; some of the criticism persisted for a long time and revolved around the question of how DNA barcoding was done in this particular case. However, one question remained: how to properly do authentication studies utilizing DNA-based species identification. Well, I think we have a pretty good answer to that in a paper that was published just yesterday. Colleagues here at the institute did a very thorough study that looked into DNA-based authentication of plants and used high-throughput sequencing (HTS) as a prospective way to verify listed ingredients in herbal medicines and to detect adulteration.

What I like about this study is its balanced approach, and what they found out can be summarized in one sentence I took from their conclusions: Quality control of herbal supplements should utilize a synergetic approach targeting both DNA and bioactive components, especially for standardized extracts with degraded DNA. But of course their take-home message is a bit more complex, which they nicely summarized in a list of suggestions that I find very helpful for everyone involved in herbal material authentication:

  1. The NGS workflow developed in this study enables simultaneous detection of plant and fungal DNA. This protocol can be utilized by manufacturers for screening of potential mycotoxin-producing and pathogenic fungi, for quality assurance of raw plant materials, contamination control during the production process, and for assessing the purity of the final product.
  2. Sanger sequencing should not be used for testing herbal supplements, due to its inability to resolve mixed signal from samples containing multiple species. NGS-based approaches are far more superior, enabling reliable and effective detection of DNA in complex mixtures.
  3. Aside from intended or non-intended substitution, cross-contamination with non-target plant DNA may occur at any stage during growing, harvesting, manufacturing, handling or laboratory analysis of plant material. NGS-based methods would detect such traces, in addition to target DNA. By contrast, when the contaminant template is preferentially amplified, Sanger sequencing may detect only contaminant DNA, leading to biased and misleading outcomes.
  4. Diversity of fungi in herbal supplements will be determined by a combination of pathogenic, endophytic and mycorrhizal fungi naturally associated with live plant material, saprophytic fungi proliferated during drying and storage, and strains involved in the fermentation during manufacturing of bioactive components. Although this entire spectrum would be easily detected by NGS methods, interpretation of test results should focus on potential mycotoxin-producing fungi and human pathogens.
  5. Quality control of herbal supplements should utilize a synergetic approach targeting both bioactive components and DNA, especially for standardized extracts with potentially degraded DNA.

Wednesday, May 25, 2016

Top 10 new species 2016

It's that time of the year again. The Top 10 list of new species is compiled annually by ESF's International Institute for Species Exploration (IISE). The institute's international committee of taxonomists selects the Top 10 from among the approximately 18,000 new species named during the previous year. The list is made public around May 23 to recognize the birthday of Carolus Linnaeus, the 18th century Swedish botanist who is considered the father of modern taxonomy.

Giant Tortoise (Chelonoidis donfaustoi)
No animals are more immediately associated with evolution or Charles Darwin than the giant tortoises of the Galapagos. Small differences had been noticed between eastern and western populations of giant tortoises on Santa Cruz Island that were assumed to be simply genetic variation within the known species, C. porteri. A careful analysis of both genetic and morphological data, however, shows that the smaller eastern population, with perhaps as few as 250 individuals, is a distinct and new species. This discovery has immediate, important conservation implications. C. porteri has a more limited geographic range than previously believed, restricted to western and southwestern areas of the island, and care must be taken to avoid bridging the natural isolation of the two species. The new species was named in honor of a park ranger known as "Don Fausto," who worked 43 years to conserve the giant tortoises of Galapagos.

Giant Sundew (Drosera magnifica)
This is believed to be the first new species of plant discovered through photographs posted on Facebook. It is also a record-setter, being the largest sundew ever seen in the New World, growing to 123 cm (48 inches). With nearly 200 species, the sundew genus is one of the most species-rich groups of carnivorous plants. Like other sundews, it secretes a thick mucus on the surface of its leaves that entraps unsuspecting insects that are then digested to compensate for the inadequate nutrition available in the soils in which it grows. Although it is new to science, this sundew is considered to be critically endangered. It is a microendemic, known to exist only at the summit of a single mountain in Brazil, 1,550 meters (5,000 feet) above sea level. Although locally abundant, its habitat is isolated, limited and fragile.

Hominin (Homo naledi)
Fossil remains of this previously unknown species of the genus Homo represent at least 15 different individuals, the largest collection of remains of a single species of hominin ever discovered on the African continent. Anatomical features of this new hominin found in South Africa are a mixture of those of Australopithecus with other Homo species, combined with several features not known in any hominin species. Features shared with other Homo species include complex functional locomotion, manipulation and mastication systems. Similar in size and weight to a modern human, and with humanlike hands and feet, the new species has a braincase more similar in size to earlier ancestors living two million to four million years ago, as well as shoulders, pelvis, and rib cage more closely resembling earlier hominins than modern humans. The exact age of the remains, once determined, will have implications for the early history of our genus.

Isopod (Iuiuniscus iuiuensis)
How it made the Top 10: This might be the 15 minutes of fame that isopods (crustaceans that live in water or on land; think "pillbug") have been waiting for. This blind, unpigmented, multilegged animal represents a new subfamily, genus, and species of amphibious isopod discovered in a South American cave. It has a behavior never seen before in its family: It constructs shelters of mud. The cave where the species was discovered has its only entrance at the bottom of a sinkhole and its inner chambers are flooded during the rainy season. Eight other caves in the region were explored, but the new species was found in only one. This isopod, just over 9 mm (a third of an inch) in length, builds spherical, irregularly shaped shelters in which it molts. While shedding its exoskeleton, it is especially vulnerable to predators. Some Palearctic isopods are known to build shelters, but this is a first for the New World. The new species is unique among its Brazilian cave-inhabiting relatives in having tapering plates at the base of its legs that give it a spiny appearance.

Anglerfish (Lasiognathus dinema)
How it made the Top 10: If this fish from the Gulf of Mexico, barely 50 mm (about two inches) long, were angling for ugliest among the Top 10 New Species, it might succeed. It was discovered during a Natural Resource Damage Assessment process conducted by the National Oceanic and Atmospheric Administration after the Deepwater Horizon oil spill in 2010. Different species of anglerfish can be distinguished visually only by details of the unusual structure called the esca that is projected over their heads like -- ironically -- a fishing pole. This organ is located at the tip of a highly modified, elongated dorsal ray. Rays are the spines that add support to the dorsal fin. The esca in some anglerfish is home to symbiotic bacteria that are bioluminescent, producing light that is a rare commodity in the depths of the ocean and is presumed to attract prey. Either way, these are among the most unusual features of any fish in form and function.

Seadragon (Phyllopteryx dewysea)
Seadragons are related to seahorses and are a unique combination of beautiful and bizarre. This new kind of marine fish, 240 mm (nearly 10 inches) in length, is a striking shade of ruby red with pink vertical bars and light markings on its snout. Only the third known species of seadragon, it is found in slightly deeper and more offshore waters than the related common or leafy seadragons. The discovery was made off the coast of Western Australia. Aside from its spectacular appearance, it is a reminder of what we have yet to discover about marine species diversity. If ruby red dragons nearly a foot long in shallow waters have escaped our attention, what else do we not yet know?

Tiny Beetle (Phytotelmatrichis osopaddington)
This species owes its charming Latin name to Paddington Bear, a lovable character who became a classic in children's literature after he was introduced in 1958. As the story goes, he showed up one day in Paddington Station, London, with a sign that said, "Please look after this bear." Like him, the new beetle hails from Peru. The researchers hope the new species' name will draw attention to the threatened Andean spectacled bear that inspired the Paddington books. Nearly 25 of these tiny beetles could line up, head to tail, before they reached the one-inch mark on a yardstick. They have a peculiar way of life. A little-studied world of animals, from insects to frogs, make their homes in pools of water that accumulate in hollows of plants, such as tree holes and the leaf bases of bromeliads (tropical and subtropical plants with short stems and stiff, open spiny leaves); these water bodies are called phytotelmata. This species was discovered in such water, gathered in leaf rolls of a non-native, cultivated plant, sparking questions about its food, breeding and native hosts. This is a featherwing beetle, the family that includes the smallest known group of beetles and which is named for the distinctive shape of their wings. Most of them are found on the forest floor where they feed on decomposing materials. So far, the plants documented as hosts to the new species belong to the Zingiberales, an order of flowering plants that include ginger and banana among 2,000 others.

New Primate (Pliobates cataloniae)
This ape, nicknamed "Laia" by her discoverers, was a small female that lived about 11.6 million years ago in what is now Spain, climbing trees and eating fruit. Fragments of her remains were discovered in a landfill in Catalonia, and she has challenged a lot of assumptions about the origins of, and relations among, living apes, gibbons and humans. It appears she was 4 to 5 kg (roughly 9 to 11 pounds) in weight, suggesting a diminutive height of about 43 cm (17 inches). She lived before the lineage containing humans and great apes had diverged from its sister branch, the gibbons, and she appears to be sister to the three combined. Her discovery suggests greater morphological diversity existed at that time, in the Miocene, than previously thought, and raises the possibility that early humans could have been more closely related to gibbons than the great apes. Her name is a popular Catalan diminutive of the name "Eulàlia," the original patron saint of the city of Barcelona.

Flowering Tree (Sirdavidia solannona)
This new tree species was "hidden" just meters from the main road in the Monts de Cristal National Park, in Gabon, which was thought to have already been well explored by science. Its small size, less than 6 meters (20 feet) high with a diameter of 10 cm (about four inches) might have caused it to be overlooked during inventories that focus on larger trees. It is so different from related members of the Annonaceae family of flowering plants, based on both morphology and molecular data, that it was described as a new genus, too. Its closest relative is also a genus with a single species, Mwasumbia, found on the other side of the African continent in Tanzania some 3,000 km (1,865 miles) away. Interestingly, the new species' flowers resemble those of certain Solanum, the genus of the nightshade family that includes potatoes and tomatoes, that are associated with the "buzz" pollination syndrome. In this syndrome, flowers have reflexed petals exposing the stamens and pistils that bees "sonicate" by creating vibrations of the air with their wings to extract and spread pollen. If buzz pollination is confirmed, it would be the first example in this family or any other early-diverged flowering plant, and an unexpected example of convergent reproductive evolution.

Sparklewing (Umma gumma)
This new damselfly is just one of a staggering number of newly discovered dragonflies and damselflies from Africa. Sixty new species were reported in a single publication this year, the most for any single paper in more than a century and a surprising leap forward in knowledge for one of the better-known insect orders. Most of the new species are colorful and so distinct they are identifiable from photographs alone, emphasizing that not all unknown species are small, indistinct or cryptic in appearance or habits. Given that the genus name is Umma, it was quick work to give this lovely and delicate damselfly a name that might be familiar to rock-and-roll fans: the band Pink Floyd named its 1969 double album Ummagumma.

Tuesday, May 24, 2016

Yes, I trust BOLD!

Another week and another paper on which I feel the need to comment. This time it is not about quality, as I am convinced the authors know what they are talking about. In fact they are taxonomic experts on a family of true bugs (Cydnidae), but that doesn't save them from my blog post, as I think they unfairly criticized BOLD for doing its job.

Numerous mistakes in taxonomy, the relevance of the taxa names, and species misidentifications in BOLD version 3 were found and, more importantly, similar errors were detected in BOLD version 4 as well. We suggest that if the BOLD system is presumed to be taxonomically trustworthy, it can’t exist without an adequate a priori identification of barcoded specimens. Otherwise, the erroneous data deposited onto the BOLD platform will have a negative impact on studies in which molecular data imported from BOLD are utilized.

Just to clarify, BOLD versions 3 and 4 are just different user interfaces with different sets of tools. The underlying database is the same. But what are we talking about? What's the extent of the problem?

Our search revealed 220 specimens, including 106 specimens with barcodes, so the percent of misidentified specimens is 3.78% for specimens with sequences, and 1.81% for all specimens. If nomenclatural issues are added to the cases of misidentifications, the percentages are 7.55% (specimens with barcodes) and 3.64% (all specimens).
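To make the scale of these percentages concrete, here is a minimal back-of-the-envelope sketch. It assumes the quoted percentages refer to whole specimens and uses a round figure of 5 million barcodes in BOLD, which is an approximation rather than an exact count; the function name is made up for illustration.

```python
# Figures quoted from the paper; the BOLD total is a rough, assumed figure.
SPECIMENS_TOTAL = 220
SPECIMENS_WITH_BARCODES = 106
BOLD_BARCODES_APPROX = 5_000_000

def implied_count(percent, total):
    """Back-calculate the whole number of specimens a percentage implies."""
    return round(percent / 100 * total)

# Misidentifications only, then misidentifications plus nomenclatural issues.
misidentified = implied_count(3.78, SPECIMENS_WITH_BARCODES)
with_nomenclature = implied_count(7.55, SPECIMENS_WITH_BARCODES)

# Share of the whole (approximate) barcode library that those records represent.
share_of_bold = misidentified / BOLD_BARCODES_APPROX * 100

print(misidentified, with_nomenclature)
print(f"{share_of_bold:.5f}% of roughly 5 million barcodes")
```

In other words, the percentages correspond to a handful of records. Note that only the records of one family were checked, so the last figure says nothing about the error rate elsewhere in the database.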

It is not my intention to belittle the issue the authors point out. There are some errors they came across and it is good to point those out, although I think there are different ways to do that (see below), and certainly without such bold statements as shown at the top of this post. To put it into perspective: 3.78% of 106 barcodes, out of 5 million, were misidentified. Too much for the authors: any information connected with DNA barcodes and deposited into databases such as GenBank and BOLD should be beyond any suspicion of inaccuracy or unreliability.

This means error is not an option in the work of a taxonomist. Such comments are grist to the mill of those who anxiously keep their BOLD data private because they are afraid that somebody will find a mistake they made and that they will be pilloried for being wrong, instead of working with the community to make the data better.

BOLD is a workbench and not the holder of the ultimate taxonomic truth. Actually, BOLD's accuracy depends on its user community, and its quality grows with the number of experts who help build and maintain it. All the errors listed, if meant as improvements of the database, are welcome, but such errors are by no means different from a misidentified specimen in a museum drawer that might be used for identifying other specimens by direct morphological comparison. Would that justify a paper title such as "In (fill in museum of choice) we trust?"

BOLD needs the input of the user community to minimize error and stay on top of nomenclatural and systematic changes and revisions. To me the most useful thing to do is to stop complaining and start helping. If one finds errors that might hamper future use of the barcode reference library, the best course is to engage with the community and especially the data owner. BOLD was developed for this kind of interaction. It facilitates dialogue and the resulting data improvement. It even has vetted vocabulary and taxonomy controlled by humans. I don't know of any other tool better suited for that, so why a public display of errors in the form of a paper?

But first and foremost - stop bashing BOLD for errors and mistakes made by its users and for enabling the community to actually identify and rectify such errors. Without the detail of information available through this interface, the authors of this paper would have never been able to identify all the problems they listed let alone have a chance to contribute to any solution.

Thursday, May 19, 2016

Faster Protein Sequence Alignment

A common starting point for the computational analysis of proteins is the construction of a multiple sequence alignment (MSA). Insofar as they result from protein functional similarities and differences, the patterns of residue conservation and divergence within such an alignment provide clues to biological function. Of course the biological relevance of any observed patterns depends upon an alignment’s accuracy, and alignments of larger sequence sets have greater statistical power. For biologically appropriate scoring systems applied to more than a very small number of sequences, however, no optimization procedures are known that are both tractable and rigorous; thus all practical MSA programs rely upon heuristic methods.

Indeed, most commonly used alignment tools compare sequences and rank alternatives at each branching step, based on the available information, to decide which branch to follow. This is faster than an exhaustive search but still takes a prohibitively long time to compute for sets of a hundred thousand or more related sequences. In addition, a heuristic search is not guaranteed to find the best solution, only an approximation.

Two researchers have now developed a new algorithm that is both faster and more accurate. Instead of comparing sequences to each other, it compares each sequence to an evolving statistical model. This approach is not only faster, but is also better at finding biologically relevant signals within such alignments. Their new program is called GISMO, an acronym for "Gibbs Sampler for Multi-Alignment Optimization." Gibbs sampling, a statistical technique for solving highly complex problems, is a central feature of the approach.
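The idea of comparing each sequence to an evolving statistical model, rather than to every other sequence, can be illustrated with a toy Gibbs sampler. This is only a sketch of the classic ungapped "site sampler" idea, not GISMO itself, which handles gaps and is far more sophisticated. All function names and the example sequences are made up for illustration, and the code assumes every sequence is at least `width` residues long and contains only the 20 standard amino acids.

```python
import random
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def build_profile(seqs, starts, width, skip, pseudo=0.5):
    """Column-wise residue frequencies from every sequence except `skip`."""
    profile = []
    for col in range(width):
        counts = Counter(
            s[st + col] for i, (s, st) in enumerate(zip(seqs, starts)) if i != skip
        )
        total = sum(counts.values()) + pseudo * len(ALPHABET)
        profile.append({a: (counts[a] + pseudo) / total for a in ALPHABET})
    return profile

def window_weight(profile, seq, start):
    """Probability of the window beginning at `start` under the profile."""
    weight = 1.0
    for col, column in enumerate(profile):
        weight *= column[seq[start + col]]
    return weight

def gibbs_align(seqs, width, sweeps=200, seed=0):
    """Toy ungapped Gibbs sampler: returns one window start per sequence."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(sweeps):
        for i, seq in enumerate(seqs):
            # Compare sequence i to a model built from all the other sequences.
            profile = build_profile(seqs, starts, width, skip=i)
            candidates = list(range(len(seq) - width + 1))
            weights = [window_weight(profile, seq, c) for c in candidates]
            # Sample its new position in proportion to how well each window fits.
            starts[i] = rng.choices(candidates, weights=weights)[0]
    return starts

# Hypothetical example: three short sequences sharing the block "TAYIAK".
seqs = ["MKTAYIAKQR", "GGTAYIAKLL", "PTAYIAKWWW"]
starts = gibbs_align(seqs, width=6)
for s, st in zip(seqs, starts):
    print(s[st:st + 6])
```

Each update costs time proportional to the length of one sequence, not to the number of sequence pairs, which hints at why model-based sampling scales better to very large sets. Because the sampler is stochastic, different seeds can settle on different windows.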

At this point GISMO works only for protein alignments and the authors are the first to point out that there is room for improvement. Because researchers have been finding ways to speed up and improve conventional methods for decades and because GISMO takes such a new and different approach, I am confident that we can make GISMO even faster and more accurate going forward. The reason: for large sequence sets, this approach offers clear advantages in alignment accuracy over the most popular programs currently available.

Wednesday, May 18, 2016

Bye, bye K2P

Accurate estimates of biodiversity are required for research in a broad array of biological subdisciplines including ecology, evolution, systematics, conservation and biodiversity science. The use of statistical models and genetic data, particularly DNA barcoding, has been suggested as an important tool for remedying the large gaps in our current understanding of biodiversity. However, the reliability of biodiversity estimates obtained using these approaches depends on how well the statistical models that are used describe the evolutionary process underlying the genetic data.

In a new study, researchers from Honolulu assess substitution model adequacy for describing genetic variation and for estimating species richness from barcoding data. One of their main motivations is the almost ubiquitous use of the Kimura 2-parameter (K2P) model. Earlier studies have already shown that this is perhaps a poor choice because it doesn't always fit well at the species level and provides unreliable estimates of the number of OTUs.
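For readers who have never looked under the hood: the K2P model distinguishes only two kinds of substitutions, transitions (A↔G, C↔T) and transversions (everything else), and corrects the observed difference between two sequences accordingly. With P the proportion of sites showing a transition and Q the proportion showing a transversion, the distance is d = −½ ln((1 − 2P − Q) · √(1 − 2Q)). Here is a minimal sketch of that calculation next to the uncorrected p-distance. The function names and example sequences are made up, and this is not the pipeline used in the study.

```python
import math

TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def count_differences(seq1, seq2):
    """Return (sites compared, transitions, transversions) for aligned sequences."""
    n = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        n += 1
        if a != b:
            if (a, b) in TRANSITIONS:
                ts += 1
            else:
                tv += 1
    return n, ts, tv

def p_distance(seq1, seq2):
    """Uncorrected proportion of differing sites (assumes at least one valid site)."""
    n, ts, tv = count_differences(seq1, seq2)
    return (ts + tv) / n

def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance (assumes at least one valid site)."""
    n, ts, tv = count_differences(seq1, seq2)
    p, q = ts / n, tv / n
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# Hypothetical aligned fragments differing by two transitions in ten sites.
a = "ACGTACGTAC"
b = "ACGTACATAT"
print(round(p_distance(a, b), 4), round(k2p_distance(a, b), 4))
```

The corrected distance is always at least as large as the p-distance because it accounts for multiple substitutions at the same site. Richer models such as HKY or GTR additionally allow unequal base frequencies and more rate classes, which is the kind of extra complexity the study compares K2P against.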

What the colleagues did was first to assemble many (more than 2000) empirical data sets. For each of those they selected a ‘best fit’ model of molecular evolution and performed DNA barcoding analyses under both the chosen model and the K2P model to estimate the number of OTUs. For the latter they used either ABGD or hCluster. Finally, they conducted a Bayesian phylogenetic inference and posterior predictive simulation to assess the fit of each substitution model to each data set. 

Not surprisingly, the more complex models seem to fit better. The most frequently selected model for the barcode data sets was GTR + Γ, followed by HKY + Γ. In the model adequacy assessments, the K2P model fell within the 95% highest posterior density for only 3% of all data sets.

But the more interesting find is that depending on the method and threshold employed, the total number of OTUs varied considerably (4%–31%) meaning that model choice has a substantial impact on the number of operational taxonomic units identified.

Take home message:

We demonstrate that the widely followed practice of a priori assuming the Kimura 2-parameter model for DNA barcoding is statistically unjustified and should be avoided. Using both data-based and inference-based test statistics, we detect variation in model performance across taxonomic groups, clustering algorithms, genetic divergence thresholds and substitution models. Taken together, these results illustrate the importance of considering both model selection and model adequacy in studies quantifying biodiversity.

Tuesday, May 17, 2016

Biodiversity protects fish

Biodiversity is more than a pretty face. Preserving biodiversity is not just an aesthetic or spiritual issue - it's critical to the healthy functioning of ecosystems and the important services they provide to humans, like seafood.

The accelerating loss and shifts of species across the globe have troubled scientists and the public for a while already. But, believe it or not, the question of whether biodiversity offers practical value - for humans and ecosystems - remained controversial. A new study, just published in PNAS, offers the most comprehensive proof yet that preserving marine biodiversity can benefit people as much as it benefits the oceans.

Reef Life Survey is a comprehensive program that has conducted more than 4,000 underwater surveys of more than 3,000 fish species in 44 countries around the world. Many of the surveyors were volunteer citizen scientists, about a third of whom had no scientific background. Volunteer divers received training from some of the program's lead scientists at the University of Tasmania to enable them to collect data using standardized methods.

With this comprehensive global dataset on marine biodiversity involving standardized counts, the researchers tracked how 11 different environmental factors influenced total fish biomass on coral and rocky reefs around the world. Surprisingly, one of the strongest influences was biodiversity: Species richness and functional diversity enhanced fish biomass. The boost in fish resources provided by biodiversity was second only to that of warm temperatures.

Temperature showed a more complex relationship with fish biomass: Warmer ocean temperatures tended to boost fish biomass on average, while wider temperature fluctuations hindered it. But biodiversity made fish communities more resilient against changing climate. In communities with only a few species, fish biomass tended to increase with rising temperatures until seas warmed above 20°C, at which point biomass decreased. Communities with many species remained stable at these higher temperatures.

The researchers found a similar buffering effect of diversity against temperature swings. While both high- and low-diversity communities were less productive under fluctuating temperatures, high-diversity communities suffered only half as much as low-diversity ones. The researchers suspect communities with more species are better equipped to handle temperature changes because they have more of their bases covered. When temperatures fluctuate, a community with numerous species has better odds that at least a few species can thrive in the new normal.

This work is a critical step forward in linking insights from experiments in buckets and garden plots to the larger world. It shows that experimental ecologists have in fact been on the right track for 20 years, and that biodiversity is paramount to how natural systems work. The results demonstrate that preserving local biodiversity is not only an ethical directive with aesthetic and genetic insurance value, but also an imperative for human life.