Thursday, September 6, 2012


A DNA Barcode is a short standardized sequence enabling species discrimination

This short definition contains all the important elements of DNA Barcoding, and this post is about one very important word in it: standardized.

Standards are paramount in our work, and it is concerning that an increasing number of studies use different gene regions while their authors still call it DNA Barcoding. I find this even more puzzling when it comes to vertebrates, where the reasons for not using COI are either tradition or, even worse, convenience.

It is always exciting when new species are discovered using DNA methods, and I fully support the idea that any phylogenetic approach needs additional genetic markers. The same is true if a sufficient placement can't be made using DNA Barcoding alone. However, I will never be able to compare my barcode data on fishes with the data of colleagues who decided to use cyt b instead. This is already frustrating when you try to build phylogenies.

Phylogenetics is a good example of what can go wrong when you don't agree on standards. In essence, everybody can use whatever marker they think works best for them. Fortunately, that has started to change over the last few years. Well, we are living in a free world, and I am the last person who is going to tell somebody else how to do their job, but the result is that most publicly available data isn't necessarily comparable. After years of trying to assemble datasets for particular taxonomic groups in order to build robust phylogenies, I consider it a waste of money, and ignorant, not to use data other researchers have generated before. Often, however, I simply can't do that, because the amount of missing data an analysis can tolerate is limited. There isn't sufficient overlap between datasets, as communities of researchers haven't agreed on common gene regions for this kind of analysis. On the contrary, there were long-lasting disputes about which region(s) to use. Some battles still continue.
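To make the overlap problem concrete, here is a minimal sketch in Python. The taxon names and the `sampled` table are made up for illustration, and `marker_overlap` is a hypothetical helper, not part of any existing tool. It simply counts how many cells of a combined taxon-by-marker matrix would be empty and which markers every taxon shares.

```python
def marker_overlap(sampled):
    """Summarize a combined dataset.

    sampled maps each taxon to the set of markers sequenced for it.
    Returns (markers shared by all taxa, fraction of empty cells in the
    taxon-by-marker matrix).
    """
    if not sampled:
        raise ValueError("need at least one taxon")
    marker_sets = list(sampled.values())
    all_markers = set().union(*marker_sets)
    if not all_markers:
        raise ValueError("no markers recorded for any taxon")
    shared = set.intersection(*marker_sets)
    total_cells = len(marker_sets) * len(all_markers)
    filled_cells = sum(len(markers) for markers in marker_sets)
    return shared, 1 - filled_cells / total_cells


# Hypothetical example: two labs working on fishes, one using COI,
# the other using cyt b, with a single taxon sequenced for both.
sampled = {
    "Taxon A": {"COI"},
    "Taxon B": {"COI"},
    "Taxon C": {"cyt b"},
    "Taxon D": {"COI", "cyt b"},
}

shared, missing = marker_overlap(sampled)
print(f"shared markers: {sorted(shared)}, missing data: {missing:.1%}")
```

In this toy case no marker is shared by all four taxa, and more than a third of the combined matrix is empty, even though every taxon has been sequenced for something.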

For me, DNA Barcoding represented a big leap forward by proposing standard regions (and features). In some cases it took longer to find a consensus among all scientists (e.g. the plant barcode), but they did. As a result, we currently have standard regions for three major groups:
  • Cytochrome c oxidase subunit I (COI or COXI) for animals
  • The ribulose-bisphosphate carboxylase large subunit gene (rbcL) and the chloroplast maturase K gene (matK) for plants
  • The complete ITS1 spacer, the 5.8S gene, and the ITS2 spacer as a single contiguous sequence (ITS) for fungi

There might be compelling reasons (especially technical ones) not to use one of those in particular groups, and people are of course always entitled to choose whatever works best for them, but I would like to make two suggestions:

(1) Use the term DNA Barcoding only if you use one of the markers listed above, to indicate that you follow agreed-upon community standards.
(2) Give the markers at least a try, to ensure that you've done what you can to contribute to a global effort and the community of your colleagues.
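
Suggestion (1) is simple enough to write down as a check. The following Python sketch is only an illustration: the table restates the three standards listed above, while the group labels, the `ALIASES` table and the `uses_standard_barcode` function are names I made up for this example. It follows the wording of the suggestion literally, so one standard marker is enough to qualify.

```python
# Standard barcode regions as listed in this post (group labels are my own).
STANDARD_BARCODES = {
    "animals": {"COI"},
    "plants": {"rbcL", "matK"},
    "fungi": {"ITS"},
}

# Alternative spelling mentioned in the post, mapped to one canonical key.
ALIASES = {"coxi": "coi"}


def _normalize(marker):
    """Lower-case a marker name and resolve known aliases."""
    key = marker.strip().lower()
    return ALIASES.get(key, key)


def uses_standard_barcode(group, markers):
    """Return True if at least one marker is a standard barcode for the group."""
    try:
        standard = STANDARD_BARCODES[group.strip().lower()]
    except KeyError:
        raise ValueError(f"no standard barcode listed for group: {group!r}")
    standard_keys = {_normalize(m) for m in standard}
    return any(_normalize(m) in standard_keys for m in markers)


print(uses_standard_barcode("animals", ["cyt b"]))         # False
print(uses_standard_barcode("animals", ["COXI", "cyt b"]))  # True
```

A study on fishes that sequenced only cyt b would fail this check, while one that added COI alongside cyt b would pass and remain comparable with everyone else's barcode data.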

