Monday, April 27, 2009

A phylogeny of 73060 eukaryotes

Finally, the behemoth has seen the light :). Our paper with a parsimony analysis of 73060 eukaryotic species (and 7800 molecular and morphological characters) was just published (as “online early”) in Cladistics [doi:10.1111/j.1096-0031.2009.00255.x].

Pablo did wonderful work optimizing every aspect of the tree searches in TNT. And everyone on the team worked really hard to manage that amount of data!

At first I was surprised by the high accuracy of the trees we found, because the data set is full of missing entries. I am also happy that the inclusion of morphological data, even at this huge scale, produces better results than molecules alone!

Just a few months ago, this was posted on Dechronization:
He [Casey Dunn] makes a convincing case for the idea that a revolution in analytical techniques will be needed as we enter an era during which computational capabilities will be more limiting than data availability.
I think our study shows exactly the opposite: our current search capabilities are good enough, but we do not have sufficient data (the largest gene set is SSU, with 20000 species, and only a handful of genes have more than 10000 species).

The second lesson? We do not need supertrees!

1 comment:

Mike Keesey said...