Wednesday, October 08, 2008

Metagenome Sequence Simulators

Richter DC, Ott F, Auch AF, Schmid R, & Huson DH (2008). MetaSim: a sequencing simulator for genomics and metagenomics. PLoS ONE, 3 (10) PMID: 18841204

An article from Huson's group at Tübingen University has just come out in the Open Access (and scientific publishing innovator) journal PLoS ONE, describing MetaSim, software that produces artificial (synthetic, or in silico) metagenomes from a selection of completely sequenced genomes.

This is just "heaven-sent" for me, since I've been working on a set of synthetic metagenomes for the past two months and will be happy to use this software first hand, like... today. It seems that the software not only lets you choose the source genomes from a phylogenetic tree (figures reproduced here from the original article at PLoS ONE thanks to the Creative Commons License), but also lets you choose among three types of sequencing technology output (Sanger, 454 and Illumina) to generate a theoretical metagenome.
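To make the idea concrete, here is a minimal Python sketch of what "generating a synthetic metagenome" boils down to: pick source genomes, give each a relative abundance, pick a technology, and sample reads. This is not MetaSim's algorithm or interface. The function names, the error-free reads, and the round read lengths per technology are all my own assumptions for illustration.

```python
import random

# Illustrative read lengths only (assumed round numbers, not MetaSim defaults).
READ_LENGTHS = {"sanger": 800, "454": 250, "illumina": 36}


def simulate_reads(genomes, abundances, technology, n_reads, seed=0):
    """Sample error-free reads from a dict of genome sequences.

    Each genome is drawn with probability proportional to
    abundance * genome length; the read start is uniform along the genome.
    """
    rng = random.Random(seed)
    read_len = READ_LENGTHS[technology]
    # Skip genomes that are shorter than a single read.
    names = [n for n in genomes if len(genomes[n]) >= read_len]
    if not names:
        raise ValueError("no genome is long enough for this read length")
    weights = [abundances[n] * len(genomes[n]) for n in names]
    reads = []
    for _ in range(n_reads):
        name = rng.choices(names, weights=weights, k=1)[0]
        seq = genomes[name]
        start = rng.randint(0, len(seq) - read_len)
        # Keep the source genome and position: this is the "known truth".
        reads.append((name, start, seq[start:start + read_len]))
    return reads


def random_genome(length, rng):
    """Build a toy genome of random bases (a stand-in for a real sequence)."""
    return "".join(rng.choice("ACGT") for _ in range(length))


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical taxa; real use would load complete genomes from FASTA files.
    genomes = {
        "taxon_A": random_genome(5000, rng),
        "taxon_B": random_genome(2000, rng),
    }
    abundances = {"taxon_A": 1.0, "taxon_B": 3.0}
    for name, start, read in simulate_reads(genomes, abundances, "illumina", 5):
        print(f">{name}_{start}\n{read}")
```

The important part is that every read carries its source genome and position, which is exactly what a real metagenome cannot give you. A real simulator would also add a technology-specific error model, which this sketch leaves out.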

This continues a very important change in the genomic sciences: a move from experiments too expensive or too long to be replicated, and hence beyond hard statistical comparison, to null-model-based in silico genomic analysis.

The first effort to analyze the true scope of metagenomic analysis was presented by Kostas Mavromatis and others from the Genome Biology group at JGI (unfortunately published in a non-OA journal), where they produced three simulated metagenomes of contrasting complexity to assess assembly, gene prediction and annotation (SPOILER: the best combination assessed was the Arachne assembler with the Fgenesb predictor and PhyloPythia for binning, and BLAST "performed poorly" as usual). This work also produced a database for the Fidelity of Analysis of Metagenomic Samples (FAMeS), a great effort to standardize metagenomic analysis software. A great alternative is ProxyGene annotation, as reported by the Markowitz group.

Mavromatis K, Ivanova N, Barry K, Shapiro H, Goltsman E, McHardy AC, Rigoutsos I, Salamov A, Korzeniewski F, Land M, Lapidus A, Grigoriev I, Richardson P, Hugenholtz P, & Kyrpides NC (2007). Use of simulated data sets to evaluate the fidelity of metagenomic processing methods. Nature Methods, 4 (6), 495-500 PMID: 17468765
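The reason simulated data sets are so useful for this kind of benchmarking is that the true source of every read is known, so a binning result can be scored directly. Here is a minimal Python sketch of that scoring. It is my own illustration, not the metric used in the paper or in FAMeS; the function name and the per-taxon sensitivity/precision definitions are assumptions.

```python
from collections import Counter


def binning_scores(true_labels, predicted_labels):
    """Return {taxon: (sensitivity, precision)} for a binning result.

    true_labels[i] is the known source of read i; predicted_labels[i] is the
    bin assigned by the tool, or None when the read was left unassigned.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    true_counts = Counter(true_labels)
    pred_counts = Counter(p for p in predicted_labels if p is not None)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    scores = {}
    for taxon in true_counts:
        # Sensitivity: fraction of the taxon's reads that were binned correctly.
        sensitivity = correct[taxon] / true_counts[taxon]
        # Precision: fraction of reads placed in this bin that belong there.
        precision = correct[taxon] / pred_counts[taxon] if pred_counts[taxon] else 0.0
        scores[taxon] = (sensitivity, precision)
    return scores


if __name__ == "__main__":
    truth = ["A", "A", "A", "B"]
    predicted = ["A", "A", "B", None]
    for taxon, (sens, prec) in sorted(binning_scores(truth, predicted).items()):
        print(f"{taxon}: sensitivity={sens:.2f} precision={prec:.2f}")
```

With a real metagenome there is no `truth` list to compare against, which is why a simulated one makes this kind of hard comparison possible.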

I'll play a little with the software and post some of my impressions here... and maybe on the original PLoS ONE webpage, since it is totally open to post-publication review!

You can download MetaSim at Huson's lab page!