Representatives databaseΒΆ
In order to do read-level taxonomic binning, representative databases need to be compiled. Four default databases were designed cover most of the genetic diversities in metagenomic samples.
ANI 98% database for bacteria and archaea
sparse query --dbname refseq --default representative | python SPARSE.py mapDB --dbname refseq --seqlist stdin --mapDB representative
ANI 99% database for bacteria and archaea (always use together with representative database)
sparse query --dbname refseq --default subpopulation | python SPARSE.py mapDB --dbname refseq --seqlist stdin --mapDB subpopulation
ANI 99% virus database
sparse query --dbname refseq --default Virus | python SPARSE.py mapDB --dbname refseq --seqlist stdin --mapDB Virus
ANI 99% eukaryota database (genome size <= 200MB)
sparse query --dbname refseq --default Eukaryota | python SPARSE.py mapDB --dbname refseq --seqlist stdin --mapDB Eukaryota
Custom databases
In order to index a differet set of references into a representative database, see [here](custom.md)