Publications that introduced the EzBioCloud 16S database
Our database was introduced in the following three publications (citation counts are as of Mar. 17, 2018):
- Yoon, S. H., Ha, S. M., Kwon, S., Lim, J., Kim, Y., Seo, H. & Chun, J. (2017). Introducing EzBioCloud: a taxonomically united database of 16S rRNA gene sequences and whole-genome assemblies. Int J Syst Evol Microbiol 67, 1613-1617 [Learn more].
  - It is now the No. 1 cited paper published in 2017 in the “Microbiology” and related categories (according to the Web of Science; cited 298 times, out of 57,143 publications).
  - Cited 496 times according to Google Scholar. Check out the publications citing this article.
- Kim, O. S., Cho, Y. J., Lee, K., Yoon, S. H., Kim, M., Na, H., Park, S. C., Jeon, Y. S., Lee, J. H. & other authors (2012). Introducing EzTaxon-e: a prokaryotic 16S rRNA gene sequence database with phylotypes that represent uncultured species. Int J Syst Evol Microbiol 62, 716-721 [Learn more].
  - It is the No. 1 cited paper published in 2012 in the “Microbiology” and related categories (according to the Web of Science; cited 3,045 times, out of 25,438 publications).
  - Cited 3,876 times according to Google Scholar. Check out the publications citing this article.
- Chun, J., Lee, J. H., Jung, Y., Kim, M., Kim, S., Kim, B. K. & Lim, Y. W. (2007). EzTaxon: a web-based tool for the identification of prokaryotes based on 16S ribosomal RNA gene sequences. Int J Syst Evol Microbiol 57, 2259-2261 [Learn more].
The content of the EzBioCloud 16S database
The EzBioCloud 16S database contains the following information:
- Standardized 16S rRNA gene sequences representing reference taxa
  - All sequences are extracted between the two most popular PCR primers (27F and 1492R), so that similarity calculations are carried out consistently.
  - In principle, a single 16S sequence is assigned to each reference taxon.
- The reference taxa comprise:
  - Currently validly published taxonomic names
  - Some invalid names (those likely to represent distinct species)
  - Candidatus taxa
  - Unnamed phylotypes that do not belong to any of the above. These include 16S amplicons and genome sequences.
- A complete taxonomic hierarchy is given for every 16S sequence (from species to phylum). The hierarchy is based on a maximum-likelihood phylogenetic tree of 16S sequences, with consideration of the currently accepted classification.
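To illustrate what extracting a sequence between the 27F and 1492R primer sites can look like, here is a minimal Python sketch. It is not the EzBioCloud pipeline: the primer sequences are the commonly cited versions (an assumption, since this page does not list them), the function names are hypothetical, and the sketch assumes exact primer matches and excludes the primer sites from the result.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYMKSWN", "TGCAYRKMSWN")

# Commonly cited primer sequences (assumed; not taken from this page).
PRIMER_27F = "AGAGTTTGATCMTGGCTCAG"
PRIMER_1492R = "TACGGYTACCTTGTTACGACTT"


def reverse_complement(seq):
    """Reverse-complement a sequence that may contain IUPAC codes."""
    return seq.translate(COMPLEMENT)[::-1]


def to_regex(primer):
    """Turn a primer with IUPAC ambiguity codes into a regex pattern."""
    return "".join(IUPAC[base] for base in primer)


def trim_between_primers(seq, fwd=PRIMER_27F, rev=PRIMER_1492R):
    """Return the region between the two primer sites, or None if a site is missing."""
    seq = seq.upper()
    fwd_hit = re.search(to_regex(fwd), seq)
    if fwd_hit is None:
        return None
    # The reverse primer binds the opposite strand, so search for its
    # reverse complement downstream of the forward primer site.
    downstream = seq[fwd_hit.end():]
    rev_hit = re.search(to_regex(reverse_complement(rev)), downstream)
    if rev_hit is None:
        return None
    return downstream[:rev_hit.start()]
```

Trimming every reference sequence to the same region means pairwise similarities are computed over comparable positions, which is the consistency the list above refers to. A real pipeline would also need to tolerate primer mismatches and truncated sequences.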
Source of 16S data
Because we have tried to secure the best-quality 16S sequences, the source of each 16S sequence varies and is one of the following:
- NCBI 16S amplicon sequences of validly published taxa: e.g., AY692362 for Adiaceo aphidicola
- NCBI 16S amplicon sequences of phylotypes: e.g., AJ290038 for AJ290038_s (a phylotype corresponding to a species)
- 16S sequence extracted from NCBI genome assembly: e.g., CP000238 for Baumannia cicadellinicola.
- 16S sequence extracted from JGI genome assembly (this genome data may not be available in NCBI): e.g., jgi.1096475 for phylotype jgi.1096475_s in the genus Geodermatophilus.
- 16S sequence compiled from Pacific Biosciences full-length sequencing of microbiome samples. These are high-quality 16S sequences generated with PacBio’s circular consensus sequencing (CCS) technology: e.g., PAC000364 for phylotype PAC000364_s.
- 16S sequence extracted from internally assembled genome data: e.g., CLG_48533 for Arthrobacter oryzae.
Consequently, not all data are available in the NCBI database. However, all data are freely accessible through www.ezbiocloud.net.
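The accession examples in the list above suggest that the source can often be guessed from the accession prefix. The Python sketch below is a heuristic built only from those examples. The prefix rules and the function names are assumptions, not a documented EzBioCloud convention, and the two NCBI cases (amplicon versus genome assembly) cannot be told apart by format alone.

```python
def guess_source(accession):
    """Guess the data source from an accession prefix (heuristic, see assumptions above)."""
    acc = accession.strip()
    if acc.lower().startswith("jgi."):
        return "JGI genome assembly"
    if acc.startswith("CLG_"):
        return "internally assembled genome"
    if acc.startswith("PAC") and acc[3:].isdigit():
        return "PacBio full-length sequencing"
    # Anything else is treated as an NCBI accession in this sketch.
    return "NCBI (amplicon or genome assembly)"


def species_phylotype_name(accession):
    """Build a species-level phylotype name as in the examples (accession + '_s')."""
    return accession.strip() + "_s"


for acc in ["AY692362", "CP000238", "jgi.1096475", "PAC000364", "CLG_48533"]:
    print(acc, "->", guess_source(acc))
```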
Why 16S sequences from genome assemblies are used in EzBioCloud instead of PCR amplicon sequences
- Genome assemblies are usually of better quality than PCR amplicon sequences. Typical NGS sequencing yields a depth of coverage of 50X or higher.
- When we add a genome-derived 16S sequence to the EzBioCloud database, we always check its quality by manual alignment using secondary-structure information. In our experience, using genome sequences improves the quality of 16S databases for reference purposes.
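For readers unfamiliar with the "50X" figure: average depth of coverage is commonly estimated as the total number of sequenced bases divided by the genome size. The Python sketch below shows that arithmetic. The read count, read length and genome size are made-up illustrative numbers, not EzBioCloud data, and the function name is hypothetical.

```python
def mean_depth(n_reads, read_length, genome_size):
    """Estimate average depth of coverage as total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size


# Illustrative values only: 1,000,000 reads of 250 bp over a 5 Mb genome.
depth = mean_depth(n_reads=1_000_000, read_length=250, genome_size=5_000_000)
print(f"{depth:.0f}X")  # prints "50X"
```

At that depth each position is read about 50 times on average, so random sequencing errors at a given base are much more likely to be outvoted in the consensus than in a single amplicon read.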
The EzBioCloud team / Last edited on Mar 17, 2018