ABSTRACT
The human commensal bacterium Streptococcus salivarius plays a major role in the equilibrium of microbial communities of the digestive tract. Here, we report the first complete genome sequence of a Streptococcus salivarius strain isolated from the small intestine, namely, HSISS4. Its circular chromosome comprises 1,903 coding sequences and 2,100,988 nucleotides.
GENOME ANNOUNCEMENT
Streptococcus salivarius is a Gram-positive bacterium and a member of the salivarius group of streptococci that preferentially colonizes the upper part of the human digestive tract. Remarkably, this commensal bacterium is one of the pioneer colonizers of oral and small intestine mucosal surfaces in newborns (1). It remains predominant in the oropharyngeal (1, 2) and gastrointestinal (3) tracts throughout the human life span and is proposed to contribute to oral and gut health (4, 5). It has been reported that S. salivarius might have specific inhibitory effects on pathogenic streptococci (6–8) or on bacteria involved in periodontal infections (9). In addition, this species can modulate in vivo the inflammatory response induced by periodontal and colon pathogens (10, 11). Based on these beneficial effects, it is marketed as a probiotic (12). However, the impact of indigenous S. salivarius, as well as of nonnative strain supplementation, on the gut microbiota remains poorly investigated.
Streptococcus salivarius HSISS4 was originally isolated from an ileostomy effluent sample from a 79-year-old male (13, 14) but was also suggested to colonize the oral cavity, since closely related isolates were identified in saliva samples (15).
A draft genome of strain HSISS4 was previously assembled (13) using a combination of 454 and Illumina HiSeq sequencing technologies, which yielded a chromosome fragmented into 150 contigs (GenBank accession no. ASKD01000000). Here, we carried out a de novo hybrid assembly combining PacBio sequencing (151,277 reads with an average length of about 3,800 bp), which provided approximately 144-fold sequence coverage of the entire genome, and Illumina HiSeq2500 sequencing (approximately 1,250,000 high-quality paired-end 125-bp reads), which enabled gap closure and correction of PacBio sequencing errors.
We performed a systematic reannotation of the whole genome, linking the previous gene annotation to our new gene nomenclature. The protein-coding sequences (CDSs), tRNAs, and rRNAs were predicted using the Rapid Annotations using Subsystems Technology (RAST) server (16). Moreover, we performed a comparative proteome analysis with Streptococcus thermophilus strain LMG18311 (17) in order to enrich our genomic database. In parallel, the metabolic pathways were reconstructed using the Kyoto Encyclopedia of Genes and Genomes (KEGG) (18).
The circular chromosome of S. salivarius HSISS4 comprises 2,100,988 bp with an overall G+C content of 40.2%. It encodes 1,903 putative CDSs, among which 1,533 (81.6%) encode proteins with a predicted biological function. The genome also contains 33 repeated elements, 68 tRNA genes, and 18 rRNA genes organized in 6 operons.
Comparative analyses with S. thermophilus LMG18311 and the oral isolate S. salivarius JIM8777 revealed that their genomes encode orthologs of 1,378 and 1,695 HSISS4 CDSs, respectively (cutoff: 80% protein identity and 80% length coverage). This may indicate that, within the salivarius cluster, species and strains have specifically evolved to better cope with their ecological niche.
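The cutoff stated above can be illustrated with a minimal sketch. The text gives only the two thresholds (80% protein identity and 80% length coverage); the search tool, the definition of length coverage (taken here as alignment length over query length), the `Hit` record, and the toy identifiers below are all assumptions made for illustration and do not describe the pipeline actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One pairwise protein alignment (hypothetical record layout)."""
    query_id: str            # HSISS4 CDS identifier (toy value)
    subject_id: str          # CDS identifier in the compared genome (toy value)
    percent_identity: float  # percent identical residues in the alignment
    alignment_length: int    # aligned residues
    query_length: int        # full length of the query protein


def passes_cutoff(hit, min_identity=80.0, min_coverage=80.0):
    """Return True if the hit meets both thresholds.

    Assumption: coverage = alignment length / query length, in percent.
    """
    coverage = 100.0 * hit.alignment_length / hit.query_length
    return hit.percent_identity >= min_identity and coverage >= min_coverage


def count_queries_with_ortholog(hits, min_identity=80.0, min_coverage=80.0):
    """Count distinct query CDSs having at least one hit above the cutoff."""
    return len({h.query_id for h in hits
                if passes_cutoff(h, min_identity, min_coverage)})


def shared_fraction(n_shared, n_total):
    """Percentage of CDSs with an ortholog in the compared genome."""
    return 100.0 * n_shared / n_total


# Toy data: only q1 has hits that satisfy both thresholds.
toy_hits = [
    Hit("q1", "s1", 92.0, 290, 300),  # passes (coverage about 96.7%)
    Hit("q1", "s9", 81.0, 250, 300),  # passes, same query counted once
    Hit("q2", "s2", 85.0, 150, 300),  # fails coverage (50%)
    Hit("q3", "s3", 75.0, 300, 300),  # fails identity
]
n_toy = count_queries_with_ortholog(toy_hits)

# Counts reported in the text, expressed as a share of the 1,903 CDSs.
pct_thermophilus = shared_fraction(1378, 1903)
pct_jim8777 = shared_fraction(1695, 1903)
```

With the counts reported above, `shared_fraction` gives roughly 72% of HSISS4 CDSs for S. thermophilus LMG18311 and roughly 89% for S. salivarius JIM8777.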
Nucleotide sequence accession number. The complete genome sequence of Streptococcus salivarius strain HSISS4 has been deposited at GenBank under accession number CP013216.
ACKNOWLEDGMENTS
We thank Damien François for his assistance in bioinformatics. We declare no conflicts of interest. This research was supported by the BELSPO agency in the frame of the 7th IUAP consortium.
FOOTNOTES
- Received 27 November 2015.
- Accepted 9 December 2015.
- Published 4 February 2016.
- Copyright © 2016 Mignolet et al.
This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported license.