Complete Genome Sequence of a Virulent Streptococcus agalactiae Strain, 138P, Isolated from Diseased Nile Tilapia

Streptococcus agalactiae strain 138P was isolated from the kidney of diseased Nile tilapia in Idaho during a 2007 streptococcal disease outbreak. The complete genome of S. agalactiae 138P is 1,838,701 bp in length. The availability of this genome will allow comparative genomic analysis to identify genes for antigen discovery and vaccine development.

Streptococcus agalactiae is a Gram-positive pathogen causing human meningitis (1–3), neonatal sepsis (4), and pneumonia (5). In cattle, S. agalactiae is a major cause of bovine mastitis, a dominant health disorder affecting milk production (6, 7). In fish, S. agalactiae causes meningoencephalitis (8). In 2001, S. agalactiae was responsible for a massive fish kill in Kuwait Bay, causing a loss of 2,500 metric tons of wild mullet (Liza klunzingeri) (9). Large-scale streptococcal outbreaks occurred frequently in tilapia farms in China from 2009 to 2011, involving >95% of farms, with a cumulative mortality of 30 to 80% (10–12). Reports from China indicate that the prevalent Streptococcus species causing disease in tilapia has shifted from Streptococcus iniae to S. agalactiae (10–12). In 2007, a streptococcal disease outbreak occurred in cultured Nile tilapia in Idaho. Strain 138P is a representative isolate collected from that outbreak.
The genome of S. agalactiae 138P was sequenced using the Illumina HiSeq 1500 platform. BioNumerics (Applied Maths) was used for de novo assembly of a total of 30,870,348 sequence reads with an average length of 100 bp. The genomic contigs were used to search for sequence homology with genome sequences deposited in the GenBank nucleotide database using BLASTn. The first 6,000 bp of the genome sequence of S. agalactiae 138P shared 100% identity with the S. agalactiae strain 2-22 genome (accession no. FO393392). Therefore, strain 2-22 was used as the reference genome for this study. The whole genome of S. agalactiae 138P is 1,838,701 bp with a G+C content of 35.5%. The RAST server (13) predicted 1,891 coding sequences, including 305 involved in carbohydrate catabolism, 160 in protein metabolism, 141 in synthesis of amino acids and derivatives, 135 in cell wall and capsule synthesis, 103 in RNA metabolism, 101 in DNA metabolism, 95 in cofactors, 89 in nucleoside and nucleotide synthesis, 72 in fatty acid synthesis, 65 in membrane transport, 51 in virulence, 51 in stress response, 35 in phosphorus metabolism, 30 in cell division and cell cycle, 25 in regulation and cell signaling, 16 in iron acquisition and metabolism, 15 in potassium metabolism, and 3 in motility and chemotaxis. RAST also predicted 5 copies of 5S rRNA, 6 copies of 16S rRNA, and 6 copies of 23S rRNA.
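The read count, read length, and genome size reported above imply a nominal sequencing depth, and the G+C content is a simple base-composition ratio. The following Python sketch illustrates both calculations. It assumes that every read contributes exactly 100 bp (the text gives only an average) and that all reads map to the genome; the function names and the toy sequence are hypothetical and not part of the original analysis.

```python
def nominal_coverage(read_count: int, mean_read_len: float, genome_len: int) -> float:
    """Nominal depth = total sequenced bases / genome length."""
    return read_count * mean_read_len / genome_len


def gc_content(sequence: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence (case-insensitive)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


# Values taken from the text.
READS = 30_870_348
MEAN_READ_LEN = 100       # bp, reported average (assumed uniform here)
GENOME_LEN = 1_838_701    # bp

coverage = nominal_coverage(READS, MEAN_READ_LEN, GENOME_LEN)
print(f"Nominal coverage: {coverage:.0f}x")

# Toy sequence for illustration only; not data from strain 138P.
toy_gc = gc_content("GATTACA")
print(f"Toy G+C content: {toy_gc:.1%}")
```

Under these assumptions the nominal depth is on the order of 1,700-fold; the effective depth after quality filtering and assembly may be lower.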
Compared with the genome of S. agalactiae 2-22, S. agalactiae 138P had the following three additional features: (i) a 144-bp sequence between nucleotides (nt) 524,842 and 524,985 coding for a hypothetical protein, (ii) a 198-bp sequence between nt 1,006,860 and 1,007,057 coding for a transcriptional regulator belonging to the Cro/CI family, and (iii) a 393-bp sequence between nt 1,009,336 and 1,008,944 coding for a hypothetical protein.
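The reported feature lengths can be checked against their coordinates. The sketch below assumes the coordinates are 1-based and inclusive, and that the descending coordinates of feature (iii) simply reflect the order in which they were reported; the function name and dictionary labels are hypothetical.

```python
def feature_length(start: int, end: int) -> int:
    """Inclusive length of a feature from 1-based coordinates given in either order."""
    return abs(end - start) + 1


# (start, end, reported length in bp), as given in the text.
features = {
    "(i) hypothetical protein": (524_842, 524_985, 144),
    "(ii) Cro/CI family regulator": (1_006_860, 1_007_057, 198),
    "(iii) hypothetical protein": (1_009_336, 1_008_944, 393),
}

results = {}
for name, (start, end, reported) in features.items():
    length = feature_length(start, end)
    results[name] = (length, length == reported)
    status = "matches" if length == reported else "differs from"
    print(f"{name}: {length} bp ({status} reported {reported} bp)")
```

All three computed lengths agree with the reported values under the inclusive-coordinate assumption.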
Nucleotide sequence accession number. The complete genome sequence of S. agalactiae 138P was deposited in GenBank under accession no. CP007482.
The use of trade, firm, or corporate names in this publication is for the information and convenience of the reader. Such use does not constitute an official endorsement or approval by the U.S. Department of Agriculture or the Agricultural Research Service of any product or service to the exclusion of others that may be suitable.
We thank Victor Panangala (USDA-ARS) and Mark R. Liles (Auburn University) for critical reviews of the manuscript. We thank Lee Zhang (Auburn University Genomics and Sequencing Laboratory) for his excellent sequencing work and Beth Peterman (USDA-ARS) for her technical support.