Complete Genome Sequence of emm28 Type Streptococcus pyogenes MEW123, a Streptomycin-Resistant Derivative of a Clinical Throat Isolate Suitable for Investigation of Pathogenesis.

We present here the complete genome sequence of Streptococcus pyogenes type emm28 strain MEW123, a streptomycin-resistant derivative of a pediatric throat isolate. The genome length is 1,878,699 bp, with a G+C content of 38.29%. The genome sequence adds value to this representative virulent emm28 strain and will aid in the investigation of streptococcal pathogenesis.

Streptococcus pyogenes is responsible for a great diversity of human disease manifestations (1). Genotyping by DNA sequencing of the variable region of the emm gene distinguishes S. pyogenes isolates; the four most prevalent emm types causing pharyngitis and invasive disease in North America are types 1, 3, 12, and 28 (2–4). S. pyogenes type emm28 has a particular association with female urogenital infections, including vulvovaginitis, endometritis, and puerperal sepsis (5–8). We previously described the emm28 strain MEW123 as a streptomycin-resistant derivative of a pediatric throat isolate that is amenable to genetic manipulation and establishes prolonged carriage in a murine vaginal colonization model (9). We report here the complete MEW123 genome sequence to provide a reference for future studies.
Chromosomal DNA was isolated using the Wizard genomic DNA purification kit (Promega, Madison, WI) and sequenced using the PacBio RS II sequencer (Pacific Biosciences, Menlo Park, CA). Samples were prepared according to the manufacturer's protocols with the P6 polymerase kit and C4 sequencing reagents, except that polymerase binding and binding to magnetic beads were each extended to 1 h. For library construction, DNA was sheared to fragments of ~23,700 bp and isolated using the BluePippin electrophoresis system (Sage Science, Inc., Beverly, MA). Sequences were collected using one single-molecule real-time (SMRT) cell. This generated 70,159 reads averaging ~16,100 bp in length, for a total of 1,131.6 Mb of sequence data (~300- to 500-fold coverage). The sequence was assembled using Celera Assembler version 8.3rc2 (10,11). The resulting single scaffold was indexed, and the FASTQ reads were aligned against it with BWA version 0.7.12, using BWA-MEM. The resulting SAM file was sorted, converted to BAM format, and indexed (.bai) using SAMtools version 1.2 (12,13). Error correction was performed with Pilon version 1.12 (14) and Harvest tools version 1.2, employing parsnp and gingr (15). Genome overlap at the ends was identified with SeqEdit (DNAStar, Madison, WI), and the sequence was trimmed manually to position dnaA as the starting point.
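The final circularization step can be illustrated with a minimal Python sketch. It is not the procedure used for MEW123, where the overlap was identified in SeqEdit and trimmed manually; the function names, the exact-match overlap search, the overlap size limits, and the short marker string standing in for the start of dnaA are all assumptions made for illustration. Real assemblies may carry sequencing errors in the overlap, and the marker gene may lie on the reverse strand, neither of which this sketch handles.

```python
def trim_terminal_overlap(contig: str, min_overlap: int = 4,
                          max_overlap: int = 50_000) -> str:
    """Remove the longest suffix that exactly repeats the contig's prefix.

    A circular chromosome assembled as a linear contig often ends with a
    copy of its own beginning. Exact matching is an assumption of this sketch.
    """
    longest = min(max_overlap, len(contig) // 2)
    for k in range(longest, min_overlap - 1, -1):
        if contig[:k] == contig[-k:]:
            return contig[:-k]
    return contig  # no overlap found; return the contig unchanged


def rotate_to_marker(circular: str, marker: str) -> str:
    """Rotate a circular sequence so that it begins at `marker`.

    `marker` is a hypothetical stand-in for the first bases of dnaA.
    Only the forward strand is searched.
    """
    # Append the first len(marker) - 1 bases so that a marker spanning
    # the junction between the last and first base is still found.
    doubled = circular + circular[: len(marker) - 1]
    start = doubled.find(marker)
    if start == -1:
        raise ValueError("marker not found on the forward strand")
    return circular[start:] + circular[:start]


# Toy example: a 15-bp "genome" whose contig repeats its first 5 bases.
toy_genome = "ATGCCGTTAGGACTA"
contig = toy_genome + toy_genome[:5]
closed = trim_terminal_overlap(contig)
rotated = rotate_to_marker(closed, "GTTAG")
print(len(contig), len(closed), rotated)
```

Rotation does not change the length or base composition of the closed sequence, so the reported genome length and G+C content are unaffected by the choice of starting point.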
Preliminary annotation was performed using Prokka version 1.11, with a reference library generated from emm28 strain MGAS6180 (accession no. NC_007296.1) (16). Upon submission to GenBank, the annotations were repeated using the NCBI Prokaryotic Genome Annotation Pipeline for database consistency.
Nucleotide sequence accession number. This genome sequence has been deposited in GenBank under the accession no. CP014139. The version described in this paper is the first version.

ACKNOWLEDGMENTS
We thank Robert Lyon, Christina McHenry, Katherine Borysko, and the University of Michigan DNA Sequencing Core Facility for their technical expertise.
This work was supported by the University of Michigan Department of Pediatrics and Communicable Diseases and grant NIH K12 HD028820. We acknowledge the use of the S. pyogenes MLST database, which is located at Imperial College London and is funded by the Wellcome Trust.

FUNDING INFORMATION
This work, including the efforts of Michael Edmund Watson, was funded by HHS | NIH | National Institute of Child Health and Human Development (NICHD) (5 K12 HD 028820).
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.