Draft Whole-Genome Sequences of Periodontal Pathobionts Porphyromonas gingivalis, Prevotella intermedia, and Tannerella forsythia Contain Phase-Variable Restriction-Modification Systems

ABSTRACT Periodontal disease comprises mild to severe inflammatory host responses to oral bacteria that can cause destruction of the tooth-supporting tissue. We report genome sequences for 18 clinical isolates of Porphyromonas gingivalis, Prevotella intermedia, and Tannerella forsythia, Gram-negative obligate anaerobes that play a role in the periodontal disease process.

Periodontal disease describes a range of mild to severe inflammatory oral bacterial infections that can ultimately cause destruction of the tooth-supporting tissues. Periodontitis affects 10 to 15% of the adult population worldwide (1). The host inflammation seen in periodontitis is provoked by oral bacteria, and a number of species, including Porphyromonas gingivalis, Prevotella intermedia, and Tannerella forsythia, have been shown to be disease associated (2); P. gingivalis, in particular, is regarded as a keystone pathobiont that subverts host defenses (3). Here, we describe the draft whole-genome sequences (WGS) of 18 anaerobic bacterial strains isolated from patients; the strains were selected from the culture collection of author W. Wade and had been obtained during previous studies. In those studies, subgingival plaque samples were collected with a curette from periodontal pockets >8 mm in depth in subjects with advanced periodontitis. Samples were cultured on fastidious anaerobe agar (FAA; Lab M) supplemented with 5% horse blood and incubated anaerobically for up to 7 days. P. intermedia, T. forsythia, and P. gingivalis strains were identified by 16S rRNA analysis. Genomic DNA isolated from all three species (Genomic DNA clean and concentrate kit, Zymo Research) was used to prepare libraries (Nextera DNA library preparation kit), which were sequenced on an Illumina MiSeq. Sequence reads were quality controlled using Trimmomatic (4), and WGS were assembled using SPAdes v3.6.2 (5). Genome size and assembly quality were assessed using QUAST v4.3 (6) (see Table 1).
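As a rough illustration of the kind of summary statistics an assembly assessment reports (contig count, total length, N50, and G+C content), the following Python sketch computes them from a FASTA-formatted string. It is not the QUAST implementation and was not part of this study's pipeline; the function names and the toy sequences are assumptions made purely for illustration.

```python
from typing import Dict, List


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into {contig_name: sequence}."""
    records: Dict[str, List[str]] = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the contig name;
            # fall back to a generated name if the header is empty.
            header = line[1:].split()
            name = header[0] if header else f"contig_{len(records) + 1}"
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


def n50(lengths: List[int]) -> int:
    """Length of the contig at which the running total of contig lengths,
    taken from longest to shortest, first reaches half the assembly size."""
    if not lengths:
        raise ValueError("no contigs supplied")
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    raise AssertionError("unreachable for non-empty input")


def gc_percent(sequences: List[str]) -> float:
    """G+C content as a percentage of all bases in the sequences."""
    total = sum(len(seq) for seq in sequences)
    if total == 0:
        raise ValueError("no bases supplied")
    gc = sum(seq.count("G") + seq.count("C") for seq in sequences)
    return 100.0 * gc / total


def assembly_summary(contigs: Dict[str, str]) -> Dict[str, float]:
    """Collect the basic statistics for one assembly."""
    lengths = [len(seq) for seq in contigs.values()]
    return {
        "num_contigs": len(lengths),
        "total_length": sum(lengths),
        "n50": n50(lengths),
        "gc_percent": round(gc_percent(list(contigs.values())), 2),
    }


# Toy input (made-up sequences, not data from this study).
demo_fasta = ">c1\nGGCC\nAATT\n>c2 example\nGGGG\n"
demo_summary = assembly_summary(parse_fasta(demo_fasta))
```

For a real assembly, the contents of the contigs FASTA file would be read into a string and passed to `parse_fasta` in place of `demo_fasta`.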
Multilocus sequence typing (MLST) of the P. gingivalis WGS using pubMLST (pubmlst.org) identified two strains as sequence type 30 (ST30); however, six strains presented with novel STs, and the rest had incomplete MLST profiles (see Table 1). A core genome analysis of the P. gingivalis WGS, using the Harvest 1.0 program suite (http://harvest.readthedocs.io) (7), indicated that they all nest within the existing P. gingivalis genomes available in NCBI GenBank. WGS of all species were analyzed against the Comprehensive Antibiotic Resistance Database (https://card.mcmaster.ca/analyze) (8) to identify known and putative antimicrobial resistance genes. Two "perfect hits" were obtained, both in P. intermedia strain 885, against the cfxA2 gene; this broad-spectrum β-lactamase has been reported in several Prevotella spp. (9). Analysis of the flanking sequence revealed the presence of a Tn4555-like sequence, from Bacteroides fragilis, suggesting horizontal acquisition (10). PHASTER (PHAge Search Tool Enhanced Release) (11) analysis of all WGS found just a single intact bacteriophage (33.8 kbp in length, with a G+C content of 48.78%, and encoding 36 proteins) in P. gingivalis WW2952.
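The three MLST outcomes described above (a known ST, a novel ST, or an incomplete profile) follow from how an allelic profile is matched against a table of known profiles. The Python sketch below illustrates that logic only. The locus names, allele numbers, and the profile assigned to ST30 are placeholders invented for the example; they are not the PubMLST P. gingivalis scheme, and the code does not query PubMLST.

```python
from typing import Dict, Optional, Tuple

# Placeholder locus names (hypothetical; a real scheme defines its own loci).
LOCI: Tuple[str, ...] = (
    "locus1", "locus2", "locus3", "locus4", "locus5", "locus6", "locus7",
)

# Hypothetical table mapping an allelic profile to a sequence type number.
KNOWN_PROFILES: Dict[Tuple[int, ...], int] = {
    (1, 2, 3, 4, 5, 6, 7): 30,  # made-up profile standing in for ST30
}


def classify_profile(
    alleles: Dict[str, Optional[int]],
    known: Dict[Tuple[int, ...], int] = KNOWN_PROFILES,
) -> str:
    """Return 'ST<n>' for a known profile, 'novel' for a complete but
    unlisted profile, and 'incomplete' if any locus has no allele call."""
    if any(alleles.get(locus) is None for locus in LOCI):
        return "incomplete"
    profile = tuple(alleles[locus] for locus in LOCI)
    st = known.get(profile)
    return f"ST{st}" if st is not None else "novel"


# Made-up example profiles, one per outcome.
known_strain = dict(zip(LOCI, (1, 2, 3, 4, 5, 6, 7)))
novel_strain = dict(zip(LOCI, (1, 2, 3, 4, 5, 6, 8)))
partial_strain = dict(zip(LOCI, (1, 2, None, 4, 5, 6, 7)))

results = [classify_profile(p) for p in (known_strain, novel_strain, partial_strain)]
```

In a draft assembly, a locus can lack an allele call when the gene is split across contigs or missing from the assembly, which is one way an incomplete profile arises.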
Phase-variable type I restriction-modification systems (pv-RMS) were found in all of the P. intermedia genomes and in five of the P. gingivalis genomes (Table 1); similar pv-RMS were subsequently identified in P. intermedia and P. gingivalis genomes already in the GenBank database. A pv-RMS found in Streptococcus pneumoniae has recently been shown to facilitate the epigenetic control of genes involved in virulence (12,13). Structural similarities between the S. pneumoniae system and the pv-RMS identified in P. intermedia and P. gingivalis raise the possibility that epigenetic regulatory mechanisms may also play a role in periodontal disease.
Accession number(s). These whole-genome shotgun sequences have been deposited in GenBank, and the versions described in this paper are the first versions (see Table 1 for full details).
Illumina sequencing was performed by the NUCLEUS Genomics Core Facility and data analysis used the Spectre2 and Alice2 High Performance Computing Facility at the University of Leicester.
This work was in part funded by a grant from the BBSRC (BB/N002903/1) to M.R.O.