Genomic Sequencing of Orientia tsutsugamushi Strain Karp, an Assembly Comparable to the Genome Size of the Strain Ikeda

Orientia tsutsugamushi, an intracellular bacterium, belongs to the family Rickettsiaceae. This study presents the draft genome sequence of strain Karp, which has an assembled size of 2.0 Mb. This nearly finished draft genome sequence was annotated with the RAST server, and its contents were compared to those of other strains.

Orientia tsutsugamushi is an intracellular bacterium causing scrub typhus. Within the endemic triangle, the disease affects more than one billion people (1). Genome size varies among strains; only two complete genomes, those of strains Boryong and Ikeda, have been published (2,3). The genome of Orientia tsutsugamushi contains nearly 40% identical repeats, which complicates sequence assembly. In this study, we combined a clone-based method and high-throughput sequencing to assemble the draft genome of strain Karp, which was originally isolated from Papua New Guinea (4,5).
Clone-based whole-genome shotgun sequencing generated reads by sequencing bacterial artificial chromosome (BAC) clones with the Sanger method. The assembled reads formed 116 contigs before high-throughput whole-genome sequencing (HTGS) was available; contig lengths ranged from 1,153 to 151,627 bp. The raw HTGS reads were then used to join the contigs from the Sanger read assembly, reducing the original 116 contigs to 99. HTGS was conducted on the Illumina MiSeq platform in 2 × 250 bp paired-end mode. The trimmed raw reads were assembled de novo into an initial 1,011 contigs using CLC Genomics Workbench 9.0, with an average coverage of 73×. The HTGS contigs were first mapped to three references, (i) the contig set from the Sanger assembly, (ii) strain Boryong (accession no. NC_009488.1), and (iii) strain Ikeda (accession no. NC_010793.1), and subjected to a BLAST search against the nt/nr database to eliminate contigs derived from background DNA contamination. The raw reads retrieved from the filtered contigs were assembled de novo and mapped again to the three references, plus a published Karp draft genome (GenBank accession number NZ_LANM00000000.1), for a second round of background cleaning. All retained HTGS contigs of strain Karp could be mapped to at least one of the references, and five HTGS contigs (combined length, 11,304 bp) were added to the 99 Sanger contigs (2,011,605 bp) to form the draft genome of 2,022,909 bp (30.41% G+C content). The 104 (99 + 5) contigs of Karp were aligned to the Ikeda complete genome using the CONTIGuator software (6).
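The summary figures quoted above (contig count, combined length, G+C content) are the kind of statistics that can be recomputed from any contig FASTA file. The following is a minimal Python sketch of such a calculation, not the pipeline used in this study; the helper names `parse_fasta` and `assembly_stats` and the toy contigs are illustrative assumptions, and only the two base-pair totals are taken from the text.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {contig_name: sequence}."""
    contigs: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the contig name.
            name = line[1:].split()[0]
            contigs[name] = ""
        elif name is not None:
            contigs[name] += line.upper()
    return contigs


def assembly_stats(contigs: Dict[str, str]) -> Dict[str, float]:
    """Return contig count, total/min/max length, and G+C percentage."""
    lengths = [len(seq) for seq in contigs.values()]
    total = sum(lengths)
    gc = sum(seq.count("G") + seq.count("C") for seq in contigs.values())
    return {
        "n_contigs": len(lengths),
        "total_bp": total,
        "min_bp": min(lengths) if lengths else 0,
        "max_bp": max(lengths) if lengths else 0,
        "gc_percent": 100.0 * gc / total if total else 0.0,
    }


# Toy, made-up contigs (NOT real Karp sequence), used only to exercise the code.
toy = parse_fasta([">c1 toy", "ATGC", "GGCC", ">c2", "ATAT"])
stats = assembly_stats(toy)

# Totals reported in the text: 99 Sanger contigs plus 5 HTGS contigs.
SANGER_BP = 2_011_605
HTGS_BP = 11_304
draft_bp = SANGER_BP + HTGS_BP  # combined draft genome length in bp
```

Applied to the real 104-contig FASTA file, `assembly_stats` would be expected to reproduce the reported total length and G+C content, provided the file contains only unambiguous bases; ambiguous bases (for example N) would need separate handling.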
The Karp genome was annotated using RAST server version 2.0 (7), which identified 2,089 coding sequences (CDSs), 1,090 transcribed from the positive strand and 999 transcribed from the negative strand; 2,052 of these were categorized into 185 subsystems, and 37 were RNAs. A SEED Viewer sequence comparison (8) showed that 1,604 genes of strain Boryong and 1,951 genes of strain Ikeda could be found in the Karp draft genome. HTGS was also used to sequence two other O. tsutsugamushi strains, AFSC4 and AFSC7, without the information on repetitive sequences that clone-based sequencing provided for Karp. The SEED Viewer functional comparison (8) revealed that two genes present in both AFSC4 and AFSC7 were absent in Karp, even though Karp possesses more CDSs in its genome (data not shown). Previous studies showed that strain Karp was sensitive to antibiotics, whereas AFSC4 was insensitive (9), and internal observations indicate that AFSC7 behaves similarly to AFSC4. Complete draft genomes of AFSC4 and AFSC7, allowing detailed comparison with the Karp genome, may provide possible targets for investigating the mechanism(s) of microbial drug resistance.
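The comparison described above, finding genes present in both AFSC4 and AFSC7 but absent in Karp, amounts to a set intersection followed by a set difference. The sketch below illustrates only that logic in Python; it is not the SEED Viewer algorithm, and the function name `shared_but_absent` and the placeholder labels `role_A` through `role_E` are hypothetical, not real annotations from any of the three strains.

```python
from typing import Dict, Set


def shared_but_absent(gene_sets: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return items found in every non-target set but missing from the target."""
    others = [genes for name, genes in gene_sets.items() if name != target]
    if not others:
        return set()
    shared = set.intersection(*others)  # present in all other strains
    return shared - gene_sets[target]   # ...and absent from the target strain


# Placeholder functional roles (hypothetical labels, not real annotation data).
annotations = {
    "Karp": {"role_A", "role_B", "role_E"},
    "AFSC4": {"role_A", "role_C", "role_D"},
    "AFSC7": {"role_A", "role_B", "role_C", "role_D"},
}

missing_in_karp = shared_but_absent(annotations, "Karp")
print(sorted(missing_in_karp))
```

In practice the input sets would come from exported annotation tables, and matching on functional role names assumes that the three genomes were annotated with the same pipeline so that identical functions receive identical labels.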
Accession number(s). The first version of the draft genome sequence of Orientia tsutsugamushi strain Karp has been deposited in GenBank under accession no. LYMA00000000.

ACKNOWLEDGMENTS
We thank Gregory Dasch for purifying the Karp, AFSC4, and AFSC7 strains of Orientia tsutsugamushi from L929 cells and Zhiwen Zhang and Tatyana Belinskaya for subsequent DNA extraction for WGS.
This work was supported in part by Work Unit Number (WUN) 6000.RAD1.J.A0310 of the Naval Medical Research Center.
The opinions and assertions contained herein are the private ones of the authors and are not to be construed as official or as reflecting the views of the Department of the Navy, the Naval service at large, the Department of Defense, or the U.S. Government.