Complete Genome Sequences of Five Zika Virus Isolates.

Zika virus is an emerging human pathogen of great concern due to putative links to microcephaly and Guillain-Barre syndrome. Here, we report the complete genomes, including the 5′ and 3′ untranslated regions, of five Zika virus isolates, one from the Asian lineage and four from the African lineage.

Zika virus (ZIKV) is an emerging human pathogen of great concern due to putative links to severe birth defects (microcephaly [1]) and a rare autoimmune syndrome (Guillain-Barre [2]). The virus was first detected in a sentinel rhesus macaque in 1947 while monitoring for enzootic yellow fever virus circulation in the Zika Forest, Uganda (3). Since then, ZIKV has been detected throughout much of Sub-Saharan Africa and has spread across the Pacific Ocean to Central and South America via Southeast Asia (4,5).
ZIKV is an arbovirus believed to be vectored mainly by mosquitoes in the genus Aedes. ZIKV belongs to the Spondweni serocomplex within the genus Flavivirus and the family Flaviviridae. Viruses in the genus Flavivirus have single-stranded positive-sense RNA genomes of ~11 kb in size (6). These genomes encode a single polyprotein flanked by 5′ and 3′ untranslated regions (UTRs).
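The genome layout described above (one polyprotein open reading frame flanked by 5′ and 3′ UTRs) can be illustrated with a minimal sketch. It assumes that the polyprotein is the longest ATG-to-stop open reading frame on the forward strand, which is a simplifying heuristic and not the annotation method used for the genomes reported here. The function names `longest_orf` and `split_genome` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(genome):
    """Return (start, end) of the longest ATG-to-stop ORF on the forward strand.

    Coordinates are 0-based; `end` is exclusive and includes the stop codon.
    Returns (0, 0) if no complete ORF is found.
    """
    seq = genome.upper().replace("U", "T")  # accept RNA or cDNA notation
    best = (0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG since the previous stop in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    return best


def split_genome(genome):
    """Report 5' UTR, ORF, and 3' UTR lengths under the longest-ORF assumption."""
    start, end = longest_orf(genome)
    return {
        "utr5_len": start,
        "orf_len": end - start,
        "utr3_len": len(genome) - end,
    }


# Toy sequence for illustration only; this is not a ZIKV sequence.
toy = "AGTTGTT" + "ATG" + "GCC" * 4 + "TAA" + "GGCT"
print(split_genome(toy))
```

Applied to a complete flavivirus genome, the three lengths should sum to the genome length, which gives a quick consistency check on an annotation.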
Here, we report five complete ZIKV genomes, including the full polyprotein and the 5′ and 3′ UTRs. Four of these viruses were isolated in Africa (Senegal and Uganda) and belong to the African lineage (4). One virus (FSS13025) was isolated in Cambodia and belongs to the Asian lineage (4). All five isolates were passaged multiple times in cell culture (Table 1). Several distinctly passaged versions of strain MR-766 have been previously published, including two complete genomes (GenBank: AY632535, LC002520). A coding-complete version (i.e., missing the 5′ and 3′ UTRs [7]) of the genome was published for an earlier passage of FSS13025 (GenBank: JN860885).
Isolates were sequenced at the U.S. Army Medical Research Institute of Infectious Diseases and the University of Texas Medical Branch using Illumina NextSeq/MiSeq. RNA was extracted using the Direct-zol RNA extraction kit (Zymo), converted to cDNA using SuperScript III (Invitrogen), and amplified using sequence-independent single-primer amplification (8) combined with primers for rapid amplification of cDNA ends (9). Adaptors and primers were clipped using Cutadapt version 1.21 (10), and low-quality reads/bases were filtered using Prinseq-lite version 0.20.4 (11). Reads were aligned to a reference genome (chosen separately for each isolate based on nucleotide-level similarity between preliminary de novo contigs and publicly available ZIKV genomes) using Bowtie2 (12), duplicates were removed with Picard (http://broadinstitute.github.io/picard), and a new consensus was generated using a combination of SAMtools version 0.1.18 (13) and custom scripts. When necessary, the 5′ and 3′ UTRs were then built out beyond the reference genome using custom scripts that iterate the reference assembly process with the addition of ambiguous characters (N's) onto the 5′ and 3′ ends of the reference. The assembled, complete genomes (7) ranged in size from 10,795 to 10,807 nucleotides; the 5′ UTRs were 106 to 107 nucleotides long, and the 3′ UTRs were 428 to 429 nucleotides long. The MR-766 stock we sequenced contained a 12-nucleotide in-frame deletion in the polyprotein, which it shares with AY632535 but which is not present in LC002520. However, our sequence does not share the five additional single-nucleotide insertions/deletions present in AY632535. All sequenced isolates contained the conserved, complementary terminal dinucleotide sequences that are characteristic of the genus Flavivirus (5′-AG...CU-3′ [6]).
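The custom scripts used for UTR extension are not described in detail, so the following is only a rough sketch of the general idea: pad the reference with N's before each round of reference-based assembly, trim any N's that remain unresolved in the resulting consensus, and then check the finished genome for the conserved terminal dinucleotides. The function names and the `n_pad` default are hypothetical, and the alignment and consensus-calling steps themselves (Bowtie2, SAMtools) are not reproduced here.

```python
def pad_reference(reference, n_pad=50):
    """Add n_pad ambiguous bases (N) to both ends of a reference sequence.

    Reads that overhang the original reference can then align into the
    padded regions, so a new consensus may extend past the old termini.
    """
    return "N" * n_pad + reference + "N" * n_pad


def trim_terminal_ns(consensus):
    """Remove unresolved N's left at either end of a consensus sequence."""
    return consensus.strip("Nn")


def has_conserved_termini(genome):
    """Check for the flavivirus terminal dinucleotides 5'-AG...CU-3'.

    Input may be written as RNA (U) or as cDNA (T).
    """
    rna = genome.upper().replace("T", "U")
    return rna.startswith("AG") and rna.endswith("CU")


# Toy sequences for illustration only; these are not ZIKV sequences.
padded = pad_reference("AGTTGTTCT", n_pad=3)
print(padded)
print(has_conserved_termini(trim_terminal_ns(padded)))
```

In an iterative scheme of this kind, the padding and reassembly would be repeated until a round adds no new bases to either end.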
Nucleotide sequence accession numbers. The accession numbers of the five genomes in public databases are listed in Table 1.

ACKNOWLEDGMENTS
We thank Bradley Pfeffer for his help submitting the genomes to GenBank. Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the U.S. Army. The views expressed in this article are those of the authors and do not necessarily reflect the official policy or position of the Department of the Navy, Department of Defense, or the U.S. Government. Some authors of this work are military service members or employees of the U.S. Government. This work was prepared as part of their official duties.
The work at USAMRIID was funded by DTRA project CB10246. Some of the genomes referenced in this publication were also sequenced through funding provided by DHS S&T through contract no. HSHQDC-13-C-B0009 "Capturing Global Biodiversity of Pathogens by Whole Genome Sequencing."