Complete Genome Sequences of Eight Human Papillomavirus Type 16 Asian American and European Variant Isolates from Cervical Biopsies and Lesions in Indian Women

Human papillomavirus type 16 (HPV16), a member of the Papillomaviridae family, is the primary etiological agent of cervical cancer. Here, we report the complete genome sequences of four HPV16 Asian American variants and four European variants, isolated from cervical biopsies and scrapings in India.

In India, HPV16 is the most prevalent high-risk type associated with cervical cancer (CaCx) cases (6), and an earlier report from our laboratory confirmed, for the first time, the presence of Asian American (AA) variants alongside European (E) variants using Sanger sequencing (7–9).
Here, we report the complete genome sequences of four viral isolates belonging to the AA variant lineage and four viral isolates belonging to the E variant lineage, isolated from cervical biopsies and scrapings in India.
DNA was isolated from cervical specimens using the Qiagen DNA minikit (Qiagen, Germany) according to the manufacturer's instructions. HPV screening was carried out using the broad-range GP5+/GP6+ primer pair, and the presence of HPV16 was confirmed by quantitative E6 PCR (10). Viral genomes were enriched using a 100-ng DNA template, two long-range overlapping primer sets, and an Expand Long Template PCR enzyme mix (Roche, Switzerland). The amplicons were purified, quantitated, and mixed in equimolar proportions to generate 1 µg of starting material.

The pooled amplicons were subsequently sheared to low-molecular-weight fragments. Adapter ligation was carried out using an Ion Plus fragment library kit (Thermo Fisher Scientific, USA), and each library was labeled using Ion Xpress bar code adapters (Thermo Fisher Scientific, USA). The ligated libraries were size-selected using E-Gel SizeSelect 2% agarose gels (Thermo Fisher Scientific, USA) and assessed on an Agilent 2100 Bioanalyzer (Agilent Technologies, Germany). The libraries were quantified using the Ion Library TaqMan quantitation kit (Thermo Fisher Scientific, USA), and the bar-coded library pools were amplified onto Ion Sphere particles by emulsion PCR. High-throughput sequencing was performed on an Ion PGM sequencer platform (Thermo Fisher Scientific, USA), and the Torrent Suite version 3.0 data processing pipeline was used to generate sequence reads.

Per genome, approximately 2,300 paired-end reads with an average insert length of 200 bp were generated (~46,000 bases/genome). De novo assembly was carried out to form consensus sequences using the Geneious version 7.0.3 assembler (11). Whole-genome Sanger sequences, generated independently from each specimen using short-range primer sets (8), were aligned to the respective consensus sequences for confirmation. Each consensus sequence was manually checked to identify variant lineages on the basis of differences in the L1 region and a whole-genome BLAST search.
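The equimolar pooling of the two long-range amplicons described above can be illustrated with a short calculation. For double-stranded DNA, the number of molecules in a given mass is inversely proportional to fragment length, so an equimolar pool requires each amplicon to contribute mass in proportion to its length. The following Python sketch shows that arithmetic only. The amplicon names, the amplicon lengths, and the total pool mass used in the example are hypothetical values chosen for illustration and are not taken from this study.

```python
def equimolar_masses(lengths_bp, total_mass_ng):
    """Split total_mass_ng across amplicons so each contributes equal moles.

    Moles of dsDNA are proportional to mass / length, so equal moles
    means each amplicon's mass share equals its share of the summed length.
    """
    total_length = sum(lengths_bp.values())
    return {
        name: total_mass_ng * length / total_length
        for name, length in lengths_bp.items()
    }


# Hypothetical amplicon lengths (bp); the real amplicon sizes are not stated here.
amplicons = {"amplicon_A": 4500, "amplicon_B": 4000}

# Illustrative total pool mass in nanograms (an assumption for this example).
masses = equimolar_masses(amplicons, total_mass_ng=1000.0)

for name, mass in masses.items():
    print(f"{name}: {mass:.1f} ng")
```

Under these assumed lengths, the longer amplicon receives the larger share of the mass, and the two shares sum to the requested total. In practice, each mass would then be converted to a pipetting volume using the measured concentration of the purified amplicon.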
The curated genome sequences were annotated with the Genome Annotation Transfer Utility software (12) using HPV16 E or AA reference sequences (NC_001526.2 and AB818689) as templates. The annotated genome sequences thus generated were further validated manually.

Nucleotide sequence accession numbers. Whole-genome sequences of all eight viral isolates have been deposited in GenBank using NCBI's BankIt tool, and the accession numbers are listed in Table 1.

ACKNOWLEDGMENTS
We thank the College of Medicine and Jawaharlal Nehru Medical Hospital (Kalyani, Nadia, West Bengal, India) for their support in sample collection and the CoTeRI, National Institute of Biomedical Genomics, Kalyani, India, for their technical support for high-throughput sequencing.
This work was partially supported by funding from the Department of Biotechnology (BT/PR8014/Med/14/1220/2006), Government of India, and from the National Institute of Biomedical Genomics, Kalyani (Intramural Grant) to S. Sengupta. B. Bhattacharjee is supported by a Ramanujan Fellowship from the Department of Science and Technology, India. P. Mandal and S. Sen were supported by fellowships from the Council of Scientific and Industrial Research, India, and the University Grants Commission, India, respectively, while working on this project.