Draft Genome Sequence of Dermatophagoides pteronyssinus, the European House Dust Mite

ABSTRACT Dermatophagoides pteronyssinus is the European house dust mite and a major source of human allergens. Here, we present the first draft genome sequence of the mite, together with ab initio gene prediction and functional analyses that will facilitate comparative genomic analyses with other mite species.

Dermatophagoides pteronyssinus, the European house dust mite, belongs to the family Pyroglyphidae, of which more than 6,100 species containing both free-living and parasitic lineages have been described (1). House dust mites live in close association with vertebrates and utilize powerful enzymes to digest organic debris that vertebrates leave behind. Many of these enzymes, secreted in the feces, are major sources of allergens and lead to sensitization in 15 to 20% of the population in industrialized countries, through activation of both innate and adaptive immune responses (2).
For genomic sequencing, we cultured D. pteronyssinus (Airmid Healthgroup Ltd., Ireland) for 28 days on house dust mite maximal medium at 75% relative humidity and 25°C. A multi-isolate sample of D. pteronyssinus was collected and separated from culture medium by sieving, followed by saturated saline separation. Mites were washed and subjected to 24-h starvation before being sterilized using 70% ethanol; they were then washed and frozen in liquid nitrogen prior to DNA extraction, which was performed using a Promega genomic DNA purification kit and mouse-tail method. DNA was quantified using a Qubit dsDNA BR assay kit and examined for integrity by agarose gel electrophoresis.
Four sequencing libraries (500-bp paired-end [PE], 2-kb mate-pair [MP], 5-kb MP, and 10-kb MP) were prepared for the Illumina HiSeq 2000, 2500, and 4000 platforms (BGI, China) with PE read sizes of 100 bp and MP read sizes of 49 bp. A total of 130,978,913 PE reads, 143,286,220 2-kb MP reads, 56,245,986 5-kb MP reads, and 29,806,232 10-kb MP reads were first trimmed for adapter and base call quality with Trimmomatic (3) before being used for de novo assembly in dipSPAdes version 1.0 (4), which resulted in 4,459 contigs with an N50 of 68,101 bp. Scaffolds were generated using SSPACE (5), and gaps were closed using GapFiller (6). The final assembly comprised 1,322 scaffolds, with an N50 of 450,436 bp, an L50 of 33 scaffolds, and a GC content of 30.93%. The largest scaffold was 3,593,316 bp in length. We estimated the genome size of D. pteronyssinus to be approximately 70.76 Mb, with gaps accounting for 3.14% of the total assembly length.
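The assembly statistics quoted above (N50, L50, GC content, and gap fraction) follow standard definitions, which can be sketched in a few lines of Python. This is an illustrative sketch only, not the pipeline used for this assembly: the function names `n50_l50` and `gc_and_gap_fraction` are hypothetical, and the code assumes that gaps are represented as runs of `N` and that GC content is computed over called (A/C/G/T) bases only.

```python
from typing import Iterable, Sequence, Tuple


def n50_l50(lengths: Sequence[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of contig or scaffold lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the assembly;
    L50 is the number of sequences in that set.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise AssertionError("unreachable: cumulative sum must reach half the total")


def gc_and_gap_fraction(sequences: Iterable[str]) -> Tuple[float, float]:
    """Return (GC fraction of called bases, fraction of positions that are N)."""
    gc = called = gaps = total = 0
    for seq in sequences:
        s = seq.upper()
        gc_here = s.count("G") + s.count("C")
        at_here = s.count("A") + s.count("T")
        gc += gc_here
        called += gc_here + at_here
        gaps += s.count("N")  # assumption: gaps are encoded as N
        total += len(s)
    if called == 0 or total == 0:
        raise ValueError("no called bases found")
    return gc / called, gaps / total


if __name__ == "__main__":
    # Toy data, not values from the D. pteronyssinus assembly.
    print(n50_l50([10, 8, 6, 4, 2]))
    print(gc_and_gap_fraction(["GGCC", "AATTNN"]))
```

Applied to the scaffold lengths of a real assembly, `n50_l50` would be expected to return values comparable to those reported by assembly-evaluation tools, although individual tools may differ in how they treat minimum-length cutoffs and ambiguous bases.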
Ab initio gene prediction yielded 12,530 gene models containing 48,371 exons in total. We identified 419 of the 429 CEGMA eukaryotic core genes. We also located full-length sequences for 39 known mite allergens, including the mite group allergens 1 to 11, 13 to 16, 18, and 20 to 33 (8,14). Functional annotation resulted in gene ontology terms for 5,622 genes and Pfam domains for 8,031 proteins; 1,619 proteins are predicted to have a signal peptide, and 3,610 contain a transmembrane domain.
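The annotation counts above can be put in proportion with some simple arithmetic. The sketch below is illustrative only; it uses just the figures stated in the text, and it assumes one predicted protein per gene model so that the 12,530 gene models can serve as the denominator for the protein-level counts. The helper name `percent` is hypothetical.

```python
# Counts taken from the text above.
GENE_MODELS = 12_530
EXONS = 48_371
CEGMA_FOUND, CEGMA_TOTAL = 419, 429
ANNOTATED = {
    "gene ontology terms": 5_622,
    "Pfam domains": 8_031,
    "signal peptide": 1_619,
    "transmembrane domain": 3_610,
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Mean number of exons per predicted gene model.
exons_per_gene = round(EXONS / GENE_MODELS, 2)

# Share of CEGMA core genes recovered, a rough proxy for completeness.
cegma_percent = percent(CEGMA_FOUND, CEGMA_TOTAL)

# Assumption: one protein per gene model, so GENE_MODELS is the denominator.
annotated_percent = {
    name: percent(count, GENE_MODELS) for name, count in ANNOTATED.items()
}

if __name__ == "__main__":
    print(f"exons per gene model: {exons_per_gene}")
    print(f"CEGMA core genes found: {cegma_percent}%")
    for name, value in annotated_percent.items():
        print(f"{name}: {value}% of gene models")
```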
Accession number(s). This whole-genome shotgun project has been deposited at DDBJ/ENA/GenBank under the accession number MQNO00000000. The version described in this paper is the second version, MQNO02000000.