De Novo Whole-Genome Sequence and Genome Annotation of Lichtheimia ramosa

We report the annotated draft genome sequence of Lichtheimia ramosa (JMRC FSU:6197). This species has been reported as a causative organism of mucormycosis, a rare but rapidly progressive infection in immunocompromised humans. The functionally annotated genomic sequence consists of 74 scaffolds with a total of 11,510 genes.

DNA was obtained from mycelia cultured in liquid supplemented minimal medium (SUP medium) under shaken conditions for 3 days at 37°C (10). One library was prepared for 8-kb Roche/454 paired-end (PE) GS FLX+ Titanium sequencing and a second library for Illumina HiSeq 2000 100-bp PE sequencing. Genome sequencing and assembly were performed by LGC Genomics (Berlin) using a hybrid approach. Illumina contigs, assembled by Velvet (11), and 454 scaffolds, assembled by Newbler 2.6 (454 Life Sciences), were merged using Minimus2 (12). The resulting scaffolds were finalized using SOAP GapCloser (13) and SEQuel (14). RNA-Seq data were obtained from a pooled sample cultured under five different conditions. Transcriptome sequencing was performed using Roche/454 GS FLX+ Titanium, and contigs were assembled using Newbler.
For gene prediction, the pipeline presented by Haas et al. (15) was customized, and tools incorporating ab initio models, transcriptome data, and protein alignments were applied. The parameter sets were trained using gene models that were predicted by TransDecoder (16) from aligned species-specific transcripts. All gene predictions were combined using EVidenceModeler. Untranslated regions were added using PASA (17).
Genes were functionally annotated using Blast2GO (25) and InterProScan (26), including the TMHMM (27) option. Gene descriptions were obtained by blasting the predicted protein sequences against the fungal UniProt Knowledgebase (28). Secondary metabolite gene clusters were predicted using SMURF (29). [...] (11.19 Mbp; estimated transcriptome coverage, 0.5-fold). The final gene prediction consists of 11,510 genes and 11,546 transcripts, and 452 (98.7%) eukaryotic core proteins were identified using CEGMA (30). The coding density of the genome is 52%. Functional names were assigned to 980 transcripts, gene ontology categories to 6,899 transcripts, and protein domains to 9,664 translated transcripts; 2,645 transcripts were predicted to contain transmembrane domains, and 38 transcripts have been assigned to three secondary metabolite gene clusters.
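The annotation counts above are easier to compare as proportions of the 11,546 predicted transcripts. The following is a minimal sketch and not part of the original analysis. The helper name `percent` is hypothetical, and the CEGMA core set size of 458 proteins is an assumption, chosen because it is consistent with the reported 452 (98.7%).

```python
# Counts reported in the text for the final gene prediction.
TOTAL_TRANSCRIPTS = 11_546
ANNOTATED = {
    "functional names": 980,
    "gene ontology categories": 6_899,
    "protein domains": 9_664,
    "transmembrane domains": 2_645,
}

# Assumption: the CEGMA core set comprises 458 proteins, which is
# consistent with the reported 452 (98.7%).
CEGMA_FOUND = 452
CEGMA_CORE_SET = 458


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    for label, count in ANNOTATED.items():
        print(f"{label}: {count} of {TOTAL_TRANSCRIPTS} transcripts "
              f"({percent(count, TOTAL_TRANSCRIPTS)}%)")
    print(f"CEGMA core proteins: {percent(CEGMA_FOUND, CEGMA_CORE_SET)}%")
```

Under these assumptions, roughly six in ten transcripts carry a gene ontology category and more than eight in ten carry a protein domain, whereas fewer than one in ten received a functional name.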
Nucleotide sequence accession numbers. This whole-genome shotgun project has been deposited in DDBJ/ENA/GenBank under the accession numbers LK023313 to LK023386. The version described in this paper is the first version. Genome data and additional information are also available at the HKI (Hans-Knöll-Institute) Genome Resource (http://www.genome-resource.de).
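As a consistency check, the accession range can be expanded and counted against the 74 scaffolds reported for the assembly. This is a minimal sketch; the function name `expand_accessions` is hypothetical, and it assumes that the accessions in the range are consecutive and share the same letter prefix and digit width.

```python
import re


def expand_accessions(first: str, last: str) -> list[str]:
    """Expand an inclusive accession range such as LK023313 to LK023386.

    Assumes both accessions consist of a shared letter prefix followed by
    a zero-padded number of equal width.
    """
    pattern = re.compile(r"([A-Z]+)(\d+)")
    m_first = pattern.fullmatch(first)
    m_last = pattern.fullmatch(last)
    if m_first is None or m_last is None:
        raise ValueError("expected accessions like 'LK023313'")

    prefix, start_digits = m_first.group(1), m_first.group(2)
    end_digits = m_last.group(2)
    if m_last.group(1) != prefix or len(end_digits) != len(start_digits):
        raise ValueError("accessions must share prefix and digit width")

    width = len(start_digits)
    return [
        f"{prefix}{number:0{width}d}"
        for number in range(int(start_digits), int(end_digits) + 1)
    ]


if __name__ == "__main__":
    accessions = expand_accessions("LK023313", "LK023386")
    print(len(accessions), accessions[0], accessions[-1])
```

The range spans 74 accessions, which matches the number of scaffolds stated in the abstract.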

ACKNOWLEDGMENTS
Funding was provided by the Era-Net PathoGenoMics Project "OXYstress: Human fungal pathogens under oxygen stress-adaptive mechanisms to hypoxia and reactive oxygen species and their consequences for host interaction and therapy," funded by the Austrian FWF, I661-B09 to C.L.F. J.L. was supported by the Deutsche Forschungsgemeinschaft

Genome Announcements, September/October 2014, Volume 2, Issue 5, e00888-14 (genomea.asm.org)