Candidate genome assemblies^a

| Assembly software | Software version | Polishing procedure | Length of assembly (Mbp) | No. of contigs | No. of contigs >50 kbp | Longest contig | N50 value | No. of mismatches per 100 kbp | No. of indels per 100 kbp |
|---|---|---|---|---|---|---|---|---|---|
| SPAdes (18) | | | | ,224 | 137 | 640 kbp | 173 kbp | | |
| Canu (19) | 1.7.1 | Pilon (2×) | 24.9 | 26 | 24 | 4.2 Mbp | 1.7 Mbp | 41.7 | 13.9 |
| MaSuRCA (20) | 3.2.8 | Pilon (1×) | 25.4 | 29 | 23 | 6.8 Mbp | 2.7 Mbp | 45.7 | 7.6 |
| Miniasm (21)/minimap2 (22) | 0.3/2.12 | Racon (2×) | 24.5 | 15 | 13 | 3.7 Mbp | 2.8 Mbp | 81.5 | 367.9 |
| DBG2OLC (23)/platanus (24) | 1.2.4 | Racon (1×), Pilon (1×) | 25.8 | 31 | 24 | 4.2 Mbp | 1.8 Mbp | 55.2 | 35.1 |
| Final | | Racon (2×), Pilon (2×) | 24.4 | 12 | 11 | 3.8 Mbp | 2.8 Mbp | 45.6 | 11.0 |
^a Statistics were produced with Quast v. 4.5 (15). To estimate mismatches and indels, the SPAdes assembly based on Illumina short reads was used as the reference. The SPAdes result was filtered to keep contigs with length >100 and coverage >10. The Canu assembly used only reads overlapping the SPAdes assembly by >200 bp, and contigs supported by fewer than 5 reads were filtered out. All assemblies were polished with Pilon v. 1.21 (16) and Racon v. 1.3.1 (17). Most of the size differences between candidate assemblies can be accounted for by mtDNA and rRNA gene fragments as well as other repetitive sequences.
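
The length and coverage filter applied to the SPAdes result, and the N50 statistic reported in the table, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that the thresholds refer to the length (in bp) and k-mer coverage that SPAdes writes into contig names of the form `NODE_1_length_5000_cov_23.7`; the function names `keep_contig` and `n50` are hypothetical.

```python
import re

# Assumption: contig names follow the SPAdes pattern "NODE_<id>_length_<bp>_cov_<coverage>".
HEADER_RE = re.compile(r"NODE_\d+_length_(\d+)_cov_([\d.]+)")


def keep_contig(header, min_length=100, min_coverage=10.0):
    """Return True if the contig is strictly above both thresholds."""
    match = HEADER_RE.search(header)
    if match is None:
        # Names that do not follow the assumed pattern are rejected.
        return False
    length = int(match.group(1))
    coverage = float(match.group(2))
    return length > min_length and coverage > min_coverage


def n50(lengths):
    """Contig length at which the running total, longest first, reaches half the assembly size."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly


headers = [
    "NODE_1_length_5000_cov_23.7",
    "NODE_2_length_90_cov_50.0",   # too short
    "NODE_3_length_500_cov_3.2",   # coverage too low
]
kept = [h for h in headers if keep_contig(h)]
```

Both thresholds are treated as strict inequalities, matching the ">100" and ">10" wording of the footnote.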