
Table 3 The descriptions of raw short-read sequences used in the evaluation experiments

From: Software for pre-processing Illumina next-generation sequencing short read sequences

| DataSet | Caenorhabditis elegans | Saccharomyces cerevisiae S288c | Escherichia coli O157 H7 |
| --- | --- | --- | --- |
| Taxonomy ID | 6239 | 559292 | 83334 |
| Reference genome size (bp) | 100.3 M | 12.2 M | 5.5 M |
| #Chromosomes | 7* | 17* | 1 |
| SRA run | SRR065390 | SRR449310 | SRR957847 |
| Platform | Illumina Genome Analyzer II | Illumina HiSeq 2000 | Illumina MiSeq |
| Strategy | WGS | WGS | WGS |
| Source | Genomic | Genomic | Genomic |
| Layout | Paired | Paired | Paired |
| Read length | 100 | 76 | 150 |
| Nominal length | 356 | 230 | 350 |
| Total sequences (paired) | 33,808,546 | 1,898,259 | 2,241,778 |
| Total bases (paired) | 6,761,709,200 | 288,535,368 | 672,533,400 |
| Mean Phred quality score | 29.49 | 34.17 | 33.12 |
| Low Phred quality score (<=10) | 1,902,576 (2.81%) | 167,669 (4.42%) | 76,598 (1.71%) |
| Coverage | 67.4x | 23.7x | 122.3x |
| GC content (%) | 35 | 39 | 50 |

*The mitochondrial chromosome is included.
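
The derived rows of Table 3 can be reproduced from the raw counts. The sketch below is not taken from the article. It assumes that total bases equal read pairs × 2 × read length, that coverage is total bases divided by the (rounded) reference genome size, and that the low-quality percentage is taken over individual reads, i.e. twice the number of pairs. These assumptions happen to match the printed values. The dictionary layout and function names are illustrative only.

```python
# Values copied from Table 3. Genome sizes are the rounded figures printed
# there (100.3 M, 12.2 M, 5.5 M), so the coverage values are approximate.
DATASETS = {
    "Caenorhabditis elegans": {
        "genome_bp": 100_300_000, "pairs": 33_808_546,
        "read_len": 100, "low_quality_reads": 1_902_576,
    },
    "Saccharomyces cerevisiae S288c": {
        "genome_bp": 12_200_000, "pairs": 1_898_259,
        "read_len": 76, "low_quality_reads": 167_669,
    },
    "Escherichia coli O157 H7": {
        "genome_bp": 5_500_000, "pairs": 2_241_778,
        "read_len": 150, "low_quality_reads": 76_598,
    },
}


def total_bases(pairs: int, read_len: int) -> int:
    """Bases in a paired-end run: two reads of read_len bases per pair."""
    return pairs * 2 * read_len


def coverage(bases: int, genome_bp: int) -> float:
    """Average sequencing depth: sequenced bases per reference base."""
    return bases / genome_bp


def low_quality_pct(low_quality_reads: int, pairs: int) -> float:
    """Share of individual reads (2 per pair) flagged as low quality.

    Assumption: the denominator is 2 * pairs, which reproduces the
    percentages printed in the table.
    """
    return 100 * low_quality_reads / (2 * pairs)


if __name__ == "__main__":
    for name, d in DATASETS.items():
        bases = total_bases(d["pairs"], d["read_len"])
        cov = coverage(bases, d["genome_bp"])
        pct = low_quality_pct(d["low_quality_reads"], d["pairs"])
        print(f"{name}: {bases:,} bases, {cov:.1f}x coverage, "
              f"{pct:.2f}% low-quality reads")
```

Running the script prints one line per dataset, which can be compared against the "Total bases (paired)", "Coverage" and "Low Phred quality score (<=10)" rows.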