
Table 2 Short read sequence pre-processing algorithms in ngsShoRT

From: Software for pre-processing Illumina next-generation sequencing short read sequences

| Category | Algorithm/Method | Description |
| --- | --- | --- |
| Sequencing Artifacts Removal | 5adpt | Detects (using exact or approximate matching) sequencing artifacts listed in an input file and removes them. |
| | rmHP | Removes homopolymer sequences. |
| QSEQ Specific Methods | qseq0 | Removes QSEQ reads with “Failed_Chastity” filter flags. |
| | qseqB | Removes reads with more than a specified number of “B”-scored bases. |
| Reads with “N” Bases Removal/Splitting | nperc | Filters out reads whose un-called “N” bases exceed a percentage cutoff. |
| | ncutoff | Filters out reads whose un-called “N” bases exceed a number cutoff. |
| | nsplit | Searches for and removes “N” bases, then splits the read around the removed “N” bases into two smaller daughter reads. |
| Quality Score Based Trimming | LQR | Removes “low quality” reads using a quality score cutoff or a percent cutoff. |
| | Mott | Quality-window extraction (trims both the 5'- and 3'-ends of a read). |
| | TERA | Trims low quality-score bases from the 3'-ends of reads based on their running average quality scores. |
| 5'/3'-end Bases Trimming | 3end | Trims bases from the 3'-end of a read. |
| | 5end | Trims bases from the 5'-end of a read. |
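
To make a few of the table entries concrete, the following Python sketch illustrates the general idea behind ncutoff, nperc, nsplit, 3end/5end and a running-average trim in the spirit of TERA. It is not ngsShoRT's own code. All function names, parameters and the `Read` type are hypothetical, and the running-average rule in particular is only one plausible reading of the one-line description above; the actual TERA procedure may differ.

```python
import re
from typing import List, Tuple

# Hypothetical read representation: (bases, per-base quality scores).
Read = Tuple[str, List[int]]


def passes_ncutoff(seq: str, max_n: int) -> bool:
    """Keep a read only if its number of 'N' bases does not exceed max_n."""
    return seq.count("N") <= max_n


def passes_nperc(seq: str, max_percent: float) -> bool:
    """Keep a read only if its percentage of 'N' bases does not exceed max_percent."""
    if not seq:
        return False  # assumption: empty reads are always discarded
    return 100.0 * seq.count("N") / len(seq) <= max_percent


def nsplit(seq: str, quals: List[int]) -> List[Read]:
    """Drop runs of 'N' and return the flanking fragments as daughter reads."""
    return [
        (m.group(), quals[m.start():m.end()])
        for m in re.finditer(r"[^N]+", seq)
    ]


def trim_ends(seq: str, quals: List[int], n5: int = 0, n3: int = 0) -> Read:
    """Trim n5 bases from the 5'-end and n3 bases from the 3'-end."""
    end = max(len(seq) - n3, n5)  # never let the slice end precede its start
    return seq[n5:end], quals[n5:end]


def running_average_trim(seq: str, quals: List[int], min_avg: float) -> Read:
    """Assumed TERA-like rule: walk inward from the 3'-end and trim bases
    while the average quality of the bases visited so far is below min_avg."""
    total = 0
    keep = len(seq)
    for i in range(len(seq) - 1, -1, -1):
        total += quals[i]
        if total / (len(seq) - i) >= min_avg:
            break  # running average recovered; stop trimming here
        keep = i
    return seq[:keep], quals[:keep]


if __name__ == "__main__":
    print(nsplit("ACGNNTGA", list(range(8))))
    print(trim_ends("ACGTAC", [1, 2, 3, 4, 5, 6], n5=1, n3=2))
    print(running_average_trim("ACGTAC", [30, 30, 30, 30, 5, 5], min_avg=10))
```

In this sketch a read with a single run of “N” bases yields two daughter reads, matching the nsplit description; reads with several runs yield more fragments, which is a choice made here rather than documented behaviour. The cumulative running average is also fairly aggressive, since a few very low scores at the 3'-end can pull the average down across neighbouring good bases.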