scythe(1)
NAME
scythe - Bayesian adaptor trimmer
SYNOPSIS
scythe -q sanger -a /path/to/adaptors.fasta [options] <sequences.fastq.gz>
DESCRIPTION
Trim 3'-end adaptor contaminants from sequence files. If no output file is specified, scythe writes to stdout.
OPTIONS
-p, --prior prior contamination rate (default: 0.300)
-q, --quality-type quality type, either illumina, solexa, or sanger (default: sanger)
-m, --matches-file matches file (default: no output)
-o, --output-file output trimmed sequences file (default: stdout)
-t, --tag add a tag to the header indicating Scythe cut a sequence (default: off)
-n, --min-match smallest contaminant to consider (default: 5)
-M, --min-keep filter sequences less than or equal to this length (default: 35)
--quiet don't output statistics about trimming to stdout (default: off)
--help display this help and exit
--version output version information and exit
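As an illustration of how the options above combine, here is a minimal Python sketch that assembles a scythe command line. The helper name scythe_command and the file names are hypothetical; only the flags listed above are used, and the sketch builds the argument list without running scythe.

```python
import shlex

def scythe_command(adaptors, sequences, quality_type="sanger", prior=None,
                   output=None, matches=None, min_match=None, min_keep=None,
                   tag=False, quiet=False):
    """Build an argv list for scythe (hypothetical helper, not part of scythe)."""
    argv = ["scythe", "-a", adaptors, "-q", quality_type]
    if prior is not None:
        argv += ["-p", str(prior)]        # prior (scythe default: 0.300)
    if output is not None:
        argv += ["-o", output]            # otherwise scythe writes to stdout
    if matches is not None:
        argv += ["-m", matches]           # matches file (default: no output)
    if min_match is not None:
        argv += ["-n", str(min_match)]    # smallest contaminant to consider
    if min_keep is not None:
        argv += ["-M", str(min_keep)]     # length filter threshold
    if tag:
        argv.append("-t")                 # tag headers of cut sequences
    if quiet:
        argv.append("--quiet")
    argv.append(sequences)                # input file comes last, as in SYNOPSIS
    return argv

# File names below are placeholders.
cmd = scythe_command("adaptors.fasta", "reads.fastq.gz",
                     output="trimmed.fastq", matches="matches.txt")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run if scythe is installed; leaving optional arguments as None keeps scythe's own defaults in effect.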
These are the quality encoding schemes scythe recognises (see '--quality-type'):
phred PHRED quality scores (e.g. from Roche 454). ASCII with no offset, range: [4, 60].
sanger Sanger qualities are PHRED ASCII qualities with an offset of 33, range: [0, 93]. From NCBI SRA, or Illumina pipeline 1.8+.
solexa Solexa (also very early Illumina, pipeline < 1.3). ASCII offset of 64, range: [-5, 62]. Uses a different quality-to-probabilities conversion than other schemes.
illumina Illumina output from pipeline versions between 1.3 and 1.7. ASCII offset of 64, range: [0, 62].
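The offsets above can be made concrete with a short Python sketch that decodes a quality string into per-base error probabilities. It uses the standard PHRED conversion (p = 10^(-Q/10)) and the standard Solexa odds-based conversion (p = 1 / (1 + 10^(Q/10))); these are the textbook formulas and are assumed here, not taken from scythe's source. The function name is hypothetical, and the phred scheme is left out because its on-disk representation is not specified above.

```python
# ASCII offset and conversion type per scheme, taken from the list above.
SCHEMES = {
    "sanger": (33, False),
    "illumina": (64, False),
    "solexa": (64, True),   # True: Solexa odds-based conversion
}

def error_probabilities(quality_string, scheme="sanger"):
    """Decode a FASTQ quality string into per-base error probabilities."""
    offset, is_solexa = SCHEMES[scheme]
    probs = []
    for ch in quality_string:
        q = ord(ch) - offset                      # remove the ASCII offset
        if is_solexa:
            probs.append(1.0 / (1.0 + 10 ** (q / 10)))
        else:
            probs.append(10 ** (-q / 10))
    return probs

# 'I' is quality 40 under sanger; 'h' is quality 40 under illumina.
sanger_probs = error_probabilities("I!", "sanger")
illumina_probs = error_probabilities("h", "illumina")
solexa_probs = error_probabilities("@", "solexa")   # Solexa quality 0
```

Note that a Solexa quality of 0 corresponds to an error probability of 0.5, whereas a PHRED quality of 0 corresponds to 1.0, which is why choosing the wrong quality type can skew results for low-quality bases.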
FILES
adaptors.fasta:
Provide contaminant sequences as a FASTA-formatted file. See '/usr/share/doc/scythe/illumina_adaptors.fa'.
N.B.: Index/Barcode sequences should be substituted for Ns in the example adaptor file.
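A minimal Python sketch of the barcode substitution described in the note above. It assumes that each run of Ns in a sequence line marks the barcode position and replaces the whole run; the function name, the record, and the barcode are made-up placeholders, not entries from the shipped adaptor file.

```python
import re

def substitute_barcode(fasta_text, barcode):
    """Replace each run of Ns in FASTA sequence lines with the given barcode."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            out.append(line)                            # header lines are left untouched
        else:
            out.append(re.sub(r"N+", barcode, line))    # assumption: Ns mark the barcode
    return "\n".join(out)

# Placeholder record and barcode, for illustration only.
example = ">example_adaptor\nACGTNNNNNNTTGCA"
result = substitute_barcode(example, "ATCACG")
print(result)
```

If the adaptor file contains Ns that are genuine ambiguity codes and not barcode placeholders, this blanket replacement would be wrong, so inspect the file before applying it.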
AUTHOR
Vince Buffalo, https://github.com/vsbuffalo