porechop(1)

adapter trimmer for Oxford Nanopore reads

Section 1 porechop bookworm source

Description

PORECHOP

NAME

porechop - adapter trimmer for Oxford Nanopore reads

SYNOPSIS

porechop [-h] -i INPUT [-o OUTPUT] [--format {auto,fasta,fastq,fasta.gz,fastq.gz}] [-v VERBOSITY] [-t THREADS] [--version] [-b BARCODE_DIR] [--barcode_threshold BARCODE_THRESHOLD] [--barcode_diff BARCODE_DIFF] [--require_two_barcodes] [--untrimmed] [--discard_unassigned] [--adapter_threshold ADAPTER_THRESHOLD] [--check_reads CHECK_READS] [--scoring_scheme SCORING_SCHEME] [--end_size END_SIZE] [--min_trim_size MIN_TRIM_SIZE] [--extra_end_trim EXTRA_END_TRIM] [--end_threshold END_THRESHOLD] [--no_split] [--discard_middle] [--middle_threshold MIDDLE_THRESHOLD] [--extra_middle_trim_good_side EXTRA_MIDDLE_TRIM_GOOD_SIDE] [--extra_middle_trim_bad_side EXTRA_MIDDLE_TRIM_BAD_SIDE] [--min_split_read_size MIN_SPLIT_READ_SIZE]

DESCRIPTION

Porechop: a tool for finding adapters in Oxford Nanopore reads, trimming them from the ends and splitting reads with internal adapters

OPTIONS

optional arguments:

-h, --help

show this help message and exit

Main options:

-i INPUT, --input INPUT

FASTA/FASTQ of input reads or a directory which will be recursively searched for FASTQ files (required)

-o OUTPUT, --output OUTPUT

Filename for FASTA or FASTQ of trimmed reads (if not set, trimmed reads will be printed to stdout)

--format {auto,fasta,fastq,fasta.gz,fastq.gz}

Output format for the reads - if auto, the format will be chosen based on the output filename or the input read format (default: auto)

-v VERBOSITY, --verbosity VERBOSITY

Level of progress information: 0 = none, 1 = some, 2 = lots, 3 = full - output will go to stdout if reads are saved to a file and stderr if reads are printed to stdout (default: 1)

-t THREADS, --threads THREADS

Number of threads to use for adapter alignment (default: 4)

--version

show program’s version number and exit

Barcode binning settings:

Control the binning of reads based on barcodes (i.e. barcode demultiplexing)

-b BARCODE_DIR, --barcode_dir BARCODE_DIR

Reads will be binned based on their barcode and saved to separate files in this directory (incompatible with --output)

--barcode_threshold BARCODE_THRESHOLD

A read must have at least this percent identity to a barcode to be binned (default: 75.0)

--barcode_diff BARCODE_DIFF

If the difference between a read’s best barcode identity and its second-best barcode identity is less than this value, it will not be put in a barcode bin (to exclude cases which are too close to call) (default: 5.0)

--require_two_barcodes

Reads will only be put in barcode bins if they have a strong match for the barcode on both their start and end (default: a read can be binned with a match at its start or end)

--untrimmed

Bin reads but do not trim them (appropriate if reads are to be used with Nanopolish) (default: trim the reads)

--discard_unassigned

Discard unassigned reads (instead of creating a "none" bin) (default: False)

Adapter search settings:

Control how the program determines which adapter sets are present

--adapter_threshold ADAPTER_THRESHOLD

An adapter set has to have at least this percent identity to be labelled as present and trimmed off (0 to 100) (default: 90.0)

--check_reads CHECK_READS

This many reads will be aligned to all possible adapters to determine which adapter sets are present (default: 10000)

--scoring_scheme SCORING_SCHEME

Comma-delimited string of alignment scores: match, mismatch, gap open, gap extend (default: 3,-6,-5,-2)

End adapter settings:

Control the trimming of adapters from read ends

--end_size END_SIZE

The number of base pairs at each end of the read which will be searched for adapter sequences (default: 150)

--min_trim_size MIN_TRIM_SIZE

Adapter alignments smaller than this will be ignored (default: 4)

--extra_end_trim EXTRA_END_TRIM

This many additional bases will be removed next to adapters found at the ends of reads (default: 2)

--end_threshold END_THRESHOLD

Adapters at the ends of reads must have at least this percent identity to be removed (0 to 100) (default: 75.0)

Middle adapter settings:

Control the splitting of read from middle adapters

--no_split

Skip splitting reads based on middle adapters (default: split reads when an adapter is found in the middle)

--discard_middle

Reads with middle adapters will be discarded (default: reads with middle adapters are split) (this option is on by default when outputting reads into barcode bins)

--middle_threshold MIDDLE_THRESHOLD

Adapters in the middle of reads must have at least this percent identity to be found (0 to 100) (default: 85.0)

--extra_middle_trim_good_side EXTRA_MIDDLE_TRIM_GOOD_SIDE

This many additional bases will be removed next to middle adapters on their "good" side (default: 10)

--extra_middle_trim_bad_side EXTRA_MIDDLE_TRIM_BAD_SIDE

This many additional bases will be removed next to middle adapters on their "bad" side (default: 100)

--min_split_read_size MIN_SPLIT_READ_SIZE

Post-split read pieces smaller than this many base pairs will not be outputted (default: 1000)

AUTHOR

This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.