mCaller(1)

find methylation in nanopore reads

MCALLER.PY

NAME

mCaller.py - find methylation in nanopore reads

DESCRIPTION

usage: mCaller [-h] (-p POSITIONS | -m MOTIF) -r REFERENCE -e TSV -f FASTQ
               [-t THREADS] [-b BASE] [-n NUM_VARIABLES] [--train]
               [--training_tsv TRAINING_TSV] [-d MODELFILE] [-s SKIP_THRESH]
               [-q QUAL_THRESH] [-c CLASSIFIER] [--plot_training] [-v]

Classify bases as methylated or unmethylated
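The usage line above requires -r, -e, -f and exactly one of -p or -m. As a minimal sketch of how such a command line could be assembled from a script, the following uses only flags shown in the usage line; the helper name build_mcaller_command, the file names, and the GATC motif are hypothetical and chosen for illustration only.

```python
import shlex


def build_mcaller_command(reference, tsv, fastq, positions=None, motif=None,
                          base="A", threads=1):
    """Assemble an mCaller argument list from flags in the usage line."""
    # -p and -m are mutually exclusive, and exactly one of them is required.
    if (positions is None) == (motif is None):
        raise ValueError("give exactly one of positions (-p) or motif (-m)")
    cmd = ["mCaller"]
    cmd += ["-p", positions] if positions is not None else ["-m", motif]
    cmd += ["-r", reference, "-e", tsv, "-f", fastq]
    cmd += ["-b", base, "-t", str(threads)]
    return cmd


# Hypothetical inputs: none of these file names come from the man page.
cmd = build_mcaller_command("ref.fasta", "events.tsv", "reads.fastq",
                            motif="GATC", threads=4)
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run as is; keeping the arguments as a list avoids shell quoting problems with unusual file names.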

arguments (-r, -e, -f, and one of -p or -m are required):

-h, --help

show this help message and exit

-p POSITIONS, --positions POSITIONS

file listing the positions at which to classify bases; must be space- or tab-separated, with the columns chromosome, position, strand, and (when training) label

-m MOTIF, --motif MOTIF

classify every base of the type given by --base within the specified motif, instead of using a positions file (the motif may be a single base)

-r REFERENCE, --reference REFERENCE

FASTA file with the reference the reads were aligned to

-e TSV, --tsv TSV

TSV file with the nanopolish event alignment

-f FASTQ, --fastq FASTQ

FASTQ file with the nanopore reads

-t THREADS, --threads THREADS

specify number of processes (default = 1)

-b BASE, --base BASE

base to classify as methylated or unmethylated (A or C; default A)

-n NUM_VARIABLES, --num_variables NUM_VARIABLES

length of the context used for classification; the default of 6 variables corresponds to an 11-mer context (6*2-1)

--train

train a new model (requires labels in positions file)

--training_tsv TRAINING_TSV

mCaller output file for training

-d MODELFILE, --modelfile MODELFILE

model file name

-s SKIP_THRESH, --skip_thresh SKIP_THRESH

number of skips to allow within an observation (default 0)

-q QUAL_THRESH, --qual_thresh QUAL_THRESH

quality threshold for reads (default none)

-c CLASSIFIER, --classifier CLASSIFIER

classifier to use: NN (default), RF, LR, or NBC; the non-default classifiers may significantly increase runtime

--plot_training

plot probability distributions for training positions (requires labels in the positions file and --train)

-v, --version

print version
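Two details from the option descriptions above lend themselves to a short sketch: the column layout of the positions file (-p) and the context length implied by -n. The code below is an illustration under stated assumptions, not mCaller's own parser. The names Position, parse_positions, and context_length are hypothetical, and the label values in the sample data (m6A, A) are placeholders, since the man page does not say which labels are accepted.

```python
from typing import NamedTuple, Optional


class Position(NamedTuple):
    chromosome: str
    position: int
    strand: str
    label: Optional[str]  # fourth column, only needed when training


def parse_positions(text):
    """Parse space- or tab-separated position lines into Position records."""
    records = []
    for line in text.splitlines():
        fields = line.split()  # splits on runs of spaces and tabs alike
        if not fields:
            continue  # skip blank lines
        if len(fields) not in (3, 4):
            raise ValueError(
                f"expected 3 or 4 columns, got {len(fields)}: {line!r}")
        label = fields[3] if len(fields) == 4 else None
        records.append(Position(fields[0], int(fields[1]), fields[2], label))
    return records


def context_length(num_variables=6):
    """Context size implied by -n: 2 * n - 1, so 6 variables give an 11-mer."""
    return 2 * num_variables - 1


# Hypothetical sample: one tab-separated and one space-separated line.
example = "chr1\t1042\t+\tm6A\nchr1 2210 - A\n"
records = parse_positions(example)
print(records)
print(context_length())
```

Making the label optional mirrors the description of -p, where the label column is required only for training.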