mapDamage(1)

tracking and quantifying damage patterns in ancient DNA sequences

MAPDAMAGE

NAME

mapDamage - tracking and quantifying damage patterns in ancient DNA sequences

SYNOPSIS

mapDamage [options] -i BAMfile -r reference.fasta

DESCRIPTION

mapDamage is a computational framework, written in Python and R, that tracks and quantifies DNA damage patterns among ancient DNA sequencing reads generated by Next-Generation Sequencing platforms.

OPTIONS

--version

show program’s version number and exit

-h, --help

show this help message and exit

Input files:

-i FILENAME, --input=FILENAME

SAM/BAM file; must contain a valid header. Use '-' to read a BAM from stdin

-r REF, --reference=REF

Reference file in FASTA format
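As an illustration of how the two required inputs combine with optional flags, the following sketch assembles an argument list for mapDamage. The helper name build_mapdamage_argv and its keyword parameters are hypothetical; only the flags themselves (-i, -r, -l, -d, --rescale) come from this page.

```python
def build_mapdamage_argv(bam, reference, length=None, folder=None, rescale=False):
    """Assemble a mapDamage argument list (hypothetical helper, not part of mapDamage)."""
    # -i and -r are the two inputs shown in the SYNOPSIS.
    argv = ["mapDamage", "-i", bam, "-r", reference]
    if length is not None:
        argv += ["-l", str(length)]   # read length to consider
    if folder is not None:
        argv += ["-d", folder]        # folder name to store results
    if rescale:
        argv.append("--rescale")      # rescale base qualities after estimation
    return argv


if __name__ == "__main__":
    print(" ".join(build_mapdamage_argv("sample.bam", "reference.fasta", length=70)))
```

On a system where mapDamage is installed, the resulting list could be passed to subprocess.run(argv, check=True).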

General options:

-n DOWNSAMPLE, --downsample=DOWNSAMPLE

Downsample to a randomly selected fraction of the reads (if 0 < DOWNSAMPLE < 1), or a fixed number of randomly selected reads (if DOWNSAMPLE >= 1). By default, no downsampling is performed.

--downsample-seed=DOWNSAMPLE_SEED

Seed value to use for downsampling. See the documentation for the Python module 'random' for the default behavior.
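The two downsampling modes and the role of the seed can be sketched as follows. This is an illustrative reading of the option text, not mapDamage's own implementation; the function name downsample_reads is hypothetical, and per-read random selection for the fractional case is an assumption.

```python
import random


def downsample_reads(reads, downsample=None, seed=None):
    """Sketch of the -n semantics described above (hypothetical helper)."""
    reads = list(reads)
    if downsample is None:
        return reads                      # default: no downsampling
    if downsample <= 0:
        raise ValueError("DOWNSAMPLE must be positive")
    rng = random.Random(seed)             # seed=None falls back to the module default
    if downsample < 1:
        # Fraction: keep each read with probability DOWNSAMPLE (assumed scheme).
        return [r for r in reads if rng.random() < downsample]
    # Fixed number: draw that many reads without replacement.
    k = min(int(downsample), len(reads))
    return rng.sample(reads, k)


if __name__ == "__main__":
    print(len(downsample_reads(range(1000), 100, seed=42)))
```

With a fixed seed the same subset is drawn on every run, which is what makes a downsampled analysis reproducible.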

--merge-reference-sequences

Ignore reference sequence names when tabulating reads (using '*' instead). Useful for alignments with a large number of reference sequences, which may otherwise result in excessive memory or disk usage due to the number of tables generated.
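The effect of merging can be pictured with a small sketch: without merging there is one table key per reference sequence, with merging there is a single '*' key. The function tabulate_reads is hypothetical and only counts reads; the actual tables hold more than counts.

```python
from collections import Counter


def tabulate_reads(reference_names, merge=False):
    """Count reads per reference name, or under '*' when merging (hypothetical helper)."""
    return Counter("*" if merge else name for name in reference_names)


if __name__ == "__main__":
    names = ["chr1", "chr2", "chr1", "scaffold_9"]
    print(tabulate_reads(names))              # one key per reference sequence
    print(tabulate_reads(names, merge=True))  # a single '*' key
```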

-l LENGTH, --length=LENGTH

read length, in nucleotides, to consider [70]

-a AROUND, --around=AROUND

nucleotides to retrieve before/after reads [10]

-Q MINQUAL, --min-basequal=MINQUAL

minimum base quality Phred score considered, Phred-33 assumed [0]
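Under the Phred-33 convention, each quality character encodes a score equal to its ASCII code minus 33. The sketch below decodes a quality string and marks which bases meet a minimum score; the function names are hypothetical, and how mapDamage treats bases below the threshold is not shown here.

```python
def phred33_scores(quality_string):
    """Decode a Phred-33 quality string into integer scores."""
    return [ord(char) - 33 for char in quality_string]


def meets_min_basequal(quality_string, minqual=0):
    """Return one boolean per base: True where the score is at least minqual."""
    return [score >= minqual for score in phred33_scores(quality_string)]


if __name__ == "__main__":
    print(phred33_scores("!5I"))
    print(meets_min_basequal("!5I", minqual=20))
```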

-d FOLDER, --folder=FOLDER

folder name to store results [results_FILENAME]

-f, --fasta

Write alignments in a FASTA file

--plot-only

Run only plotting from a valid result folder

-q, --quiet

Disable any output to stdout

-v, --verbose

Display progression information during parsing

--mapdamage-modules=MAPDAMAGE_MODULES

Override the system-wide installed mapDamage module

Options for graphics:

-y YMAX, --ymax=YMAX

graphical y-axis limit for nucleotide misincorporation frequencies [0.3]

-m READPLOT, --readplot=READPLOT

read length, in nucleotides, considered for plotting nucleotide misincorporations [25]

-b REFPLOT, --refplot=REFPLOT

the number of reference nucleotides to consider for plotting base composition in the region located upstream and downstream of every read [10]

-t TITLE, --title=TITLE

title used for plots []

Options for the statistical estimation:

--rand=RAND

Number of random starting points for the likelihood optimization [30]

--burn=BURN

Number of burn-in iterations [10000]

--adjust=ADJUST

Number of iterations for adjusting the proposal variance parameters [10]

--iter=ITER

Number of final MCMC iterations [50000]
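To show how burn-in, proposal adjustment and final iterations typically fit together in a Metropolis sampler, here is a generic toy example. It samples a standard normal target and is not mapDamage's statistical model; the function names, the acceptance-rate thresholds and the assumption that each adjustment round runs one burn-in-length pass are all illustrative.

```python
import math
import random


def log_target(x):
    # Toy target: standard normal log-density up to a constant (not mapDamage's model).
    return -0.5 * x * x


def metropolis(n_steps, x, step, rng):
    """Run n_steps of random-walk Metropolis; return (samples, last x, acceptance rate)."""
    samples, accepted = [], 0
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        log_ratio = log_target(proposal) - log_target(x)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
            accepted += 1
        samples.append(x)
    return samples, x, accepted / n_steps


def run_chain(burn, adjust, n_iter, seed=0):
    """Adjust the proposal width, burn in, then collect n_iter samples (assumed layout)."""
    rng = random.Random(seed)
    x, step = 0.0, 1.0
    for _ in range(adjust):
        _, x, rate = metropolis(burn, x, step, rng)
        if rate > 0.5:
            step *= 1.5       # accepting too often: propose larger moves
        elif rate < 0.2:
            step /= 1.5       # accepting too rarely: propose smaller moves
    _, x, _ = metropolis(burn, x, step, rng)         # burn-in, discarded
    samples, _, _ = metropolis(n_iter, x, step, rng)  # retained samples
    return samples


if __name__ == "__main__":
    draws = run_chain(burn=1000, adjust=10, n_iter=5000)
    print(sum(draws) / len(draws))
```

Only the samples from the final phase are kept; the earlier phases exist to tune the proposal and to move the chain away from its starting point.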

--forward

Use only the 5’ end of the sequences [False]

--reverse

Use only the 3’ end of the sequences [False]

--var-disp

Variable dispersion in the overhangs [False]

--jukes-cantor

Use the Jukes-Cantor substitution model instead of HKY85 [False]

--diff-hangs

The overhangs are different for 5’ and 3’ [False]

--fix-nicks

Fix the nick frequency vector (only C>T from the 5’ end and G>A from the 3’ end) [False]

--use-raw-nick-freq

Use the raw nick frequency vector without smoothing [False]

--single-stranded

Single-stranded protocol [False]

--theme-bw

Use a black-and-white theme in the posterior predictive plot [False]

--seq-length=SEQ_LENGTH

Length of sequence to use from each end of the read [12]

--stats-only

Run only statistical estimation from a valid result folder

--rescale

Rescale the quality scores in the BAM file using the output from the statistical estimation

--rescale-only

Run only rescaling from a valid result folder

--rescale-out=RESCALE_OUT

Write the rescaled BAM to this file

--no-stats

Disable statistical estimation, which is active by default

--check-R-packages

Check whether the required R packages are working

BUGS

Report bugs to aginolhac@snm.ku.dk, MSchubert@snm.ku.dk or jonsson.hakon@gmail.com

AUTHOR

This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.