cnvkit-batch(1)

Run the complete CNVkit pipeline on one or more BAM files.

NAME

cnvkit_batch - Run the complete CNVkit pipeline on one or more BAM files.

DESCRIPTION

usage: cnvkit batch [-h] [-m {hybrid,amplicon,wgs}]
                    [--segment-method {cbs,flasso,haar,none,hmm,hmm-tumor,hmm-germline}]
                    [-y] [-c] [--drop-low-coverage] [-p [PROCESSES]]
                    [--rscript-path PATH] [-n [FILES ...]] [-f FILENAME]
                    [-t FILENAME] [-a FILENAME] [--annotate FILENAME]
                    [--short-names] [--target-avg-size TARGET_AVG_SIZE]
                    [-g FILENAME] [--antitarget-avg-size ANTITARGET_AVG_SIZE]
                    [--antitarget-min-size ANTITARGET_MIN_SIZE]
                    [--output-reference FILENAME] [--cluster] [-r REFERENCE]
                    [-d DIRECTORY] [--scatter] [--diagram]
                    [bam_files ...]

positional arguments:

bam_files

Mapped sequence reads (.bam)

options:

-h, --help

show this help message and exit

-m {hybrid,amplicon,wgs}, --seq-method {hybrid,amplicon,wgs}, --method {hybrid,amplicon,wgs}

Sequencing assay type: hybridization capture ('hybrid'), targeted amplicon sequencing ('amplicon'), or whole genome sequencing ('wgs'). Determines whether and how to use antitarget bins. [Default: hybrid]
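As an illustration of selecting the assay type, the sketch below assembles a whole-genome invocation using only options listed on this page. The file names (Sample.bam, Normal.bam, hg19.fa, refFlat.txt) are hypothetical, and the RUN variable is a dry-run convention of this sketch, not part of CNVkit: by default the command is only printed.

```shell
# Dry run by default: RUN=echo only prints the command.
# Export RUN as an empty string to actually execute it.
RUN=${RUN-echo}

wgs_batch() {
  # All file names here are hypothetical placeholders.
  $RUN cnvkit batch Sample.bam \
    --seq-method wgs \
    --fasta hg19.fa \
    --annotate refFlat.txt \
    --normal Normal.bam
}

wgs_batch
```

For hybridization capture or amplicon data, a target interval file given with -t/--targets would normally be supplied as well.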

--segment-method {cbs,flasso,haar,none,hmm,hmm-tumor,hmm-germline}

Method used in the 'segment' step. [Default: cbs]

-y, --male-reference, --haploid-x-reference

Use or assume a male reference, i.e. one with a single copy of chrX. Female samples will then have a log2 copy ratio of about +1 on chrX; without this option, male samples would instead have about -1 on chrX.
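The +1 and -1 values follow from taking the base-2 logarithm of the ratio between the sample's and the reference's chrX copy number. A minimal sketch of that arithmetic, assuming idealized copy numbers (no noise, purity, or normalization effects); log2_ratio is a hypothetical helper, not a CNVkit command:

```shell
# log2_ratio SAMPLE_COPIES REFERENCE_COPIES
# Prints log2(sample / reference) with an explicit sign.
log2_ratio() {
  awk -v s="$1" -v r="$2" 'BEGIN { printf "%+.1f\n", log(s / r) / log(2) }'
}

log2_ratio 2 1   # female sample (2 copies of chrX) vs. male reference (1 copy): +1.0
log2_ratio 1 2   # male sample (1 copy) vs. diploid-X reference (2 copies): -1.0
```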

-c, --count-reads

Get read depths by counting read midpoints within each bin (an alternative algorithm).

--drop-low-coverage

Drop very-low-coverage bins before segmentation to avoid false-positive deletions in poor-quality tumor samples.

-p [PROCESSES], --processes [PROCESSES]

Number of subprocesses used to process the BAM files in parallel. Without an argument, use the maximum number of available CPUs. [Default: process each BAM in serial]

--rscript-path PATH

Path to the Rscript executable to use for running R code. Use this option to specify a non-default R installation. [Default: Rscript]

To construct a new copy number reference:

-n [FILES ...], --normal [FILES ...]

Normal samples (.bam) used to construct the pooled, paired, or flat reference. If this option is used but no filenames are given, a "flat" reference will be built. Otherwise, all filenames following this option will be used.
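To illustrate the "flat" reference case, the sketch below passes --normal with no file names, immediately followed by another option so that nothing is read as a normal sample. File names are hypothetical, and RUN is a dry-run convention of this sketch rather than a CNVkit feature.

```shell
# Dry run by default: RUN=echo only prints the command.
# Export RUN as an empty string to actually execute it.
RUN=${RUN-echo}

flat_reference_batch() {
  # --normal is given without file names, which requests a flat reference.
  # Tumor.bam, my_baits.bed and hg19.fa are hypothetical placeholders.
  $RUN cnvkit batch Tumor.bam \
    --normal \
    --targets my_baits.bed \
    --fasta hg19.fa
}

flat_reference_batch
```

Because --normal accepts any number of file names, BAM files placed directly after it would be taken as normal samples; listing the test samples before the option avoids that ambiguity.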

-f FILENAME, --fasta FILENAME

Reference genome, FASTA format (e.g. UCSC hg19.fa)

-t FILENAME, --targets FILENAME

Target intervals (.bed or .list)

-a FILENAME, --antitargets FILENAME

Antitarget intervals (.bed or .list)

--annotate FILENAME

Use gene models from this file to assign names to the target regions. Format: UCSC refFlat.txt or ensFlat.txt file (preferred), or BED, interval list, GFF, or similar.

--short-names

Reduce multi-accession bait labels to be short and consistent.

--target-avg-size TARGET_AVG_SIZE

Average size of split target bins (results are approximate).

-g FILENAME, --access FILENAME

Regions of accessible sequence on chromosomes (.bed), as output by the 'access' command.

--antitarget-avg-size ANTITARGET_AVG_SIZE

Average size of antitarget bins (results are approximate).

--antitarget-min-size ANTITARGET_MIN_SIZE

Minimum size of antitarget bins (smaller regions are dropped).

--output-reference FILENAME

Output filename/path for the new reference file being created. (If given, ignores the -d/--output-dir option and will write the file to the given path. Otherwise, "reference.cnn" will be created in the current directory or specified output directory.)
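The rule for where the new reference ends up can be restated as a small shell function. This is only a sketch of the behavior described above; reference_path is a hypothetical helper, not something CNVkit provides, and the paths are placeholders.

```shell
# reference_path OUTPUT_REFERENCE OUTPUT_DIR
# Prints the path the description above implies for the new reference file.
reference_path() {
  if [ -n "$1" ]; then
    # --output-reference given: it wins, the output directory is ignored.
    printf '%s\n' "$1"
  else
    # Otherwise: reference.cnn in the output directory, or the current directory.
    printf '%s\n' "${2:-.}/reference.cnn"
  fi
}

reference_path "" ""                  # ./reference.cnn
reference_path "" results             # results/reference.cnn
reference_path refs/pool.cnn results  # refs/pool.cnn
```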

--cluster

Calculate and use cluster-specific summary stats in the reference pool to normalize samples.
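Putting the reference-building options together, a pooled reference from several normal samples might be requested as sketched below. All file names are hypothetical, only options documented on this page are used, and RUN is a dry-run convention of this sketch.

```shell
# Dry run by default: RUN=echo only prints the command.
# Export RUN as an empty string to actually execute it.
RUN=${RUN-echo}

pooled_reference_batch() {
  # Tumor*.bam are the test samples; Normal*.bam are pooled into the reference.
  # Every file name here is a hypothetical placeholder.
  $RUN cnvkit batch Tumor1.bam Tumor2.bam \
    --targets my_baits.bed \
    --fasta hg19.fa \
    --access access.hg19.bed \
    --output-reference my_reference.cnn \
    --output-dir results \
    --normal Normal1.bam Normal2.bam
}

pooled_reference_batch
```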

To reuse an existing reference:

-r REFERENCE, --reference REFERENCE

Copy number reference file (.cnn).

Output options:

-d DIRECTORY, --output-dir DIRECTORY

Output directory.

--scatter

Create a whole-genome copy ratio profile as a PDF scatter plot.

--diagram

Create an ideogram of copy ratios on chromosomes as a PDF.
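As a closing illustration, the sketch below reuses an existing reference and requests both plots. The file names and the process count are hypothetical, and RUN is a dry-run convention of this sketch, not part of CNVkit.

```shell
# Dry run by default: RUN=echo only prints the command.
# Export RUN as an empty string to actually execute it.
RUN=${RUN-echo}

reuse_reference_batch() {
  # my_reference.cnn is assumed to come from an earlier run; names are placeholders.
  $RUN cnvkit batch Tumor1.bam Tumor2.bam \
    --reference my_reference.cnn \
    --output-dir results \
    --processes 2 \
    --scatter --diagram
}

reuse_reference_batch
```

Giving --processes an explicit value avoids any ambiguity about whether the next word on the command line is its argument.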