cuteSV(1)

prediction of structural variants from sequence alignments


NAME

cuteSV - prediction of structural variants from sequence alignments

DESCRIPTION

usage: cuteSV [-h] [--version] [-t THREADS] [-b BATCHES] [-S SAMPLE]
              [--retain_work_dir] [--report_readid] [-p MAX_SPLIT_PARTS]
              [-q MIN_MAPQ] [-r MIN_READ_LEN] [-md MERGE_DEL_THRESHOLD]
              [-mi MERGE_INS_THRESHOLD] [-s MIN_SUPPORT] [-l MIN_SIZE]
              [-L MAX_SIZE] [-sl MIN_SIGLENGTH] [--genotype]
              [--gt_round GT_ROUND] [-Ivcf IVCF]
              [--max_cluster_bias_INS MAX_CLUSTER_BIAS_INS]
              [--diff_ratio_merging_INS DIFF_RATIO_MERGING_INS]
              [--max_cluster_bias_DEL MAX_CLUSTER_BIAS_DEL]
              [--diff_ratio_merging_DEL DIFF_RATIO_MERGING_DEL]
              [--max_cluster_bias_INV MAX_CLUSTER_BIAS_INV]
              [--max_cluster_bias_DUP MAX_CLUSTER_BIAS_DUP]
              [--max_cluster_bias_TRA MAX_CLUSTER_BIAS_TRA]
              [--diff_ratio_filtering_TRA DIFF_RATIO_FILTERING_TRA]
              [BAM] reference output work_dir

Current version: v1.0.11
Author: Tao Jiang
Contact: tjiang@hit.edu.cn

If you use cuteSV in your work, please cite:

Jiang T et al. Long-read-based human genomic structural variation detection with cuteSV. Genome Biol 21, 189 (2020). https://doi.org/10.1186/s13059-020-02107-y

Suggestions:

For PacBio CLR data:

--max_cluster_bias_INS 100
--diff_ratio_merging_INS 0.3
--max_cluster_bias_DEL 200
--diff_ratio_merging_DEL 0.5

For PacBio CCS (HiFi) data:

--max_cluster_bias_INS 1000
--diff_ratio_merging_INS 0.9
--max_cluster_bias_DEL 1000
--diff_ratio_merging_DEL 0.5

For ONT data:

--max_cluster_bias_INS 100
--diff_ratio_merging_INS 0.3
--max_cluster_bias_DEL 100
--diff_ratio_merging_DEL 0.3
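The suggested values above can be kept as data and expanded into command-line options. The following is a minimal Python sketch; PLATFORM_PRESETS and preset_args are hypothetical helper names that are not part of cuteSV, and the numbers are copied from the suggestions above.

```python
# Hypothetical helper, not part of cuteSV.
# The values are copied from the per-platform suggestions above.
PLATFORM_PRESETS = {
    "pacbio_clr": {
        "max_cluster_bias_INS": 100,
        "diff_ratio_merging_INS": 0.3,
        "max_cluster_bias_DEL": 200,
        "diff_ratio_merging_DEL": 0.5,
    },
    "pacbio_ccs": {
        "max_cluster_bias_INS": 1000,
        "diff_ratio_merging_INS": 0.9,
        "max_cluster_bias_DEL": 1000,
        "diff_ratio_merging_DEL": 0.5,
    },
    "ont": {
        "max_cluster_bias_INS": 100,
        "diff_ratio_merging_INS": 0.3,
        "max_cluster_bias_DEL": 100,
        "diff_ratio_merging_DEL": 0.3,
    },
}


def preset_args(platform):
    """Return the suggested options for a platform as a flat argument list."""
    args = []
    for option, value in PLATFORM_PRESETS[platform].items():
        args += [f"--{option}", str(value)]
    return args


print(" ".join(preset_args("ont")))
```

The returned list can be appended to any other cuteSV options before the positional arguments.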

positional arguments:

[BAM]

Sorted .bam file from NGMLR or Minimap2.

reference

The reference genome in fasta format.

output

Output VCF format file.

work_dir

Work directory for distributed jobs.
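The order of the positional arguments matters: BAM, reference, output, work_dir. A minimal Python sketch of assembling such a call follows; cutesv_command is a hypothetical helper, the file names are placeholders, and the commented-out run assumes cuteSV is installed and on PATH.

```python
import shlex


def cutesv_command(bam, reference, output_vcf, work_dir, threads=16):
    """Assemble a cuteSV argument list; positional order follows the usage line."""
    return [
        "cuteSV",
        "--threads", str(threads),
        bam, reference, output_vcf, work_dir,
    ]


# Placeholder file names, chosen only for illustration.
cmd = cutesv_command("sample.sorted.bam", "ref.fa", "sample.vcf", "work/")
print(shlex.join(cmd))

# To actually run the command:
# import subprocess
# subprocess.run(cmd, check=True)
```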

optional arguments:

-h, --help

show this help message and exit

--version, -v

show program’s version number and exit

-t THREADS, --threads THREADS

Number of threads to use.[16]

-b BATCHES, --batches BATCHES

Length of the genome segmentation interval processed per batch.[10000000]

-S SAMPLE, --sample SAMPLE

Sample name/id

--retain_work_dir

Enable to retain temporary folder and files.

--report_readid

Enable to report supporting read ids for each SV.

Collection of SV signatures:

-p MAX_SPLIT_PARTS, --max_split_parts MAX_SPLIT_PARTS

Maximum number of split segments a read may be aligned in before it is ignored. All split segments are considered when using -1. (Recommend -1 when applying assembly-based alignment.)[7]

-q MIN_MAPQ, --min_mapq MIN_MAPQ

Minimum mapping quality value of alignment to be taken into account.[20]

-r MIN_READ_LEN, --min_read_len MIN_READ_LEN

Ignore reads whose reported alignments are all no longer than this many bp.[500]

-md MERGE_DEL_THRESHOLD, --merge_del_threshold MERGE_DEL_THRESHOLD

Maximum distance of deletion signals to be merged. In our paper, we used -md 500 to process HG002 real human sample data.[0]

-mi MERGE_INS_THRESHOLD, --merge_ins_threshold MERGE_INS_THRESHOLD

Maximum distance of insertion signals to be merged. In our paper, we used -mi 500 to process HG002 real human sample data.[100]
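To illustrate what a merge distance such as -md means, the Python sketch below merges deletion signals whose gap on the reference is at most a given threshold. It is illustrative only and is not cuteSV's implementation: the function name, the (start, end) representation, and the choice to sum the deleted bases are all assumptions.

```python
def merge_nearby_deletions(signals, max_distance):
    """Merge (start, end) deletion signals whose gap is <= max_distance.

    Illustrative sketch, not cuteSV's code. Returns a list of
    (start, end, deleted_bases) tuples.
    """
    merged = []  # entries are [start, end, deleted_bases]
    for start, end in sorted(signals):
        if merged and start - merged[-1][1] <= max_distance:
            # Close enough to the previous signal: extend it and
            # add up the deleted bases.
            merged[-1][1] = max(merged[-1][1], end)
            merged[-1][2] += end - start
        else:
            merged.append([start, end, end - start])
    return [tuple(entry) for entry in merged]


signals = [(1000, 1040), (1100, 1160), (5000, 5080)]
print(merge_nearby_deletions(signals, 500))
print(merge_nearby_deletions(signals, 0))
```

With a threshold of 500 the first two signals (60 bp apart) collapse into one; with a threshold of 0 all three stay separate.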

Generation of SV clusters:

-s MIN_SUPPORT, --min_support MIN_SUPPORT

Minimum number of reads that support an SV to be reported.[10]

-l MIN_SIZE, --min_size MIN_SIZE

Minimum size of SV to be reported.[30]

-L MAX_SIZE, --max_size MAX_SIZE

Maximum size of SV to be reported.[100000]

-sl MIN_SIGLENGTH, --min_siglength MIN_SIGLENGTH

Minimum length of SV signal to be extracted.[10]
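The reporting thresholds above combine as a simple filter. The Python sketch below shows one way to read them, using the documented defaults; passes_report_filters is a hypothetical name, and treating the size bounds as inclusive is an assumption rather than documented cuteSV behaviour.

```python
def passes_report_filters(sv_length, support,
                          min_support=10, min_size=30, max_size=100000):
    """Return True if an SV candidate meets the -s, -l and -L thresholds.

    Sketch only: inclusive bounds are an assumption.
    """
    return support >= min_support and min_size <= abs(sv_length) <= max_size


print(passes_report_filters(sv_length=50, support=12))   # enough reads, size in range
print(passes_report_filters(sv_length=20, support=12))   # below the minimum size
print(passes_report_filters(sv_length=50, support=3))    # too few supporting reads
```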

Computing genotypes:

--genotype

Enable to generate genotypes.

--gt_round GT_ROUND

Maximum number of iteration rounds for the alignment search when genotyping is performed.[500]

Force calling:

-Ivcf IVCF

Optional input VCF file. When given, force calling is performed. [NULL]
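The genotyping and force-calling options can be assembled the same way as any other optional arguments. A minimal Python sketch follows; calling_options is a hypothetical helper and known_svs.vcf is a placeholder file name.

```python
def calling_options(genotype=False, gt_round=None, ivcf=None):
    """Build the optional genotyping and force-calling arguments."""
    opts = []
    if genotype:
        opts.append("--genotype")
    if gt_round is not None:
        opts += ["--gt_round", str(gt_round)]
    if ivcf is not None:
        # -Ivcf takes the path of the VCF used for force calling.
        opts += ["-Ivcf", ivcf]
    return opts


print(calling_options(genotype=True))
print(calling_options(genotype=True, ivcf="known_svs.vcf"))
```

The resulting list goes before the positional arguments on the cuteSV command line.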

Advanced:

--max_cluster_bias_INS MAX_CLUSTER_BIAS_INS

Maximum distance to cluster reads together for insertion.[100]

--diff_ratio_merging_INS DIFF_RATIO_MERGING_INS

Do not merge breakpoints with basepair identity more than [0.3] for insertion.

--max_cluster_bias_DEL MAX_CLUSTER_BIAS_DEL

Maximum distance to cluster reads together for deletion.[200]

--diff_ratio_merging_DEL DIFF_RATIO_MERGING_DEL

Do not merge breakpoints with basepair identity more than [0.5] for deletion.

--max_cluster_bias_INV MAX_CLUSTER_BIAS_INV

Maximum distance to cluster reads together for inversion.[500]

--max_cluster_bias_DUP MAX_CLUSTER_BIAS_DUP

Maximum distance to cluster reads together for duplication.[500]

--max_cluster_bias_TRA MAX_CLUSTER_BIAS_TRA

Maximum distance to cluster reads together for translocation.[50]

--diff_ratio_filtering_TRA DIFF_RATIO_FILTERING_TRA

Filter breakpoints with basepair identity less than [0.6] for translocation.