skesa(1)

strategic Kmer extension for scrupulous assemblies

NAME

skesa - strategic Kmer extension for scrupulous assemblies

DESCRIPTION

SKESA is a de Bruijn graph-based de novo assembler designed for assembling reads of microbial genomes sequenced using Illumina. Comparison with SPAdes and MegaHit shows that SKESA produces assemblies with high sequence quality and contiguity, handles low-level contamination in reads, is fast, and produces an identical assembly for the same input when run multiple times with the same or different compute resources. SKESA has been used for assembling over 272,000 read sets in the Sequence Read Archive at NCBI and for real-time pathogen detection.

OPTIONS

General options:

-h [ --help ]

Produce help message

-v [ --version ]

Print version

--cores arg (=0)

Number of cores to use (default all) [integer]

--memory arg (=32)

Memory available (GB, only for sorted counter) [integer]

--hash_count

Use hash counter [flag]

--estimated_kmers arg (=100)

Estimated number of unique kmers for the Bloom filter (in millions, only for hash counter) [integer]

--skip_bloom_filter

Do not use a Bloom filter; use --estimated_kmers as the hash table size (only for hash counter) [flag]
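The descriptions above tie --memory to the sorted counter and --estimated_kmers to the hash counter. The sketch below shows one way to keep those pairings straight when building a command line. The helper name counter_opts, the input file name, and all numeric values are hypothetical; the commands are only printed, not executed.

```shell
#!/bin/sh
# counter_opts MODE VALUE  (hypothetical helper, not part of skesa)
#   sorted VALUE -> --memory VALUE            (VALUE in GB)
#   hash   VALUE -> --hash_count --estimated_kmers VALUE  (VALUE in millions of kmers)
counter_opts() {
    case "$1" in
        sorted) echo "--memory $2" ;;
        hash)   echo "--hash_count --estimated_kmers $2" ;;
        *)      echo "unknown counter mode: $1" >&2; return 1 ;;
    esac
}

# Print the resulting command lines; reads.fastq is a placeholder name.
echo "skesa --cores 4 $(counter_opts sorted 16) --fastq reads.fastq"
echo "skesa --cores 4 $(counter_opts hash 200) --fastq reads.fastq"
```

Keeping the pairing in one place avoids passing --memory together with --hash_count, where, according to the option descriptions, --memory would not apply.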

Input/output options (at least one input providing reads for assembly must be specified):

--fasta arg

Input fasta file(s) (can be specified multiple times for different runs) [string]

--fastq arg

Input fastq file(s) (can be specified multiple times for different runs) [string]

--use_paired_ends

Indicates that a single (not comma separated) fasta/fastq file contains paired reads [flag]

--sra_run arg

Input SRA run accession (can be specified multiple times for different runs) [string]

--contigs_out arg

Output file for contigs (stdout if not specified) [string]
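As an illustration of the input options, the sketch below assembles two command lines for paired reads: one with the mates in two files and one with both mates in a single file. The comma-separated form is inferred from the wording of --use_paired_ends ("a single (not comma separated) fasta/fastq file"). All file names are hypothetical, and the commands are only printed, not executed.

```shell
#!/bin/sh
# Hypothetical file names, used for illustration only.
reads_1="sample_1.fastq"
reads_2="sample_2.fastq"
interleaved="sample_interleaved.fastq"
out="sample.contigs.fa"

# Mates in two files: joined by a comma in a single --fastq argument.
cmd_two_files="skesa --fastq ${reads_1},${reads_2} --contigs_out ${out}"

# Both mates in one file: --use_paired_ends marks the file as paired.
cmd_interleaved="skesa --fastq ${interleaved} --use_paired_ends --contigs_out ${out}"

# Print instead of running, so the sketch works without skesa installed.
echo "$cmd_two_files"
echo "$cmd_interleaved"
```

If --contigs_out is omitted, the contigs go to stdout, so the output can also be redirected by the shell.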

Assembly options:

--kmer arg (=21)

Minimal kmer length for assembly [integer]

--min_count arg

Minimal count for kmers retained for comparing alternate choices [integer]

--max_kmer_count arg

Minimum acceptable average count for estimating the maximal kmer length in reads [integer]

--vector_percent arg (=0.05)

Count for vectors as a fraction of the read number (1. disables) [float (0,1]]

--insert_size arg

Expected insert size for paired reads (if not provided, it will be estimated) [integer]

--steps arg (=11)

Number of assembly iterations from minimal to maximal kmer length in reads [integer]

--fraction arg (=0.1)

Maximum noise to signal ratio acceptable for extension [float [0,1)]

--max_snp_len arg (=150)

Maximal SNP length [integer]

--min_contig arg (=200)

Minimal contig length reported in output [integer]

--allow_snps

Allow additional step for SNP discovery [flag]
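The interval notation in the assembly options is easy to misread: --vector_percent accepts (0,1], so 0 is excluded and 1 is allowed, while --fraction accepts [0,1), so 0 is allowed and 1 is excluded. The sketch below checks a value against those two intervals before it is passed on. The helper name check_float_opt is hypothetical, and the check assumes the value is already a plain decimal number.

```shell
#!/bin/sh
# check_float_opt KIND VALUE  (hypothetical helper, not part of skesa)
#   vector_percent: 0 <  VALUE <= 1
#   fraction:       0 <= VALUE <  1
# Exits 0 if VALUE lies in the interval, 1 otherwise.
check_float_opt() {
    awk -v kind="$1" -v v="$2" 'BEGIN {
        v = v + 0    # force numeric comparison
        if (kind == "vector_percent")  ok = (v > 0 && v <= 1)
        else if (kind == "fraction")   ok = (v >= 0 && v < 1)
        else                           ok = 0
        exit(ok ? 0 : 1)
    }'
}

# The documented defaults fall inside their intervals.
check_float_opt vector_percent 0.05 && echo "vector_percent 0.05: ok"
check_float_opt fraction 0.1 && echo "fraction 0.1: ok"
# The boundary value 1 is valid for --vector_percent but not for --fraction.
check_float_opt fraction 1 || echo "fraction 1: out of range"
```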

Debugging options:

--force_single_ends

Don’t use paired-end information [flag]

--seeds arg

Input file with seeds [string]

--all arg

Output fasta for each iteration [string]

--dbg_out arg

Output kmer file [string]

--hist arg

File for histogram [string]

--connected_reads arg

File for connected paired reads [string]

AUTHOR

This manpage was written by Andreas Tille for the Debian distribution and may be used by others.