salmon-index(1)

highly-accurate, transcript-level quantification estimates from RNA-seq data

Section 1 salmon bookworm source

Description

SALMON_INDEX

NAME

salmon_index - highly-accurate, transcript-level quantification estimates from RNA-seq data

DESCRIPTION

Index ========== Creates a salmon index.

Command Line Options:

-v [ --version ]

print version string

-h [ --help ]

produce help message

-t [ --transcripts ] arg

Transcript fasta file.

-k [ --kmerLen ] arg (=31)

The size of k-mers that should be used for the quasi index.

-i [ --index ] arg

salmon index.

--gencode

This flag will expect the input transcript fasta to be in GENCODE format, and will split the transcript name at the first ’|’ character. These reduced names will be used in the output and when looking for these transcripts in a gene to transcript GTF.

--features

This flag will expect the input reference to be in the tsv file format, and will split the feature name at the first ’tab’ character. These reduced names will be used in the output and when looking for the sequence of the features.GTF.

--keepDuplicates

This flag will disable the default indexing behavior of discarding sequence-identical duplicate transcripts. If this flag is passed, then duplicate transcripts that appear in the input will be retained and quantified separately.

-p [ --threads ] arg (=2)

Number of threads to use during indexing.

--keepFixedFasta

Retain the fixed fasta file (without short transcripts and duplicates, clipped, etc.) generated during indexing

-f [ --filterSize ] arg (=-1) The size of the Bloom filter that will be
used

by TwoPaCo during indexing. The filter will be of size 2ˆ{filterSize}. The default value of -1 means that the filter size will be automatically set based on the number of distinct k-mers in the input, as estimated by nthll.

--tmpdir arg

The directory location that will be used for TwoPaCo temporary files; it will be created if need be and be removed prior to indexing completion. The default value will cause a (temporary) subdirectory of the salmon index directory to be used for this purpose.

--sparse

Build the index using a sparse sampling of k-mer positions This will require less memory (especially during quantification), but will take longer to construct and can slow down mapping / alignment

-d [ --decoys ] arg

Treat these sequences ids from the reference as the decoys that may have sequence homologous to some known transcript. for example in case of the genome, provide a list of chromosome name --- one per line

-n [ --no-clip ]

Don’t clip poly-A tails from the ends of target sequences

--type arg (=puff)

The type of index to build; the only option is "puff" in this version of salmon.