lambda2-mkindexn(1)

the Local Aligner for Massive Biological DatA

LAMBDA2_MKINDEXN

NAME

lambda2_mkindexn - the Local Aligner for Massive Biological DatA

SYNOPSIS

lambda2 mkindexn [OPTIONS] -d DATABASE.fasta [-i INDEX.lambda]

DESCRIPTION

Lambda is a local aligner optimized for many query sequences and searches in protein space. It is compatible with BLAST, but much faster than BLAST and many other comparable tools.

Detailed information is available in the wiki: <https://github.com/seqan/lambda/wiki>

This is the indexer binary for creating Lambda-compatible databases.

OPTIONS

-h, --help

Display the help message.

-hh, --full-help

Display the help message with advanced options.

--version

Display version information.

--copyright

Display long copyright information.

-v, --verbosity INTEGER

Display more/less diagnostic output during operation: 0 [only errors]; 1 [default]; 2 [+run-time, options and statistics]. In range [0..2]. Default: 1.

Input Options:

-d, --database INPUT_FILE

Database sequences. Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression.
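The extension rule above can be sketched as a small pre-flight check. This is only one reading of the list in this man page, not Lambda's own validation code; the helper name has_accepted_database_extension is hypothetical, and case-sensitive matching is an assumption.

```python
import re

# Extensions listed for -d/--database; each may carry one compression suffix.
SEQUENCE_EXTENSIONS = ("sam", "raw", "gbk", "frn", "fq", "fna", "ffn",
                       "fastq", "fasta", "faa", "fa", "embl")
COMPRESSION_EXTENSIONS = ("gz", "bz2", "bgzf")

# ".bam" is listed without the optional compression suffix.
_DATABASE_PATTERN = re.compile(
    r"\.(?:(?:%s)(?:\.(?:%s))?|bam)$"
    % ("|".join(SEQUENCE_EXTENSIONS), "|".join(COMPRESSION_EXTENSIONS))
)


def has_accepted_database_extension(filename):
    """Return True if filename ends in one of the listed extensions.

    Hypothetical helper; assumes the match is case-sensitive.
    """
    return _DATABASE_PATTERN.search(filename) is not None


if __name__ == "__main__":
    for name in ("db.fasta", "db.fa.gz", "reads.bam", "notes.txt"):
        print(name, has_accepted_database_extension(name))
```

Checking the name before starting a long indexing run lets a wrapper script fail early with a clear message.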

-m, --acc-tax-map INPUT_FILE

An NCBI or UniProt accession-to-taxid mapping file. Download from ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/ or ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/idmapping/ . Valid filetypes are: .dat[.*] and .accession2taxid[.*], where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression.
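To inspect a mapping file before passing it to -m, a short reader can help. The sketch below assumes the NCBI accession2taxid layout (a header line, then tab-separated columns accession, accession.version, taxid, gi); it is not Lambda's parser, the function name read_accession2taxid is hypothetical, and keying the result on accession.version is an illustrative choice. The sample rows in the usage part are made up for the demonstration.

```python
def read_accession2taxid(lines):
    """Map accession.version to taxid from NCBI-style tab-separated lines.

    Hypothetical helper; assumes the first line is a column header.
    """
    mapping = {}
    iterator = iter(lines)
    next(iterator, None)  # skip the header line
    for line in iterator:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        mapping[fields[1]] = int(fields[2])
    return mapping


if __name__ == "__main__":
    sample = [
        "accession\taccession.version\ttaxid\tgi\n",
        "X00001\tX00001.1\t9913\t2\n",  # illustrative row, not real data
    ]
    print(read_accession2taxid(sample))
```

For a compressed file, the same function can be fed a text-mode handle, for example from gzip.open(path, "rt").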

-x, --tax-dump-dir INPUT_DIRECTORY

A directory that contains nodes.dmp and names.dmp; unzipped from ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz
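A wrapper can confirm that the directory passed to -x holds both required files before indexing starts. This is a minimal sketch; the helper name missing_tax_dump_files is hypothetical and the check only looks for the two file names given above.

```python
import tempfile
from pathlib import Path

# File names required by -x/--tax-dump-dir according to this man page.
REQUIRED_TAX_DUMP_FILES = ("nodes.dmp", "names.dmp")


def missing_tax_dump_files(directory):
    """Return the required file names that are absent from directory."""
    directory = Path(directory)
    return [name for name in REQUIRED_TAX_DUMP_FILES
            if not (directory / name).is_file()]


if __name__ == "__main__":
    # Demonstrate on a throwaway directory that only holds nodes.dmp.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "nodes.dmp").touch()
        print(missing_tax_dump_files(tmp))
```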

Output Options:

-i, --index OUTPUT_DIRECTORY

The output directory for the index files (defaults to "DATABASE.lambda"). Valid filetype is: .lambda.

--db-index-type STRING

Index type: FM-index (full-text index in minute space) or bidirectional FM-index. One of fm and bifm. Default: fm.

--truncate-ids BOOL

Truncate IDs at first whitespace. This saves a lot of space and is irrelevant for all LAMBDA output formats other than BLAST Pairwise (.m0). One of 1, ON, TRUE, T, YES, 0, OFF, FALSE, F, and NO. Default: on.
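The effect of truncation on a sequence header can be illustrated as follows. This sketch only mirrors the rule stated above (cut at the first whitespace character); it is not Lambda's implementation, the function name truncate_id is hypothetical, and the example header is made up.

```python
import re


def truncate_id(header):
    """Return the part of a sequence header before the first whitespace."""
    return re.split(r"\s", header, maxsplit=1)[0]


if __name__ == "__main__":
    # Illustrative header: an identifier followed by a free-text description.
    print(truncate_id("seq42.1 hypothetical organism chromosome, complete"))
```

Only the identifier part is kept, which is why the free-text description is unavailable to output formats that would otherwise print it.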

Algorithm:

--algorithm STRING

Algorithm for SA construction (also used for FM; see Memory Requirements below!). One of mergesort, quicksortbuckets, quicksort, radixsort, and skew7ext. Default: radixsort.

-t, --threads INTEGER

Number of threads to run concurrently. Default: autodetected.

--tmp-dir OUTPUT_DIRECTORY

Temporary directory used by skew; defaults to the working directory.
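The options above can be combined into a single invocation. The following sketch assembles such a command line from Python; it assumes lambda2 is on the PATH, uses only flags documented in this page, and the helper name build_mkindexn_command is hypothetical. It builds the argument list without running anything.

```python
import shlex

INDEX_TYPES = ("fm", "bifm")
ALGORITHMS = ("mergesort", "quicksortbuckets", "quicksort",
              "radixsort", "skew7ext")


def build_mkindexn_command(database, index=None, index_type=None,
                           algorithm=None, threads=None, tmp_dir=None):
    """Assemble an argument list for `lambda2 mkindexn` (hypothetical helper).

    Options left as None are omitted so the tool's defaults apply.
    """
    cmd = ["lambda2", "mkindexn", "-d", str(database)]
    if index is not None:
        cmd += ["-i", str(index)]
    if index_type is not None:
        if index_type not in INDEX_TYPES:
            raise ValueError("index_type must be one of %s" % (INDEX_TYPES,))
        cmd += ["--db-index-type", index_type]
    if algorithm is not None:
        if algorithm not in ALGORITHMS:
            raise ValueError("algorithm must be one of %s" % (ALGORITHMS,))
        cmd += ["--algorithm", algorithm]
    if threads is not None:
        cmd += ["-t", str(int(threads))]
    if tmp_dir is not None:
        cmd += ["--tmp-dir", str(tmp_dir)]
    return cmd


if __name__ == "__main__":
    cmd = build_mkindexn_command("db.fasta", index="db.lambda", threads=4)
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Passing the list to subprocess.run avoids shell quoting problems with paths that contain spaces.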

REMARKS

Please see the wiki (<https://github.com/seqan/lambda/wiki>) for more information on which indexes to choose and which algorithms to pick.

Note that the indexes created are binary and are not portable between CPUs of different endianness. Also, the on-disk format is still subject to change between Lambda versions.

LEGAL

lambda2 mkindexn Copyright:
2013-2019 Hannes Hauswedell; released under the GNU AGPL v3 (or later).
2016-2019 Knut Reinert and Freie Universität Berlin; released under the 3-clause BSDL.
SeqAn Copyright:
2006-2015 Knut Reinert, FU-Berlin; released under the 3-clause BSDL.
In your academic works please cite:
Hauswedell et al. (2014); doi: 10.1093/bioinformatics/btu439
For full copyright and/or warranty information see --copyright.