lastal(1)

NAME

lastal - genome-scale comparison of biological sequences

SYNOPSIS

lastal [options] lastdb-name fasta-sequence-file(s)

DESCRIPTION

Find and align similar sequences.

Cosmetic options:

	-h, --help	show all options and their default settings, and exit
	-V, --version	show version information, and exit
	-v		be verbose: write messages about what lastal is doing
	-f		output format: TAB, MAF, BlastTab, BlastTab+ (default: MAF)

E-value options (default settings):

	-D		query letters per random alignment (1e+06)
	-E		maximum expected alignments per square giga (1e+18/D/refSize/numOfStrands)
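
The default for -E is given above only as an expression. The following minimal Python sketch evaluates it for illustration. The function name and the example inputs are hypothetical, and treating refSize as the number of letters in the reference is an assumption not stated on this page.

```python
def default_max_expected_alignments(d, ref_size, num_strands):
    """Evaluate the documented -E default: 1e+18 / D / refSize / numOfStrands.

    d           -- value of -D (query letters per random alignment)
    ref_size    -- reference size (assumed here to be a letter count)
    num_strands -- 1 for a single strand, 2 for both strands
    """
    if d <= 0 or ref_size <= 0:
        raise ValueError("d and ref_size must be positive")
    if num_strands not in (1, 2):
        raise ValueError("num_strands must be 1 or 2")
    return 1e18 / d / ref_size / num_strands


# Hypothetical inputs: -D at its default of 1e+06, a 3-gigabase
# reference, and both strands searched.
print(default_max_expected_alignments(1e6, 3e9, 2))
```

Because D and refSize both divide the result, a larger -D or a larger reference lowers the default -E in this expression.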

Score options (default settings):

	-r		match score (2 if -M, else 6 if 1<=Q<=4, else 1 if DNA)
	-q		mismatch cost (3 if -M, else 18 if 1<=Q<=4, else 1 if DNA)
	-p		match/mismatch score matrix (protein-protein: BL62, DNA-protein: BL80)
	-X		N/X is ambiguous in: 0=neither sequence, 1=reference, 2=query, 3=both (0)
	-a		gap existence cost (DNA: 7, protein: 11, 1<=Q<=4: 21)
	-b		gap extension cost (DNA: 1, protein: 2, 1<=Q<=4: 9)
	-A		insertion existence cost (a)
	-B		insertion extension cost (b)
	-c		unaligned residue pair cost (off)
	-F		frameshift cost(s) (off)
	-x		maximum score drop for preliminary gapped alignments (z)
	-y		maximum score drop for gapless alignments (min[t*10, x])
	-z		maximum score drop for final gapped alignments (e-1)
	-d		minimum score for gapless alignments (min[e, 2500/n query letters per hit])
	-e		minimum score for gapped alignments
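
The defaults for -x, -y and -z refer to one another: z falls back to e-1, x to z, and y to min[t*10, x]. The Python sketch below resolves that chain for illustration. The function name and the example values of e and t are hypothetical, and any rounding lastal may apply internally is not modelled.

```python
def resolve_score_drops(e, t, x=None, y=None, z=None):
    """Fill in unset -x/-y/-z values from the defaults listed above.

    e -- minimum score for gapped alignments (-e)
    t -- 'temperature' (-t)
    """
    if z is None:
        z = e - 1            # -z default: e-1
    if x is None:
        x = z                # -x default: z
    if y is None:
        y = min(t * 10, x)   # -y default: min[t*10, x]
    return {"x": x, "y": y, "z": z}


# Hypothetical values: e = 120, t = 3.0
print(resolve_score_drops(120, 3.0))   # {'x': 119, 'y': 30.0, 'z': 119}
```

Setting -x explicitly also changes the -y default, since y is capped at x.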

Initial-match options (default settings):

	-m		maximum initial matches per query position (10)
	-l		minimum length for initial matches (1)
	-L		maximum length for initial matches (infinity)
	-k		use initial matches starting at every k-th position in each query (1)
	-W		use "minimum" positions in sliding windows of W consecutive positions

Miscellaneous options (default settings):

	-s		strand: 0=reverse, 1=forward, 2=both (2 for DNA, 1 for protein)
	-S		score matrix applies to forward strand of: 0=reference, 1=query (0)
	-K		omit alignments whose query range lies in >= K others with > score (off)
	-C		omit gapless alignments in >= C others with > score-per-length (off)
	-P		number of parallel threads (1)
	-i		query batch size (64M if multi-volume, else off)
	-M		find minimum-difference alignments (faster but cruder)
	-T		type of alignment: 0=local, 1=overlap (0)
	-n		maximum gapless alignments per query position (infinity if m=0, else m)
	-N		stop after the first N alignments per query strand
	-R		lowercase & simple-sequence options (the same as was used by lastdb)
	-u		mask lowercase during extensions: 0=never, 1=gapless, 2=gapless+postmask, 3=always (2 if lastdb -c and Q!=pssm, else 0)
	-w		suppress repeats inside exact matches, offset by <= this distance (1000)
	-G		genetic code (1)
	-t		'temperature' for calculating probabilities (1/lambda)
	-g		'gamma' parameter for gamma-centroid and LAMA (1)
	-j		output type: 0=match counts, 1=gapless, 2=redundant gapped, 3=gapped, 4=column ambiguity estimates, 5=gamma-centroid, 6=LAMA, 7=expected counts (3)

	-J		score type: 0=ordinary, 1=full (1 for new-style frameshifts, else 0)
	-Q		input format: fastx, keep, sanger, solexa, illumina, prb, pssm (default: fasta)
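
As an illustration of how the options combine with the SYNOPSIS, the Python sketch below assembles a command line from -P, -f and -s. It only builds the argument list and does not run anything. The helper name build_lastal_argv, the database name "mydb" and the file name "reads.fa" are hypothetical.

```python
import shlex

# Output formats listed under -f above.
FORMATS = ("TAB", "MAF", "BlastTab", "BlastTab+")


def build_lastal_argv(db_name, query_files, threads=None,
                      out_format=None, strand=None, program="lastal"):
    """Build an argument list: program [options] lastdb-name fasta-file(s)."""
    argv = [program]
    if threads is not None:
        argv += ["-P", str(threads)]       # number of parallel threads
    if out_format is not None:
        if out_format not in FORMATS:
            raise ValueError("unknown output format: %r" % out_format)
        argv += ["-f", out_format]         # output format
    if strand is not None:
        if strand not in (0, 1, 2):
            raise ValueError("strand must be 0, 1 or 2")
        argv += ["-s", str(strand)]        # 0=reverse, 1=forward, 2=both
    argv.append(db_name)
    argv.extend(query_files)
    return argv


argv = build_lastal_argv("mydb", ["reads.fa"], threads=4, out_format="TAB")
print(shlex.join(argv))   # lastal -P 4 -f TAB mydb reads.fa
```

The resulting list could be handed to subprocess.run; options left as None are omitted so that lastal applies its own defaults.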

Split options:

--split

do split alignment

--splice

do spliced alignment

--split-f=FMT

output format: MAF, MAF+

--split-d=D

RNA direction: 0=reverse, 1=forward, 2=mixed (default: 1)

--split-c=PROB

cis-splice probability per base (default: 0.004)

--split-t=PROB

trans-splice probability per base (default: 1e-05)

--split-M=MEAN

mean of ln[intron length] (default: 7.0)

--split-S=SDEV

standard deviation of ln[intron length] (default: 1.7)

--split-m=PROB

maximum mismap probability (default: 1.0)

--split-s=INT

minimum alignment score (default: e OR e+t*ln[100])

--split-n

write original, not split, alignments

--split-b=B

maximum memory (default: 8T for split, 8G for spliced)
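
The --split-M and --split-S options give the mean and standard deviation of ln[intron length], which suggests a log-normal model of intron lengths. The Python sketch below shows what the default values (7.0 and 1.7) would imply under that reading. The function names are hypothetical, and how the split algorithm actually uses these parameters is not described on this page.

```python
import math


def median_intron_length(mean_ln=7.0):
    """Median of a log-normal distribution: exp(mean of ln[length])."""
    return math.exp(mean_ln)


def intron_length_log_density(length, mean_ln=7.0, sdev_ln=1.7):
    """Natural log of the log-normal probability density at `length`."""
    if length <= 0:
        raise ValueError("length must be positive")
    z = (math.log(length) - mean_ln) / sdev_ln
    return -0.5 * z * z - math.log(length * sdev_ln * math.sqrt(2.0 * math.pi))


# With the default --split-M=7.0 the median is exp(7), about 1097 bases.
print(round(median_intron_length()))

# Compare the log-density of a 100-base and a 100000-base intron.
print(intron_length_log_density(100), intron_length_log_density(100000))
```

Under this model, raising --split-M shifts the whole distribution towards longer introns, while raising --split-S widens it on the log scale.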