last-train(1)

NAME

last-train - Try to find suitable score parameters for aligning the given sequences

SYNOPSIS

last-train [options] lastdb-name sequence-file(s)

DESCRIPTION

Try to find suitable score parameters (such as substitution scores and gap costs) for aligning the sequences in sequence-file(s) against the database lastdb-name.
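The argument order in the synopsis can be illustrated with a small Python sketch that only assembles a command line and does not run it. The names "mydb" and "queries.fa" are placeholders, and the helper build_last_train_command is hypothetical, not part of LAST.

```python
import shlex


def build_last_train_command(lastdb_name, sequence_files, options=()):
    """Assemble an argv list in the order given by the synopsis:
    last-train [options] lastdb-name sequence-file(s)
    """
    return ["last-train", *options, lastdb_name, *sequence_files]


# Placeholder names: "mydb" is an existing lastdb database name,
# "queries.fa" is a sequence file. Both are assumptions for illustration.
argv = build_last_train_command(
    "mydb",
    ["queries.fa"],
    options=["-P", "4", "--revsym"],  # flags taken from the OPTIONS section
)

# Show the command as it would be typed in a shell.
print(shlex.join(argv))
```

If LAST is installed, the same list could be handed to subprocess.run; the sketch stops short of that so it stays runnable anywhere.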

OPTIONS

-h, --help

show this help message and exit

-v, --verbose

show more details of intermediate steps

Training options:

--revsym

force reverse-complement symmetry

--matsym

force symmetric substitution matrix

--gapsym

force insertion/deletion symmetry

--pid=PID

skip alignments with > PID% identity (default: 100)

--postmask=NUMBER

skip mostly-lowercase alignments (default: 1)

--sample-number=N

number of random sequence samples (default: 20000 if --codon else 500)

--sample-length=L

length of each sample (default: 2000)

--scale=S

output scores in units of 1/S bits

--codon

DNA queries & protein reference, with frameshifts
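As a rough illustration of the training options above, the following Python sketch checks the two matrix properties that --matsym and --revsym are understood to enforce, and converts a score to bits as described for --scale. It is a sketch under assumptions: the reading of reverse-complement symmetry as score(x, y) = score(comp(x), comp(y)) is an interpretation, and the toy matrix and the scale value 4 are arbitrary examples rather than LAST output.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def simple_matrix(match, mismatch):
    """Build a 4x4 DNA matrix with one match score and one mismatch score."""
    return {x: {y: (match if x == y else mismatch) for y in "ACGT"} for x in "ACGT"}


def is_symmetric(matrix):
    """Property associated with --matsym: score(x, y) == score(y, x)."""
    return all(matrix[x][y] == matrix[y][x] for x in matrix for y in matrix[x])


def is_revcomp_symmetric(matrix):
    """Property associated with --revsym (as interpreted here):
    score(x, y) == score(comp(x), comp(y))."""
    return all(
        matrix[x][y] == matrix[COMPLEMENT[x]][COMPLEMENT[y]]
        for x in matrix
        for y in matrix[x]
    )


def score_to_bits(score, scale):
    """With --scale=S, scores are in units of 1/S bits, so bits = score / S."""
    return score / scale


# Toy matrix (an assumption): A->G and its reverse complement T->C are
# penalised less than the opposite substitutions G->A and C->T.
toy = simple_matrix(5, -5)
toy["A"]["G"] = -2
toy["T"]["C"] = -2

print(is_symmetric(toy))          # False: score(A,G) != score(G,A)
print(is_revcomp_symmetric(toy))  # True: score(A,G) == score(T,C), and so on
print(score_to_bits(12, 4))       # 3.0 bits for a score of 12 when S = 4
```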

Initial parameter options:

-r SCORE

match score (default: 6 if Q>=1, or 5 if DNA, or 12)

-q COST

mismatch cost (default: 18 if Q>=1, or 5 if DNA, or 7)

-p NAME

match/mismatch score matrix

-a COST

gap existence cost (default: 21 if Q>=1, else 15)

-b COST

gap extension cost (default: 9 if Q>=1, else 3)

-A COST

insertion existence cost

-B COST

insertion extension cost

-F LIST

frameshift probabilities: del-1,del-2,ins+1,ins+2 (default: 1-b,1-b,1-B,1-B)
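The existence and extension costs above combine into a per-gap cost. The Python sketch below assumes the usual affine convention, where a gap of k letters costs existence + extension * k, and assumes that the insertion costs (-A, -B) fall back to the deletion costs (-a, -b) when they are not given. Both assumptions should be checked against the LAST documentation. The values A=21 and B=9 in the second call are arbitrary.

```python
def gap_cost(existence, extension, length):
    """Affine gap cost for a gap of `length` letters (assumed convention)."""
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return existence + extension * length


def indel_costs(length, a, b, A=None, B=None):
    """Costs of a deletion and an insertion of the same length.

    a, b: gap existence / extension costs (-a, -b)
    A, B: insertion existence / extension costs (-A, -B); assumed here to
          default to a and b when omitted.
    """
    A = a if A is None else A
    B = b if B is None else B
    return {
        "deletion": gap_cost(a, b, length),
        "insertion": gap_cost(A, B, length),
    }


# Using the listed defaults a=15, b=3 for a gap of 4 letters.
print(indel_costs(4, a=15, b=3))              # {'deletion': 27, 'insertion': 27}
# Asymmetric costs, as might arise when --gapsym is not used.
print(indel_costs(4, a=15, b=3, A=21, B=9))   # {'deletion': 27, 'insertion': 57}
```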

Alignment options:

-D LENGTH

query letters per random alignment (default: 1e6)

-E EG2

maximum expected alignments per square giga (that is, between two random sequences of 1 billion letters each)

-s STRAND

0=reverse, 1=forward, 2=both (default: 2 if DNA, else 1)

-S NUMBER

score matrix applies to forward strand of: 0=reference, 1=query (default: 1)

-C COUNT

omit gapless alignments in COUNT others with > score-per-length

-T NUMBER

type of alignment: 0=local, 1=overlap (default: 0)

-R DIGITS

lowercase & simple-sequence options

-m COUNT

maximum initial matches per query position (default: 10)

-k STEP

use initial matches starting at every STEP-th position in each query (default: 1)

-P THREADS

number of parallel threads

-X NUMBER

N/X is ambiguous in: 0=neither sequence, 1=reference, 2=query, 3=both (default: 0)

-Q NAME

input format: fastx, sanger (default: fasta)
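Several alignment options above take numeric codes. The Python sketch below simply restates those codes as lookup tables, derives the default strand setting from the sequence type, and lists the query positions that a given -k STEP would select. Starting the count at position 0 is an assumption made for illustration; the helper names are hypothetical and not part of LAST.

```python
# Numeric codes copied from the option descriptions above.
STRAND = {0: "reverse", 1: "forward", 2: "both"}                         # -s
MATRIX_STRAND = {0: "reference", 1: "query"}                             # -S
ALIGNMENT_TYPE = {0: "local", 1: "overlap"}                              # -T
AMBIGUOUS = {0: "neither sequence", 1: "reference", 2: "query", 3: "both"}  # -X


def default_strand(is_dna):
    """Default for -s: 2 (both strands) if DNA, else 1 (forward only)."""
    return 2 if is_dna else 1


def seed_positions(query_length, step=1):
    """Query positions used for initial matches with -k STEP.

    Assumes positions are counted from 0; the tool's exact offset
    convention is not stated in this page.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    return list(range(0, query_length, step))


print(STRAND[default_strand(is_dna=True)])    # both
print(STRAND[default_strand(is_dna=False)])   # forward
print(seed_positions(10, step=3))             # [0, 3, 6, 9]
```

A larger STEP gives fewer starting positions, which would be expected to trade sensitivity for speed.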