tantan(1)
NAME
tantan - low complexity and tandem repeat masker for biosequences
SYNOPSIS
tantan [options] fasta-sequence-file(s)
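As an illustration of the synopsis, the sketch below assembles a tantan command line from a few of the options documented in this page. The helper `build_tantan_command` and the file name `proteins.fa` are hypothetical and only serve the example; the flags themselves (-p, -x, -c, -s, -f) are the ones listed under DESCRIPTION.

```python
def build_tantan_command(fasta_files, protein=False, mask_letter=None,
                         preserve_case=False, min_mask_prob=None,
                         output_type=None):
    """Assemble an argument list for tantan (hypothetical helper).

    Only options documented in this man page are used; options come
    first and the FASTA file names last, as in the synopsis.
    """
    args = ["tantan"]
    if protein:
        args.append("-p")                      # treat sequences as proteins
    if mask_letter is not None:
        args += ["-x", mask_letter]            # mask with a letter, not lowercase
    if preserve_case:
        args.append("-c")                      # keep case in non-masked regions
    if min_mask_prob is not None:
        args += ["-s", str(min_mask_prob)]     # minimum repeat probability
    if output_type is not None:
        args += ["-f", str(output_type)]       # 0, 1, 2 or 3
    return args + list(fasta_files)


# "proteins.fa" is a made-up file name for the example.
command = build_tantan_command(["proteins.fa"], protein=True, mask_letter="X")
print(" ".join(command))  # tantan -p -x X proteins.fa
```

The resulting list could be passed to `subprocess.run(command)` on a system where tantan is installed.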
DESCRIPTION
Find simple repeats in sequences
Options (default settings):
-p
interpret the sequences as proteins
-x
letter to use for masking, instead of lowercase
-c
preserve uppercase/lowercase in non-masked regions
-m
file for letter pair scores (+1/-1, but -p selects BLOSUM62)
-r
probability of a repeat starting per position (0.005)
-e
probability of a repeat ending per position (0.05)
-w
maximum tandem repeat period to consider (100, but -p selects 50)
-d
probability decay per period (0.9)
-a
gap existence cost (0)
-b
gap extension cost (infinite: no gaps)
-s
minimum repeat probability for masking (0.5)
-f
output type: 0=masked sequence, 1=repeat probabilities, 2=repeat counts, 3=BED (0)
-h, --help
show help message, then exit
--version
show version information, then exit
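The way the masking options -s, -x and -c fit together can be sketched as follows. This is a minimal illustration, not tantan's implementation: the function `apply_mask`, the example sequence, and the per-position probabilities are all made up. It also assumes that a position is masked when its repeat probability is greater than or equal to the -s threshold, and that non-masked letters are written in uppercase unless -c is given.

```python
def apply_mask(sequence, repeat_probs, min_prob=0.5,
               mask_letter=None, preserve_case=False):
    """Mask positions whose repeat probability reaches min_prob.

    min_prob      plays the role of -s (default 0.5)
    mask_letter   plays the role of -x (None means lowercase masking)
    preserve_case plays the role of -c
    """
    out = []
    for letter, prob in zip(sequence, repeat_probs):
        if prob >= min_prob:  # assumption: the threshold is inclusive
            out.append(mask_letter if mask_letter is not None else letter.lower())
        else:
            # assumption: non-masked letters are uppercased unless -c is set
            out.append(letter if preserve_case else letter.upper())
    return "".join(out)


# Made-up input: a short sequence with an "AT" tandem repeat in the middle.
seq = "ggcATATATATtc"
probs = [0.1, 0.1, 0.2] + [0.9] * 8 + [0.2, 0.1]

print(apply_mask(seq, probs))                      # GGCatatatatTC
print(apply_mask(seq, probs, mask_letter="N"))     # GGCNNNNNNNNTC
print(apply_mask(seq, probs, preserve_case=True))  # ggcatatatattc
```

In the last call the flanking letters keep their original lowercase, so they cannot be told apart from the masked repeat; this is one reason to combine -c with -x.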
REPORTING BUGS
Report bugs to: tantan@cbrc.jp
Home page: http://www.cbrc.jp/tantan/