hhfilter(1)
NAME
hhfilter - filter an alignment by maximum sequence identity of match states and minimum coverage
SYNOPSIS
hhfilter -i infile -o outfile [options]
DESCRIPTION
HHfilter 3.3.0
Filter an alignment by maximum pairwise sequence identity,
minimum coverage, minimum sequence identity, or score per
column to the first (seed) sequence.
(c) The HH-suite development team
Steinegger M, Meier M, Mirdita M, Vöhringer H, Haunsberger S J,
and Söding J (2019) HH-suite3 for fast remote homology detection
and deep protein annotation. BMC Bioinformatics,
doi:10.1186/s12859-019-3019-7
-i <file>
read input file in A3M/A2M or FASTA format
-o <file>
write to output file in A3M format
-a <file>
append to output file in A3M format
OPTIONS
-v <int>
verbose mode: 0: no screen output, 1: only warnings, 2: verbose
-id [0,100]
maximum pairwise sequence identity (%) (def=90)
-diff [0,inf[
filter MSA by selecting most diverse set of sequences, keeping at least this many seqs in each MSA block of length 50 (def=0)
-cov [0,100]
minimum coverage with query (%) (def=0)
-qid [0,100]
minimum sequence identity with query (%) (def=0)
-qsc [0,100]
minimum score per column with query (def=-20.0)
-neff [1,inf]
target diversity of alignment (default=off)
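To make the -cov and -qid thresholds concrete, here is a minimal Python sketch of how coverage and identity with the query might be computed on A3M rows. It is an illustration only, not hhfilter's implementation: the function names are hypothetical, and the exact definitions (coverage as the share of match columns holding a residue, identity counted over columns where both rows have a residue) are assumptions.

```python
def match_columns(row):
    """Keep only match columns of an A3M row: drop lower-case inserts and '.'."""
    return "".join(c for c in row if not c.islower() and c != ".")


def coverage_with_query(query, seq):
    """Percent of match columns in which seq has a residue (assumed definition)."""
    q, s = match_columns(query), match_columns(seq)
    if len(q) != len(s):
        raise ValueError("rows have different numbers of match columns")
    covered = sum(1 for c in s if c != "-")
    return 100.0 * covered / len(q)


def identity_with_query(query, seq):
    """Percent identical residues over columns where both rows have a residue
    (assumed definition)."""
    q, s = match_columns(query), match_columns(seq)
    pairs = [(a, b) for a, b in zip(q, s) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(1 for a, b in pairs if a == b) / len(pairs)


def passes(query, seq, min_cov=0.0, min_qid=0.0):
    """Mimic the spirit of -cov and -qid: keep seq only if both minima are met."""
    return (coverage_with_query(query, seq) >= min_cov
            and identity_with_query(query, seq) >= min_qid)
```

With the query ACDEFGHIKL and the row AC-EYaGH--L, the sketch reports 70% coverage and 6 identical residues out of 7 aligned pairs, so the row would pass a minimum coverage of 50 but fail a minimum of 80.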
Input alignment format:
-M a2m
use A2M/A3M (default): upper case = Match; lower case = Insert; ’-’ = Delete; ’.’ = gaps aligned to inserts (may be omitted)
-M first
use FASTA: columns with residue in 1st sequence are match states
-M [0,100]
use FASTA: columns with fewer than X% gaps are match states
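The two FASTA rules above can be sketched in Python as follows. This is a simplified illustration with hypothetical function names: it assumes equal-length rows, treats only '-' as a gap, and counts every sequence equally, which may differ from how hhfilter weights sequences internally.

```python
def match_states_first(rows):
    """-M first: a column is a match state if the first sequence has a residue."""
    return [c != "-" for c in rows[0]]


def match_states_by_gap_percent(rows, max_gap_percent):
    """-M X: a column is a match state if fewer than X% of its rows are gaps."""
    n_rows = len(rows)
    n_cols = len(rows[0])
    flags = []
    for j in range(n_cols):
        gaps = sum(1 for row in rows if row[j] == "-")
        flags.append(100.0 * gaps / n_rows < max_gap_percent)
    return flags


rows = ["AC-D", "A--D", "ACED", "-C-D"]
print(match_states_first(rows))
print(match_states_by_gap_percent(rows, 50))
```

In this toy alignment the third column is a gap in three of four rows (75%), so it is not a match state under either rule.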
Other options:
-maxseq <int>
max number of input rows (def=65535)
-maxres <int>
max number of HMM columns (def=20001)
Example: hhfilter -id 50 -i d1mvfd_.a2m -o d1mvfd_.fil.a2m
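When hhfilter is called from a script, the example above could be wrapped roughly as shown below. The helper names build_hhfilter_cmd and run_hhfilter are hypothetical; only the flags -i, -o, -id, and -cov documented on this page are used, and the sketch assumes the hhfilter binary is on PATH.

```python
import shutil
import subprocess


def build_hhfilter_cmd(infile, outfile, max_id=90, min_cov=0):
    """Assemble the argument list; the defaults mirror the documented defaults."""
    return ["hhfilter", "-i", infile, "-o", outfile,
            "-id", str(max_id), "-cov", str(min_cov)]


def run_hhfilter(infile, outfile, max_id=90, min_cov=0):
    """Run hhfilter and raise if it is missing or exits with an error."""
    if shutil.which("hhfilter") is None:
        raise FileNotFoundError("hhfilter not found on PATH")
    subprocess.run(build_hhfilter_cmd(infile, outfile, max_id, min_cov), check=True)


# Same filtering as the example above, with the coverage default made explicit:
# run_hhfilter("d1mvfd_.a2m", "d1mvfd_.fil.a2m", max_id=50)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with unusual file names.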