fsm-lite(1)

Frequency-based String Mining

FSM-LITE

NAME

fsm-lite - Frequency-based String Mining

SYNOPSIS

fsm-lite -l <file> -t <file> [options]

DESCRIPTION

A single-core implementation of frequency-based substring mining, used in bioinformatics to extract substrings that discriminate between two or more datasets in high-throughput sequencing data.

OPTIONS

mandatory:

-l,--list <file>

Text file that lists all input files as whitespace-separated pairs

<data-name> <data-filename>

where <data-name> is a unique identifier (without whitespace) and <data-filename> is the full path to the corresponding input file. The default data file format is uncompressed FASTA.

-t,--tmp <file>

Store temporary index data
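
The list file expected by -l can be generated with a short script. The following is a minimal sketch, not part of fsm-lite: the helper names write_list_file and build_command, the "*.fasta" glob pattern, and the choice of the file stem as <data-name> are all assumptions made for illustration. Only the -l and -t flags are taken from this page.

```python
from pathlib import Path


def write_list_file(fasta_dir, list_path, pattern="*.fasta"):
    """Write one '<data-name> <data-filename>' pair per line and return the pairs."""
    pairs = []
    for path in sorted(Path(fasta_dir).glob(pattern)):
        name = path.stem  # assumption: use the file name without extension as data name
        full = str(path.resolve())  # the list file expects full paths
        # Pairs are whitespace-separated, so neither field may contain whitespace.
        if any(ch.isspace() for ch in name + full):
            raise ValueError(f"whitespace in name or path: {full!r}")
        pairs.append((name, full))

    names = [name for name, _ in pairs]
    if len(set(names)) != len(names):
        raise ValueError("data names must be unique")

    Path(list_path).write_text("".join(f"{n} {p}\n" for n, p in pairs))
    return pairs


def build_command(list_path, tmp_path):
    """Return an argument list using only the two mandatory options."""
    return ["fsm-lite", "-l", str(list_path), "-t", str(tmp_path)]
```

The argument list returned by build_command can be passed to subprocess.run once fsm-lite is installed; the script itself never runs the program.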

optional:

-m,--min <int>

Minimum length to report (default 9)

-M,--max <int>

Maximum length to report (default 100)

-f,--freq <int>

Minimum frequency per input file to report (default 1)

-s,--minsupp <int>

Minimum number of input files with support to report (default 2)

-S,--maxsupp <int>

Maximum number of input files with support to report (default inf, i.e. no upper limit)

-v,--verbose

Verbose output
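
How the length, frequency and support thresholds combine can be illustrated with a brute-force sketch. It rests on one reading of the option descriptions above, stated here as an assumption: a substring has support in an input file when it occurs at least -f times in that file, and it is reported when the number of supporting files lies between -s and -S. The function mine_substrings is hypothetical; it does not reproduce the indexing technique fsm-lite uses, and its result is not claimed to match the program's output.

```python
from collections import Counter


def mine_substrings(datasets, min_len=9, max_len=100, freq=1,
                    minsupp=2, maxsupp=None):
    """Brute-force illustration of the -m, -M, -f, -s and -S thresholds.

    datasets maps a data name to a list of sequences (strings).
    Returns a dict mapping each reported substring to its support,
    i.e. the number of datasets in which it occurs at least `freq` times.
    """
    support = Counter()
    for sequences in datasets.values():
        # Count every substring of admissible length within this dataset.
        counts = Counter()
        for seq in sequences:
            for i in range(len(seq)):
                last = min(i + max_len, len(seq))
                for j in range(i + min_len, last + 1):
                    counts[seq[i:j]] += 1
        # A dataset supports a substring if it is frequent enough there.
        for sub, count in counts.items():
            if count >= freq:
                support[sub] += 1

    # Assumption: "default inf" means no upper limit on support.
    upper = len(datasets) if maxsupp is None else maxsupp
    return {sub: n for sub, n in support.items() if minsupp <= n <= upper}
```

The sketch enumerates every substring explicitly, so its running time grows with sequence length times the allowed length range; it is only practical for toy inputs such as a handful of short strings.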

AUTHOR

This manpage was written by Andreas Tille for the Debian distribution and may be used by others.