kmds(1)

NAME

kmds - control for population structure

DESCRIPTION

kmds filters kmers and creates a matrix representing population structure, which is then used to control for that structure.

This program belongs to seer(1) (Sequence Element (kmer) Enrichment Analysis).

OPTIONS

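kmds can be run in one go, or split into two steps so that the work can be spread across several dsm files:

1) filter and subsample each file with --no_mds and --size

2) combine the subsampled matrices and perform metric multidimensional scaling (MDS) with --mds_concat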

Required options:

-k [ --kmers ] arg

kmer output file produced by dsm

-p [ --pheno ] arg

.pheno metadata file

MDS options:

-o [ --output ] arg

output prefix for new dsm file

--no_mds

do not perform MDS; output subsampled matrix instead

--write_distances

write csv of distance matrix

--mds_concat arg

list of subsampled matrices to use in MDS. Performs only MDS; implies --no_filtering

--pc arg (=3)

number of principal coordinates to output

--size arg (=1000000)

number of kmers to use in MDS

--threads arg (=1)

number of threads. Suggested: 4

Filtering options:

--no_filtering

turn off all filtering and do not output new kmer file

--max_length arg (=100)

maximum kmer length

--maf arg (=0.01)

minimum kmer frequency

--min_words arg

minimum kmer occurrences. Overrides --maf

Other options:

--version

prints version and exits

-h [ --help ]

full help message

EXAMPLE

Filter kmers and create a matrix representing population structure in a single run:

kmds -k dsm_input.txt.gz --pheno metadata.pheno -o filtered

To spread this process out, run the following command on each dsm file. With --no_mds it only filters and subsamples, and does not perform MDS:

kmds -k dsm_input.txt.gz --pheno metadata.pheno --no_mds --size 10000
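
The two steps listed under OPTIONS could be scripted roughly as follows. This is a sketch under several assumptions, not a documented recipe: the input file names, the "_subsampled" output prefixes, and the list file subsampled_matrices.txt are hypothetical; this page does not state what the subsampled matrix files are called, so the list file is assumed to be prepared by hand; and the sketch omits -k and --pheno in the second step because --mds_concat is described as performing only MDS, although this page does not confirm that they can be left out. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Two-step kmds workflow (sketch). Commands are printed, not executed,
# unless DRY_RUN=0 is set in the environment.

PHENO=metadata.pheno              # phenotype file, as in the examples above
LIST=subsampled_matrices.txt      # hypothetical: list of subsampled matrices

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Step 1: filter and subsample each dsm file on its own, without MDS.
subsample_all() {
    for dsm in "$@"; do
        # Hypothetical output prefix derived from the input file name.
        prefix=$(basename "$dsm" .txt.gz)_subsampled
        run kmds -k "$dsm" --pheno "$PHENO" --no_mds --size 10000 -o "$prefix"
    done
}

# Step 2: perform MDS only, on the matrices named in the list file.
combine() {
    run kmds --mds_concat "$LIST" -o all_structure --pc 3 --threads 4
}

subsample_all dsm_input_1.txt.gz dsm_input_2.txt.gz
combine
```

Keeping the default dry-run mode makes it possible to check the generated command lines before any long-running job is started; setting DRY_RUN=0 executes them.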

AUTHOR

This manpage was written by Andreas Tille for the Debian distribution and may be used for any other purpose related to the program.