clmclose(1)
NAME
clm close - Fetch connected components from graphs or subgraphs
clmclose is not in actual fact a program. This manual page documents the behaviour and options of the clm program when invoked in mode close. The options -h, --apropos, --version, -set, --nop are accessible in all clm modes. They are described in the clm manual page.
SYNOPSIS
clm close -imx <fname> [options]
clm close -imx fname (specify matrix input)
-abc fname (specify label input)
-dom fname (input domain/cluster file)
[-o fname (output file)]
[--is-undirected (trust input graph to be undirected)]
[-levels LO/STEP/HI[/prefix] (write cluster size distribution for each cutoff)]
[-levels-norm num (divide each level by num to define cutoff)]
[--write-count (output component count)]
[--write-sizes (output component sizes (default))]
[--write-size-counts (output compressed list of component sizes)]
[--write-cc (output components as clustering)]
[--write-block (output graph restricted to -dom argument)]
[--write-blockc (output graph complement of -dom argument)]
[-cc-bound num (select components with size at least num)]
[--sl (output single linkage tree as list of joins (for -imx input))]
[-write-sl-list fname (write list of join order with weights)]
[-tf spec (apply tf-spec to input matrix)]
[-h (print synopsis, exit)]
[--apropos (print synopsis, exit)]
[--version (print version, exit)]
DESCRIPTION
Use clm close to fetch the connected components from a graph. Different output modes are supported (see below). In matrix mode (i.e. using the -imx option) the output returned with --write-cc can be used in conjunction with mcxsubs to retrieve individual subgraphs corresponding to connected components.
OPTIONS
-abc
<fname> (label input)
The file name for input that is in label format.
-imx
<fname> (input matrix)
The file name for input that is in mcl native matrix
format.
-o fname
(output file)
Specify the file to which output is sent. The default is
STDOUT.
-dom
fname (input domain/cluster file)
If this option is used, clm close first retrieves, for each
of the domains in file fname, the associated subgraph of the
input graph. These subgraphs are then further decomposed
into connected components, and the program processes the
resulting components in the normal manner.
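The -dom step described above can be sketched in Python. This is only an illustration of the idea (restrict the graph to each domain, then split the restriction into connected components); it is not the clm implementation, and the function name components_within_domains and the edge-list and domain-list inputs are hypothetical stand-ins for the matrix and domain files.

```python
def components_within_domains(edges, domains):
    """Split each domain into the connected components of its induced subgraph."""
    result = []
    for domain in domains:
        members = set(domain)
        # Keep only edges that have both endpoints inside this domain.
        adjacency = {n: set() for n in members}
        for a, b in edges:
            if a in members and b in members:
                adjacency[a].add(b)
                adjacency[b].add(a)
        seen = set()
        for start in sorted(members):
            if start in seen:
                continue
            # A depth-first search from an unseen node collects one component.
            stack, component = [start], []
            seen.add(start)
            while stack:
                node = stack.pop()
                component.append(node)
                for neighbour in adjacency[node]:
                    if neighbour not in seen:
                        seen.add(neighbour)
                        stack.append(neighbour)
            result.append(sorted(component))
    return result


# Hypothetical toy input: a path 1-2-3-4 plus the edge 5-6, and two domains.
edges = [(1, 2), (2, 3), (3, 4), (5, 6)]
domains = [[1, 2, 4], [3, 5, 6]]
split = components_within_domains(edges, domains)
print(split)
```

In this toy input node 4 ends up alone: its only neighbour, node 3, lies in the other domain, so the edge 3-4 is dropped when the graph is restricted to the first domain.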
--write-count
(output component count)
--write-sizes (output component sizes (default))
--write-size-counts (output compressed list of
component sizes)
--write-cc (output components as clustering)
--write-block (output graph restricted to -dom
argument)
--write-blockc (output graph complement of -dom
argument)
The default behaviour is currently to output the sizes of
the connected components. It is also possible to simply
output the number of components with --write-count,
to write a counted list of sizes with
--write-size-counts, or to write the components as a
clustering in mcl format with --write-cc. Even more
options exist: it is possible to output the restriction of
the input graph to a domain, or to output the complement of
this restriction.
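As a rough illustration of what the count, sizes and counted-sizes outputs summarise, the following Python sketch computes connected components with a union-find structure and derives the three summaries from them. It assumes an undirected graph given as a plain edge list; the function name connected_components is hypothetical, and the textual layout that clm actually writes is not reproduced here.

```python
from collections import Counter


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as sorted lists."""
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    # Largest component first; ties are broken by the smallest member.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g[0]))


# Hypothetical toy graph with three components.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]
components = connected_components(nodes, edges)

count = len(components)                    # the idea behind --write-count
sizes = [len(c) for c in components]       # the idea behind --write-sizes
size_counts = sorted(Counter(sizes).items(), reverse=True)  # (size, how many)

print(count, sizes, size_counts)
```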
-levels
LO/STEP/HI[/prefix] (write cluster size distribution for
each cutoff)
-levels-norm num (divide each level by num to define
cutoff)
Use -levels to inspect the cluster size distribution
at various cut-offs by specifying a triplet of numbers
(separated by forward slashes), the first of which is the
starting point, the second is the step size, and the third
is the end point. If a fourth argument (preceded by another
slash) is given, all clusterings are written to a file based
on the supplied argument as file name prefix. The cut-off
can be further varied by the argument to
-levels-norm.
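The effect of a LO/STEP/HI triplet combined with -levels-norm can be sketched as follows. The sketch rests on assumptions that this page does not spell out: that an edge is kept when its weight is at least the cutoff, that HI itself is included, and that the cutoff is the level divided by the normalisation value. The function names are hypothetical and this is not the clm code.

```python
def component_sizes(nodes, edges):
    """Return component sizes, largest first, for an undirected edge list."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    counts = {}
    for n in nodes:
        root = find(n)
        counts[root] = counts.get(root, 0) + 1
    return sorted(counts.values(), reverse=True)


def sizes_per_level(nodes, weighted_edges, lo, step, hi, norm=1):
    """List (level, cutoff, sizes) for level = lo, lo + step, ... up to hi."""
    if step <= 0:
        raise ValueError("step must be positive")
    rows = []
    level = lo
    while level <= hi:
        cutoff = level / norm
        # Assumption: an edge survives when its weight is at least the cutoff.
        kept = [(a, b) for a, b, w in weighted_edges if w >= cutoff]
        rows.append((level, cutoff, component_sizes(nodes, kept)))
        level += step
    return rows


# Hypothetical toy graph: a weighted path 1-2-3-4.
nodes = [1, 2, 3, 4]
weighted_edges = [(1, 2, 0.9), (2, 3, 0.5), (3, 4, 0.2)]
rows = sizes_per_level(nodes, weighted_edges, lo=0, step=50, hi=100, norm=100)
for level, cutoff, sizes in rows:
    print(level, cutoff, sizes)
```

Here the levels 0, 50 and 100 with a normalisation value of 100 correspond to cutoffs 0.0, 0.5 and 1.0, so integer levels can be used to scan fractional edge weights.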
--sl
(output single linkage tree as list of joins (for -imx
input))
-write-sl-list fname (write list of join order with
weights)
A primary use case for this is to apply single link
clustering to the rcl (restricted contingency linkage) graph
that is output by clm vol with its write-rcl
option. This rcl graph encodes a consensus clustering
derived from the multiple clusterings that are given to
clm vol.
The output (save it with -o or UNIX redirection) can be supplied to rcl-res.pl together with a list of resolution parameters to produce a small number of nested clusterings. The resolution parameters (the second and subsequent arguments to rcl-res.pl) are set sizes. For each supplied resolution res, the script descends the tree as long as the current node has some split below it in which both clusters are of size at least res. Note that the resulting clustering may still contain smaller clusters and singletons (resulting from other splits).
The mcl distribution has an example script graphs/rcl-example.sh that illustrates the different steps.
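The descent rule described for rcl-res.pl can be illustrated with a small Python sketch. It is a reading of the rule as stated on this page, not a port of the script: the tree is represented as nested two-element tuples with leaves as plain labels, which is a hypothetical representation and not the join-list format that clm close writes, and the function names are invented for the example.

```python
def leaves(tree):
    """Return the leaf labels below a node; a leaf is any non-tuple value."""
    if not isinstance(tree, tuple):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def has_qualifying_split(tree, res):
    """True if this node, or a node below it, splits into two parts of size >= res."""
    if not isinstance(tree, tuple):
        return False
    left, right = tree
    if len(leaves(left)) >= res and len(leaves(right)) >= res:
        return True
    return has_qualifying_split(left, res) or has_qualifying_split(right, res)


def cut_tree(tree, res):
    """Descend while a qualifying split exists below; emit each stopping node as a cluster."""
    if not has_qualifying_split(tree, res):
        return [sorted(leaves(tree))]
    left, right = tree
    return cut_tree(left, res) + cut_tree(right, res)


# Hypothetical tree: the root splits off the single leaf "e" from a balanced subtree.
tree = ((("a", "b"), ("c", "d")), "e")
print(cut_tree(tree, 2))
print(cut_tree(tree, 3))
```

With res set to 2 the root split (4 versus 1) does not qualify by itself, but the split below it (2 versus 2) does, so the descent continues and "e" is emitted as a singleton. This is the kind of smaller cluster the note above refers to. With res set to 3 no split qualifies and all five leaves stay in one cluster.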
--is-undirected
(omit graph undirected check)
With this option the transformation to make sure that the
input is undirected is omitted. This will be slightly
faster. Using this option while the input is directed may
lead to erroneous results.
-cc-bound
num (select components with size at least num)
Only components of size at least num are selected.
-tf
spec (apply tf-spec to input matrix)
Transform the input matrix values according to the syntax
described in mcxio(5).
AUTHOR
Stijn van Dongen.
SEE ALSO
mclfamily(7) for an overview of all the documentation and the utilities in the mcl family.