clm info2 - compute performance measures for graphs and clusterings.
clminfo2 is not in actual fact a program. This manual page documents the
behaviour and options of the clm program when invoked in mode info2.
The options -h, --apropos and --version are accessible in all clm modes.
They are described in the clm manual page.
clm info2 [options] <graph file> <cluster file> <cluster file>*
clm info2 [-o fname (write to file fname)]
          [-pi f (apply inflation beforehand)]
          [--list (list efficiency for all nodes)]
          [-tf spec (apply tf-spec to input matrix)]
          [-cl-ceil <num> (skip clusters of size exceeding <num>)]
          [-cat-max <num> (do at most <num> tree levels)]
          [-cl-tree fname (expect file with nested clusterings)]
          [-t <int> (use <int> threads)]
          [-J <intJ> (a total of <intJ> jobs are used)]
          [-j <intj> (this job has index <intj>)]
          [-h (print synopsis, exit)]
          [--apropos (print synopsis, exit)]
          [--version (print version, exit)]
          <matrix file> <cluster file> <cluster file>*
clm info2 is a streamlined and updated version of clm info. The
latter outputs a key-value format listing a number of measures. In contrast,
clm info2 only outputs the so-called efficiency criterion, a quality
index for networks and clusterings. This criterion can be generated for each
node independently with the --list option, indicating how well a
clustering captures the neighbour distribution of a given node.
clm info2 can utilise threading and job dispatching. This may be useful
when dealing with very large graphs.
Multiple clusterings can be supplied on the command-line. Output is tabular,
each row corresponding with a clustering in the ordering as supplied on the
command line. Multiple columns will result only if node-wise output is induced
with --list. By default a single number is produced for each individual
clustering: the mean of all node-wise scores for that clustering.
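The relation between the default output and the node-wise output can be sketched as follows. This is a minimal illustration, assuming that a --list row consists solely of whitespace-separated numeric scores; the three values in row are made up and are not real clm info2 output.

```shell
# Hypothetical row of node-wise scores; real values would come from
# clm info2 --list (the row layout is an assumption).
row="0.50 0.25 0.75"

# Average all fields of the row, mirroring the single number that is
# reported per clustering by default.
mean=$(printf '%s\n' "$row" |
  awk '{ s = 0; for (i = 1; i <= NF; i++) s += $i; printf "%.4f", s / NF }')

echo "$mean"
```

If the real output carries extra columns such as labels, the awk loop would need to skip them.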
The efficiency factor is described in [1] (see the REFERENCES
section). It tries to balance the dual aims of capturing a lot of edges or
edge weights and keeping the cluster footprint or area fraction small. The
efficiency number has several appealing mathematical properties, cf. [1].
-o fname (output file name)
-pi f (apply inflation beforehand)
Apply inflation to the graph matrix and compute the performance measures for the
result.
-tf <tf-spec> (transform input matrix values)
--list (list efficiency for all nodes)
The efficiency scores for all nodes are given on a single line. Each clustering
specified corresponds to a single line.
-cl-tree fname (expect file with nested clusterings (cone format))
-cl-ceil <num> (skip (nested) clusters of size exceeding <num>)
The specified file should contain a hierarchy of nested clusterings such as
generated by mclcm. The output is then in a special format,
undocumented but easy to understand. Its purpose is to help cherrypick a
single clustering from a tree, in conjunction with the slightly experimental
and undocumented program mlmfifofum.
The measure that is used is very slow to compute for large clusters, and
for such clusters it will generally be outside any interesting range (i.e. it
will be small). Use -cl-ceil to skip clusters exceeding the specified size.
clm info2 will then directly proceed to subclusters if they exist.
-cat-max num (do at most num levels)
This only has effect when used with -cl-tree. clm info2 will start
at the most fine-grained level, working upwards.
-t <int> (use <int> threads)
-j <intj> (this job has index <intj>)
-J <intJ> (a total of <intJ> jobs are used)
For very large graphs (millions of nodes) and clusterings with large clusters it
may be helpful to allow this program to use multiple CPUs. Additionally it is
possible to spread the computation over multiple jobs/machines. These three
options are described in the clmprotocols manual page. The following
set of options, if given to as many commands, defines three jobs, each running
four threads.
-t 4 -J 3 -j 0 -o out.0
-t 4 -J 3 -j 1 -o out.1
-t 4 -J 3 -j 2 -o out.2
The output can then be collected with
clxdo add_table out.[0-2]
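The option sets above can be generated rather than typed by hand. The sketch below only prints the command lines, so it can be inspected before anything is run. The file names graph.mci and clus.mci and the helper job_commands are hypothetical placeholders; the argument order follows the synopsis.

```shell
# Hypothetical input names; substitute your own graph and clustering files.
graph=graph.mci
clustering=clus.mci
jobs=3      # value passed to -J (total number of jobs)
threads=4   # value passed to -t (threads per job)

# Print one clm info2 command line per job; job indices run from 0 to jobs-1.
job_commands() {
  j=0
  while [ "$j" -lt "$jobs" ]; do
    echo "clm info2 -t $threads -J $jobs -j $j -o out.$j $graph $clustering"
    j=$((j + 1))
  done
}

job_commands
```

Each printed line can be submitted to a separate machine or scheduler slot. Once all jobs have finished, the out.* files are combined with clxdo add_table as shown above.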
Stijn van Dongen.
See mclfamily for an overview of all the documentation and the utilities
in the mcl family.
[1] Stijn van Dongen. Performance criteria for graph clustering and
Markov cluster experiments. Technical Report INS-R0012, National
Research Institute for Mathematics and Computer Science in the Netherlands,
Amsterdam, May 2000.