gbdummyfy - Produce dummies from labels
This command reads from standard input a text file with space-separated columns.
The entries in one column (the first by default) are treated as labels and
expanded into a matrix of dummies, i.e. of 0 and 1 values. The number of
columns of the matrix is equal to the number of different labels. Each row
contains '1' in the position of its label in the sorted list of
labels, and '0' everywhere else. Since in general one fewer dummy variable is
required than the number of labels, you can remove one column of dummies using
the option '-d'.
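The expansion described above can be sketched with a small shell function. This is only an illustration and not the gbdummyfy source: the name dummy_sketch is hypothetical, it reads from a file rather than standard input, and it assumes that the dummy columns are appended after the original columns.

```shell
# Hypothetical helper, not part of gbutils: appends one 0/1 column per
# distinct label found in column $1 of the file $2.
# Assumption: dummies are appended after the original columns.
dummy_sketch() {
    col=$1
    file=$2
    # First pass: list the distinct labels in sorted order.
    awk -v c="$col" '{ print $c }' "$file" | sort -u |
    # Second pass: print each row followed by its 0/1 indicators.
    awk -v c="$col" '
        NR == FNR { pos[$1] = ++n; next }    # first input: sorted labels
        {
            line = $0
            for (i = 1; i <= n; i++)
                line = line " " (pos[$c] == i ? 1 : 0)
            print line
        }' - "$file"
}
```

For a file with the four rows 'a 1', 'b 2', 'a 3' and 'c 4', calling dummy_sketch 1 on it appends the columns '1 0 0', '0 1 0', '1 0 0' and '0 0 1' respectively, since the sorted labels are a, b, c.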
- print this help
- set the column of labels (default 1)
- which column to remove, counting from 1 (default none)
- print the labels and associated positions to standard
- echo "a 1\nb 2" | gbdummyfy
- create a 4x3 matrix with dummy values relative to labels 'a'
This program requires awk or gawk. Notice that it simply expands the data by
adding new columns. When using the resulting matrix in other utilities,
the user should specify explicitly which dummy variables to use and how.
A simple linear dependency can be automatically generated for 'gblreg' by
inserting the following expression in the functional specification
`seq 3 12 | sed 's/\(.*\)/\+d\1\*x\1/' | tr -d '\n'`
and the expression
`seq 3 12 | sed 's/\(.*\)/,\1=0/' | tr -d '\n'`
among the initial conditions. In this case there are 10 different values for the
dummy. They occupy column positions from 3 to 12 and their initial value is 0.
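To see what such pipelines generate, the sketch below runs them on a shortened range, columns 3 to 5, and stores the results in two variables. The variable names are hypothetical, and the sed replacements are written without the backslashes before '+' and '*', which are not needed in a sed replacement.

```shell
# Shortened illustration: the dummies are assumed to sit in columns 3 to 5.
first=3
last=5

# One "+dN*xN" term per dummy column, joined on a single line.
spec_terms=$(seq "$first" "$last" | sed 's/\(.*\)/+d\1*x\1/' | tr -d '\n')

# One ",N=0" entry per dummy column, joined on a single line.
init_terms=$(seq "$first" "$last" | sed 's/\(.*\)/,\1=0/' | tr -d '\n')

printf '%s\n' "$spec_terms"   # prints +d3*x3+d4*x4+d5*x5
printf '%s\n' "$init_terms"   # prints ,3=0,4=0,5=0
```

With the range 3 to 12 the same commands yield ten terms in each string, one per dummy column.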
Written by Giulio Bottazzi
Report bugs to <email@example.com>
Package home page
Copyright © 2009-2015 Giulio Bottazzi
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License (version 2) as published by the
Free Software Foundation.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
A PARTICULAR PURPOSE. See the GNU General Public License for more details.