cstocs -- charset encoding convertor for the Czech and Slovak languages.
cstocs [options] src_encoding dst_encoding [files ...]
cstocs il2 ascii < file | less
cstocs -i utf8 il2 file1 file2 file3
Cstocs is a simple conversion utility that changes the charset encoding of a text. It
reads either the specified files or (if none are specified) the standard input,
assumes that the input is encoded in "src_encoding" and tries to
reencode it into "dst_encoding". The result is written to the standard output.
Run "cstocs" without parameters to get a short help message and a list of the installed encodings.
Characters that are not defined in "src_encoding" are passed to the output unchanged.
If the source text contains a character that is defined in "src_encoding"
but not in "dst_encoding", it can be handled in several ways. For
example, the characters "e with caron" (symbol ecaron) and "d with
caron" (symbol dcaron) are included in the iso-8859-2 encoding but not
in iso-8859-1. If you reencode 8859-2 text to 8859-1, you may
want to do one of the following:
- Keep it the same, option "--nofillstring".
- Produce no output in place of the "ecaron"
symbol, option "--null".
- Substitute some string (possibly a space) for both
ecaron and dcaron, option "--fillstring".
- Substitute the letter "d" for dcaron and
"e" for ecaron. It is even possible to substitute a string
for a symbol, so you can replace the Latin "AE" ligature character
with the string "AE" (letter "A" and letter
"E"), or replace a "plusminus sign" with the
string "+/-". These substitutions are described in the accent file.
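The strategies listed above can be sketched in Python. This is an illustration only, not the cstocs implementation: the function name `reencode`, its `mode` values and the tiny `ACCENT` table are hypothetical, and Python's own latin-1 codec stands in for "dst_encoding".

```python
# Hypothetical stand-in for the accent rules: symbol -> replacement string.
ACCENT = {"ě": "e", "ď": "d"}


def reencode(text, mode="accent", fillstring=" "):
    """Handle characters that latin-1 (iso-8859-1) cannot represent."""
    out = []
    for ch in text:
        try:
            ch.encode("latin-1")
        except UnicodeEncodeError:
            pass  # not representable, handled below
        else:
            out.append(ch)  # representable in the destination, keep it
            continue
        if mode == "accent" and ch in ACCENT:
            out.append(ACCENT[ch])   # substitute via the accent rules
        elif mode == "nofillstring":
            out.append(ch)           # keep the character as it is
        elif mode == "null":
            pass                     # produce no output for it
        else:
            out.append(fillstring)   # fall back to the fill string
    return "".join(out)
```

With this sketch, the text "děd" becomes "ded" in the default mode, "d d" with mode "fillstring", "dd" with mode "null", and stays "děd" with mode "nofillstring".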
- -i, -i.ext, --inplace.ext
- The specified files will be converted in place, using Perl's
"-i" facility. Optionally, an extension for backup copies may be
specified after a dot. This parameter has to be the first one, if it is used.
- --dir directory
- Encoding files are taken from directory instead of
the default, which is Cz/Cstocs/enc in the Perl lib tree. The
location of encoding files can also be changed using the CSTOCSDIR
environment variable, but the --dir option has the highest priority.
- --fillstring string
- If the source text contains a character that is defined in
"src_encoding" but neither in "dst_encoding" nor in
the accent file (or the accent file is not used), it is replaced
by "string". The default is a single space.
- --nofillstring
- Leave unchanged the characters that would otherwise have the
fill string applied. This differs from "--null", which
removes the character from the output.
- --null
- Completely equivalent to --fillstring "".
- --nochange or --noaccent
- Do not use the accent file at all.
- --onebyone
- Use only those rules from the accent file that
rewrite one character to one character. If this option is specified,
the character "ecaron" will be rewritten to "e", but the
"AE" ligature character will not be rewritten to the string "AE".
- --onebymore
- Use all rules from the accent file. This is the default.
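The difference between using no accent rules, only one-character rules, and all rules can be sketched in Python. Everything here is hypothetical: the `RULES` table, the function `apply_accent_rules` and its `mode` names are made up for illustration, and the sketch assumes that a character with no applicable rule falls back to the fill string.

```python
# Hypothetical accent rules; the real ones live in the accent file.
RULES = {"ě": "e", "ď": "d", "Æ": "AE"}


def apply_accent_rules(text, unmapped, mode="all", fillstring=" "):
    """Rewrite the characters listed in `unmapped` using the chosen rules."""
    if mode == "none":
        rules = {}                 # accent file not used at all
    elif mode == "one_to_one":
        # keep only rules that rewrite one character to one character
        rules = {k: v for k, v in RULES.items() if len(v) == 1}
    else:
        rules = RULES              # all rules (the default)
    return "".join(
        rules.get(ch, fillstring) if ch in unmapped else ch
        for ch in text
    )
```

For example, with `unmapped = set("ěďÆ")`, the input "ěÆ" gives "eAE" with all rules, "e " with one-to-one rules only, and two spaces when no rules are used.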
Jan "Yenya" Kasprzak has done the original Un*x implementation.
Jan Pazdziora, email@example.com, created the Perl module version.