dirfile — a filesystem-based database format for time-ordered binary data
The dirfile database format is designed to provide a fast, simple format
for storing and reading binary time-ordered data. Dirfiles can be read using
the GetData Library, which provides a reference implementation of these
Dirfile Standards.
The dirfile database is centred around one or more time-ordered data streams
(each called a time stream). Each time stream is written to the filesystem in a
separate file, as binary data. The name of each binary file corresponds to
the time stream's field name. Dirfiles support binary data fields for
signed and unsigned integer types of 8 to 64 bits, as well as single and
double precision floating-point real or complex data types.
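As a rough illustration of the one-file-per-field layout described above, the following sketch appends samples to a binary file named after a field and reads them back. The helper names, the field name "temperature", and the choice of little-endian signed 16-bit samples are assumptions made for this example. It writes no format file, so the result is not a complete dirfile.

```python
import os
import struct
import tempfile


def write_raw_field(dirfile_dir, field_name, samples, fmt="<h"):
    """Append samples to the binary file named after the field.

    fmt is a struct format for one sample; "<h" (little-endian
    signed 16-bit) is an assumption made for this example.
    """
    path = os.path.join(dirfile_dir, field_name)
    with open(path, "ab") as f:
        for sample in samples:
            f.write(struct.pack(fmt, sample))


def read_raw_field(dirfile_dir, field_name, fmt="<h"):
    """Read every complete sample back from the field's binary file."""
    size = struct.calcsize(fmt)
    path = os.path.join(dirfile_dir, field_name)
    with open(path, "rb") as f:
        data = f.read()
    # Ignore a trailing partial sample, if a writer is mid-append.
    usable = len(data) - len(data) % size
    return [value[0] for value in struct.iter_unpack(fmt, data[:usable])]


# Hypothetical field name; the directory stands in for a dirfile directory.
demo_dir = tempfile.mkdtemp()
write_raw_field(demo_dir, "temperature", [10, -3, 42])
temperature = read_raw_field(demo_dir, "temperature")
```

In practice such files would be read through the GetData Library rather than by hand; the sketch only shows that a raw field is a flat sequence of fixed-size binary samples.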
Two time streams may have different constant sampling frequencies, and mechanisms
exist within the dirfile format to ensure these time streams remain properly
sequenced in time.
To do this, the time streams in the dirfile are subdivided into frames.
Each frame contains a fixed integer number of samples of each time stream. Two
time streams in the same dirfile may have different numbers of samples per
frame, but the number of samples per frame of any given time stream is fixed.
When synchronous retrieval of data from more than one time stream is required,
position in the dirfile can be specified in frames, which will ensure
that the data retrieved from each time stream are synchronised.
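The arithmetic behind frame-based positioning can be sketched as follows. The function name and the two samples-per-frame values are hypothetical. The point is only that a single frame range maps to a different, but time-aligned, sample range in each time stream.

```python
def frame_to_sample_range(first_frame, num_frames, samples_per_frame):
    """Map a range of frames to a half-open range of sample indices."""
    start = first_frame * samples_per_frame
    stop = start + num_frames * samples_per_frame
    return start, stop


# Hypothetical streams: one with 20 samples per frame, one with 1.
fast_spf = 20
slow_spf = 1

# Requesting frames 5 and 6 from both streams covers the same time span.
fast_range = frame_to_sample_range(5, 2, fast_spf)
slow_range = frame_to_sample_range(5, 2, slow_spf)
```

Here fast_range is (100, 140) and slow_range is (5, 7): forty samples of the fast stream span the same interval as two samples of the slow one.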
The binary files are all located in one or more filesystem directories, rooted
around a central directory, known as the dirfile directory. The dirfile
as a whole may be referred to by its dirfile directory path.
Included in the dirfile along with the time streams is the dirfile format
specification, which is one or more ASCII text files containing the
dirfile database metadata. The primary file is the file called format,
located in the dirfile directory. This file, and any additional files that it
names, fully specify the dirfile's metadata. For the syntax of these files,
see dirfile-format(5).
Version 3 of the Dirfile Standards introduced the large dirfile
extension. This extension added the ability to distribute the dirfile metadata
among multiple files (called fragments) in addition to the format
file, as well as the ability to house portions of the database in
subdirfiles. These subdirfiles may be fully fledged dirfiles in
their own right, but may also be contained within a larger, parent dirfile.
See dirfile-format(5) for information on specifying these subdirfiles.
In addition to the raw fields on disk, the dirfile format specification may also
specify derived fields, which are calculated by performing simple
element-wise operations on one or more input fields. Derived fields behave
identically to raw fields when read via GetData. See dirfile-format(5)
for a complete list of derived field types. Dirfiles may also contain both
numerical and character string constant scalar fields, also further
outlined in dirfile-format(5).
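To make "element-wise operations on one or more input fields" concrete, here is a minimal sketch of two such operations. The function names and the particular operations are illustrative assumptions and are not taken from the list of derived field types in dirfile-format(5). The sketch also assumes both inputs have the same number of samples.

```python
def linear_derived(raw_samples, slope, offset):
    """Derive a field by scaling and offsetting each input sample."""
    return [slope * x + offset for x in raw_samples]


def product_derived(field_a, field_b):
    """Derive a field as the sample-by-sample product of two inputs."""
    if len(field_a) != len(field_b):
        raise ValueError("input fields must have the same length")
    return [a * b for a, b in zip(field_a, field_b)]


# Hypothetical raw data: detector counts converted to volts, then to power.
counts = [0, 10, 20]
volts = linear_derived(counts, 0.5, 1.0)
power = product_derived(volts, [2, 2, 2])
```

Because a derived field is computed from its inputs on demand, nothing extra needs to be stored on disk for it, which is consistent with derived fields appearing identical to raw fields to a reader.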
Dirfiles are designed to be written to and read simultaneously. The dirfile
specification dictates that one particular raw field (specified either
explicitly or implicitly by the dirfile metadata) is to be used as the
reference field: all other vector fields are assumed to have at least
as many frames as the reference field has, and the size (in frames) of the
reference field is used as the size of the dirfile as a whole.
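One plausible way to obtain the size of the reference field in frames is from the size of its binary file, as sketched below for an unencoded raw field. The function name, the file name "reference", and the sample size and samples-per-frame values are assumptions for this example. Counting only complete frames is a choice made by this sketch, not a statement about how GetData treats a partially written frame.

```python
import os
import struct
import tempfile


def frames_in_field(path, sample_size, samples_per_frame):
    """Count the complete frames stored in an unencoded raw field file."""
    frame_bytes = sample_size * samples_per_frame
    return os.path.getsize(path) // frame_bytes


# Hypothetical reference field: 45 float64 samples at 10 samples per frame.
ref_dir = tempfile.mkdtemp()
ref_path = os.path.join(ref_dir, "reference")
with open(ref_path, "wb") as f:
    f.write(struct.pack("<45d", *[float(i) for i in range(45)]))

dirfile_frames = frames_in_field(ref_path, sample_size=8, samples_per_frame=10)
```

With 45 samples on disk, the sketch reports 4 frames: the five trailing samples belong to a frame that a writer has not yet finished.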
Version 6 of the Dirfile Standards added the ability to encode the binary files
on disk. Each fragment may have its own encoding scheme. Most commonly,
encodings are used to compress the data files to save space. See
(5) for information on encoding schemes.
Version 7 of the Dirfile Standards added support for complex valued data. Two
types of complex valued data are supported by the Dirfile Standards:
- A 64-bit complex number consisting of an IEEE-754 standard
32-bit single precision floating point real part and an IEEE-754 standard
32-bit single precision floating point imaginary part, and
- A 128-bit complex number consisting of an IEEE-754 standard
64-bit double precision floating point real part and an IEEE-754 standard
64-bit double precision floating point imaginary part.
No integer-type complex numbers are supported.
Unencoded complex numbers are stored on disk in "Fortran order", that
is, with the IEEE-754 real part followed by the IEEE-754 imaginary part. The
specified endianness of the two components follows that of purely real
floating point numbers. Endianness does not affect the ordering of the real
and imaginary parts. This format also conforms to the C99 and C++11 standards.
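The storage layout described above can be sketched with Python's struct module for the 64-bit complex type. The helper names are hypothetical. The sketch shows that changing endianness reverses the bytes within each 32-bit component but leaves the real part first and the imaginary part second.

```python
import struct


def pack_complex64(z, little_endian=True):
    """Pack a complex number as two 32-bit floats: real, then imaginary."""
    prefix = "<" if little_endian else ">"
    return struct.pack(prefix + "ff", z.real, z.imag)


def unpack_complex64(raw, little_endian=True):
    """Unpack eight bytes laid out as real part followed by imaginary part."""
    prefix = "<" if little_endian else ">"
    real, imag = struct.unpack(prefix + "ff", raw)
    return complex(real, imag)


# Both components are exactly representable in single precision.
z = complex(1.5, -2.25)
le_bytes = pack_complex64(z, little_endian=True)
be_bytes = pack_complex64(z, little_endian=False)
round_trip = unpack_complex64(be_bytes, little_endian=False)
```

In both byte orders the first four bytes hold the real part; only the byte order within each four-byte component differs.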
To aid in using complex valued data, dirfile field codes may contain a
which specifies a function to apply to the
complex valued data to map it into purely real data. See
The Dirfile format was created by C. B. Netterfield
<email@example.com>. It is now maintained by D. V. Wiebe
For an introduction to the GetData Library reference implementation, see
(3), or visit the GetData Project