6. Documentation
Data archive documentation exists at two levels: one type of
documentation describes the data files as a whole, and the other
describes the variables within those files.
-
6.1. File documentation
The file documentation describes a dataset as a whole. A large
part of it deals with the origins of the data, such as the
principal investigators, the research institute and the sponsors.
Another section provides a detailed checklist of technical and
methodological background information, such as the data collection
method, sampling procedure, weighting and data format.
Together this is called the study description, because it refers
to the research project from which the data originated.
A number of archives have agreed on a standardised format for
describing their respective holdings: the Standardised Study
Description Scheme (1).
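As an illustration only, the elements of a study description listed
above could be held in a small record type. The sketch below is
hypothetical: the class name, the field names and the example values
are assumptions made for this example and do not reproduce the
Standardised Study Description Scheme itself.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class StudyDescription:
    """Hypothetical file-level record; field names are illustrative only."""

    # Origins of the data
    principal_investigators: List[str]
    research_institute: str
    sponsors: List[str] = field(default_factory=list)
    # Technical and methodological background checklist
    data_collection_method: str = ""
    sampling_procedure: str = ""
    weighting: str = ""
    data_format: str = ""

    def missing_fields(self) -> List[str]:
        """Return the names of checklist items that are still empty."""
        return [name for name, value in asdict(self).items() if not value]


# Invented example values, not taken from any real study.
study = StudyDescription(
    principal_investigators=["A. Researcher"],
    research_institute="Example Institute",
    sponsors=["Example Foundation"],
    data_collection_method="face-to-face interview",
    sampling_procedure="simple random sample",
)

print(study.missing_fields())
```

A helper such as missing_fields treats the description as a
checklist: an archive could use something similar to see which
background items a depositor has not yet supplied.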
-
6.2. Data documentation
Data documentation deals with the variables in a dataset. Most
datasets contain only numerically coded information that cannot be
understood without proper documentation.
The variables in a dataset are described extensively in a
codebook: what the codes of a variable mean, which question the
variable results from, which code is used as a missing code when
no information is available for a specific case, and so on.
The documentation can be added to the data in machine-readable
form. The major statistical software packages can combine the data
and the documentation in a single file for analysis.
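To make the idea of a machine-readable codebook concrete, here is a
minimal sketch. It assumes a codebook stored as a plain dictionary;
the variable names (V1, V2), the question texts, the codes and the
choice of 9 as the missing code are all invented for this example.
It does not show the file format of any particular statistical
package.

```python
# Hypothetical codebook: for each variable, the label, the question it
# results from, the meaning of each code, and the missing codes.
CODEBOOK = {
    "V1": {
        "label": "Sex of respondent",
        "question": "Q1. Are you male or female?",
        "values": {1: "male", 2: "female"},
        "missing": {9},
    },
    "V2": {
        "label": "Satisfaction with life",
        "question": "Q2. How satisfied are you with your life?",
        "values": {1: "dissatisfied", 2: "neutral", 3: "satisfied"},
        "missing": {9},
    },
}


def decode(record, codebook):
    """Translate one numerically coded case into readable labels.

    Missing codes become None; codes the codebook does not document
    are flagged rather than silently dropped.
    """
    decoded = {}
    for variable, code in record.items():
        entry = codebook[variable]
        if code in entry["missing"]:
            decoded[entry["label"]] = None
        else:
            decoded[entry["label"]] = entry["values"].get(
                code, f"undocumented code {code}"
            )
    return decoded


# A raw case as it might be stored in the data file: numbers only.
raw_case = {"V1": 2, "V2": 9}
decoded_case = decode(raw_case, CODEBOOK)
print(decoded_case)
```

Without the codebook, the raw case is just the numbers 2 and 9.
With it, the first value reads as "female" and the second is
recognised as a missing answer rather than a real score.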