Data archive workflow



The social science data archive step by step

Ekkehard Mochmann (Central Archive for Empirical Social Research, Cologne, FRG)
Paul de Guchteneire (UNESCO, Paris, France)
 


Contents:

1. Identification of datasets
2. Sources of data
3. Selection criteria
4. Data transfer to the archive
5. Data processing
6. Documentation
7. Storage
8. Information retrieval
9. Dissemination of data
10. Notes


6. Documentation

Data archive documentation operates at two levels: the level of the data files and the level of the variables within those files.

6.1. File documentation

The file documentation describes a dataset as a whole. A large part of it deals with the origins of the data, such as the principal investigators, the research institute and the sponsors. Another section provides a detailed checklist of technical and methodological background information, such as the data collection method, sampling procedure, weighting and data format.

This file-level documentation is called the study description, because it refers to the research project from which the data originated.

A number of archives have agreed on a standardised format for describing their respective holdings: the Standardised Study Description Scheme (1).
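As a rough illustration, a study description can be thought of as a structured record with one field per checklist item. The sketch below is a minimal Python example under stated assumptions: the class name, the field names and the sample values are hypothetical, chosen to mirror the items listed above, and are not the fields of the Standardised Study Description Scheme itself.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class StudyDescription:
    """File-level documentation for one dataset (hypothetical field names)."""
    # Origins of the data
    title: str
    principal_investigators: List[str]
    research_institute: str
    sponsors: List[str] = field(default_factory=list)
    # Technical and methodological checklist
    data_collection_method: str = ""
    sampling_procedure: str = ""
    weighting: str = ""
    data_format: str = ""

    def missing_fields(self) -> List[str]:
        """Return the names of checklist fields that are still empty."""
        return [name for name, value in asdict(self).items() if not value]


# Made-up example values, for illustration only.
study = StudyDescription(
    title="Example survey",
    principal_investigators=["A. Researcher"],
    research_institute="Example Institute",
    data_collection_method="face-to-face interview",
)
print(study.missing_fields())
```

A check such as `missing_fields()` is one way an archive might spot incomplete descriptions before a dataset is released; the real scheme defines its own set of required items.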

6.2. Data documentation

Data documentation deals with the variables in a dataset. Most datasets contain only numerically coded information that cannot be understood without proper documentation. The variables are described in detail in a codebook, which records what the codes of each variable mean, which question the variable derives from, which code marks a missing value when no information is available for a specific case, and so on.
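To make the role of a codebook concrete, the following Python sketch models a single codebook entry and uses it to decode raw numeric codes. The variable name `V12`, the question wording, the labels and the use of 9 as the missing code are all invented for illustration; they do not come from any real dataset.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class CodebookEntry:
    """Variable-level documentation for one variable (hypothetical layout)."""
    name: str                      # variable name in the data file
    question: str                  # question the variable results from
    value_labels: Dict[int, str]   # meaning of each numeric code
    missing_codes: Set[int] = field(default_factory=set)

    def decode(self, code: int) -> Optional[str]:
        """Translate a raw code into its label; missing codes become None."""
        if code in self.missing_codes:
            return None
        return self.value_labels.get(code, f"undocumented code {code}")


# Invented example entry.
v12 = CodebookEntry(
    name="V12",
    question="How satisfied are you with your job?",
    value_labels={1: "satisfied", 2: "neutral", 3: "dissatisfied", 9: "no answer"},
    missing_codes={9},
)

raw = [1, 3, 9, 2]                       # what the data file actually contains
decoded = [v12.decode(code) for code in raw]
print(decoded)
```

Without the entry, the values `1, 3, 9, 2` carry no meaning; with it, they can be read as answers and the missing code can be excluded from analysis.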

The documentation can be added to the data in machine-readable form. The large statistical software packages can combine the data and the documentation in a single file for analysis.
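The idea of keeping data and documentation together in one machine-readable file can be sketched as follows. This Python example uses a plain JSON container as a stand-in; it is an assumption made for illustration and is not the file format of any statistical package. The function names `bundle` and `load_labelled`, the variable `V1` and its codes are likewise hypothetical.

```python
import json

# Invented documentation and data for one variable.
codebook = {
    "V1": {
        "question": "Sex of respondent",
        "labels": {"1": "male", "2": "female"},  # JSON object keys are strings
        "missing": [9],
    },
}
cases = [{"V1": 1}, {"V1": 2}, {"V1": 9}]


def bundle(codebook: dict, cases: list) -> str:
    """Serialise documentation and data into one self-describing text."""
    return json.dumps({"documentation": codebook, "data": cases}, indent=2)


def load_labelled(text: str) -> list:
    """Read a bundle back and replace each code with its documented label."""
    package = json.loads(text)
    doc = package["documentation"]
    labelled = []
    for case in package["data"]:
        row = {}
        for var, code in case.items():
            entry = doc[var]
            if code in entry["missing"]:
                row[var] = None                      # missing value
            else:
                row[var] = entry["labels"][str(code)]
        labelled.append(row)
    return labelled


text = bundle(codebook, cases)
print(load_labelled(text))
```

Because the labels travel with the data, a recipient needs no separate printed codebook to interpret the file.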


 
Copyright © IFDOnet - All rights reserved - Contact - 11-05-2005