CSES Module 3 Data Set Errata
Posted: March 31, 2011

STATA storage types and binary-decimal precision

STATA users are likely already familiar with the binary-decimal precision issue that affects how STATA stores variables. Numbers provided to STATA as decimals are stored internally in a binary format, and most decimal fractions have no exact binary representation. For example, a value that appears in the raw data as "0.67" and is stored in single precision (STATA's "float" storage type) is held as approximately "0.67000001668930054". This issue is well documented by STATA and on the Internet.
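The effect is not specific to STATA and can be reproduced in any environment that uses IEEE 754 binary floating point. The following is a minimal sketch in Python, used here purely for illustration; the helper name "as_single" is hypothetical and is not part of any CSES or STATA file.

```python
import struct


def as_single(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision (4-byte) value.

    Hypothetical helper for illustration: packing with the "f" format
    forces the value through single precision, and unpacking returns it
    as an ordinary Python float (which is double precision).
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]


raw = 0.67                # the decimal value as written in the raw data
single = as_single(raw)   # the same value after single-precision storage

# Show both values to 17 decimal places so the difference is visible.
print(f"single precision: {single:.17f}")
print(f"double precision: {raw:.17f}")

# The two stored values are not equal, so an exact comparison fails.
print("equal:", single == raw)
```

The single-precision value differs from 0.67 in the eighth decimal place, while the double-precision value differs only in the seventeenth.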

In the STATA dictionary "cses3_columns.dct" that CSES provides, every variable that contains decimals in the raw data file "cses3_rawdata.txt" is set to be stored as data type "double". When reading the CSES dataset into STATA for the first time, users may wish to consider whether the data type "double" is appropriate for each of these variables in their analyses.
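The choice of storage type is a trade-off between size and precision: a double-precision value occupies 8 bytes and a single-precision value occupies 4. The sketch below, again in Python and for illustration only, reports the size of each format and whether the stored value still compares equal to the literal 0.67. The function name "storage_report" is hypothetical, and the two formats are assumed to correspond to the STATA storage types "float" and "double".

```python
import struct


def storage_report(value: float) -> dict:
    """Return, per format, the bytes used and whether value survives intact.

    Hypothetical helper for illustration. "<f" is IEEE 754 single
    precision and "<d" is IEEE 754 double precision.
    """
    report = {}
    for label, fmt in (("single", "<f"), ("double", "<d")):
        stored = struct.unpack(fmt, struct.pack(fmt, value))[0]
        report[label] = {
            "bytes": struct.calcsize(fmt),
            "equals_literal": stored == value,
        }
    return report


report = storage_report(0.67)
for label, info in report.items():
    print(label, info["bytes"], "bytes, equals 0.67:", info["equals_literal"])
```

Under these assumptions, double precision doubles the storage used per value but keeps exact comparisons against a typed decimal literal reliable, which is one consideration when deciding on a storage type for each variable.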

The CSES Secretariat would appreciate advice from the user community on ways to modify the STATA dictionary and syntax files so that they are optimized for use with the CSES data.