**............................................................................**
**                                                                            **
** COMPARATIVE STUDY OF ELECTORAL SYSTEMS (CSES)                              **
** Stata Example Syntax for Bridging polity-level data to CSES data:          **
** Bridging the CSES INTEGRATED MODULE DATASET (IMD) PHASE 3 Release to       **
** the V-Dem Core Dataset, Version 10                                         **
**                                                                            **
** CSES IMD Version:   IMD PHASE 3 RELEASE                                    **
** V-Dem Version:      V-Dem Core Dataset - Version 10                        **
** Default Directory:  c:\cses\CSES_data_bridging_project                     **
**                                                                            **
** Website:            www.cses.org                                           **
** Email:              cses@umich.edu                                         **
** Last Updated:       November 23, 2020                                      **
** Created by:         CSES Secretariat                                       **
**............................................................................**

********************************************************************************
********************************************************************************
**>>> SYNTAX FILE: SECTIONS AND TABLE OF CONTENTS
**    1: Instructions for Navigating Syntax File, Purpose of Syntax File, STATA
**       Setup, and Set Working Directory
**    2: Loading the CSES IMD Dataset & Preparing CSES IMD Dataset For Bridging
**       with variables from V-Dem Dataset
**    3: Loading the V-Dem Dataset & Preparing the V-Dem Dataset For Bridging
**       with variables from CSES IMD
**    4: Bridging the CSES IMD Dataset with the V-Dem Dataset
**    5: Finalizing the Bridged Dataset
********************************************************************************
********************************************************************************

********************************************************************************
********************************************************************************
**#>>> 1: INSTRUCTIONS FOR NAVIGATING SYNTAX FILE, PURPOSE OF SYNTAX FILE
**        STATA SETUP AND SET WORKING DIRECTORY
********************************************************************************
********************************************************************************

********************************************************************************
**\\\ 1.1 INSTRUCTIONS FOR NAVIGATING SYNTAX FILE
********************************************************************************

** #>>>    = Section Heading
*** \\\    = Sub-section Heading
** //      = Instruction Heading
** ACHTUNG = Highlighting any issues including potential file path changes
**           needed to successfully run the file.

********************************************************************************
**\\\ 1.2 PURPOSE OF SYNTAX FILE INSTRUCTIONS
********************************************************************************

** This syntax file is intended for less experienced users of STATA.
** For advanced STATA users, we recommend that you use the Light File
** which provides only minimal comments for commands employed in the file.
** This is available on the CSES Data Bridging Webpage - see Light File.

** This syntax outlines how to merge all variables included in the V-Dem Core
** Dataset, Version 10 to the CSES IMD Phase 3 Release dataset.
** Variables from V-Dem can also be linked to other CSES Data Products
** using this syntax example by specifying a different CSES data product.

********************************************************************************
**\\\ 1.3 CLEARING THE STATA WORKSPACE
********************************************************************************

** // CLOSE ALL OPEN DATA FILES, GRAPH WINDOWS, AND DIALOG BOXES
clear all

** // CLOSE ANY OPEN LOG-FILES
capture log close

********************************************************************************
**\\\ 1.4 DOWNLOAD THE CSES IMD PHASE 3 AND V-DEM DATASETS AND
**        SET THE WORKING DIRECTORY
********************************************************************************

** // DOWNLOAD THE CSES IMD PHASE 3 DATASET:
**    The current release of the CSES IMD Dataset is available via
**    https://cses.org/data-download/cses-integrated-module-dataset-imd/

** // DOWNLOAD THE V-DEM DATASET:
**    The current release of the V-Dem Core Dataset - Version 10 is available
**    via https://www.v-dem.net/en/data/data/v-dem-dataset/

** ACHTUNG: BEFORE RUNNING THE SYNTAX FILE, CSES strongly advises users to save
**          the CSES IMD and V-Dem datasets into the SAME(!) folder.
**          We suggest the following directory:
**          C:\cses\CSES_data_bridging_project

** // SPECIFY THE WORKING DIRECTORY
**    The following command sets the working directory, i.e. the folder from
**    which datasets are loaded into Stata and into which Stata places saved
**    datasets.
**
**    ACHTUNG: Running the syntax requires the CSES IMD and V-Dem datasets
**    to be placed in the working directory specified below. We suggest
**    the following:
cd "C:\cses\CSES_data_bridging_project"

********************************************************************************
**\\\ 1.5 PARAMETERS FOR DATA BRIDGING TO FUNCTION IN STATA
********************************************************************************

** In STATA, bridging data from two or more different sources requires key
** variables with unit identifiers that:
**    a) are included in both datasets
**    b) have the same variable name (i.e. column names should be identical)

** The following key variables are required for bridging CSES to V-Dem:
**    1) a variable identifying the polity
**    2) a variable identifying the election year

********************************************************************************
********************************************************************************
**#>>> 2: LOADING THE CSES IMD DATASET & PREPARING CSES IMD DATASET FOR
**        BRIDGING WITH VARIABLES FROM V-Dem

** The CSES IMD dataset includes the following identifiers required for
** bridging data to V-Dem:
**    1) Polity Identifier: IMD1006_VDem
**    2) Election Year Identifier: IMD1008_YEAR
********************************************************************************
********************************************************************************

** // LOAD CSES IMD DATASET
use cses_imd.dta, clear

** // TABULATE POLITY ID VARIABLE IN CSES FOR BRIDGING
**    (variable name in CSES: polity identifier IMD1006_VDem)
tabulate IMD1006_VDem, missing

** // TABULATE ELECTION YEAR ID VARIABLE IN CSES FOR BRIDGING
**    (variable name in CSES: time identifier IMD1008_YEAR: Election Year)
tabulate IMD1008_YEAR, missing

** // SAVE CSES IMD DATASET - CSES FILE READY FOR BRIDGING:
save cses_imd_formerging.dta, replace

********************************************************************************
********************************************************************************
**#>>> 3: LOADING THE V-Dem DATASET & PREPARING V-Dem DATASET
**        FOR MERGING WITH CSES IMD

** The V-Dem dataset includes the following identifiers required for
** bridging data to CSES:
**    1) Polity Identifier: country_id
**    2) Election Year Identifier: year

* However, for STATA to conduct data bridging, identifiers have to be named the
* same in the CSES and the V-Dem dataset.
* Therefore, the following lines generate
* identifiers named according to CSES Standards:
********************************************************************************
********************************************************************************

** // LOAD V-Dem DATASET
use "V-Dem-CY-Core-v10.dta", clear

** // CREATE POLITY ID VARIABLE IN V-Dem TO ENABLE MERGING WITH CSES
**    (variable name in V-Dem: country_id)
**    (Note: save country_id as IMD1006_VDem, which is the V-Dem identifier
**    variable in CSES IMD):
generate IMD1006_VDem = country_id

** // CREATE ELECTION YEAR ID VARIABLE IN V-Dem TO ENABLE MERGING WITH CSES
**    (variable name in V-Dem: year)
**    (Note: save year as IMD1008_YEAR, which is the election year identifier
**    variable in CSES IMD):
generate IMD1008_YEAR = year

** // SAVE V-Dem DATASET WITH NEWLY CREATED VARIABLES - FILE READY FOR BRIDGING
save VDem10_formerging, replace

********************************************************************************
********************************************************************************
** #>>> 4: MERGING THE CSES IMD AND V-Dem DATASETS TOGETHER
********************************************************************************
********************************************************************************

********************************************************************************
**\\\ 4.1 IMPORTANT NOTE ON MERGING VARIABLES IN STATA:
********************************************************************************

** When conducting a merge, STATA generates a new variable typically called
** _merge. This variable conventionally has the following three categories,
** labeled as "master only", "using only", and "matched". Tabulating this
** variable will give you an indication of the following:
**
** master only (1): identifies cases present in CSES Data only. The CSES data is
**                  called the "master data" because it was loaded into STATA
**                  when merging was performed.
** using only (2):  identifies cases present in V-Dem Data only. The V-Dem data
**                  is called the "using data" because it was added to the CSES
**                  data already loaded into Stata when merging was performed.
** matched (3):     identifies cases where data from V-Dem was merged
**                  successfully to CSES data.
**
** All cases that were merged successfully are assigned code "3. matched".

********************************************************************************
**\\\ 4.2 MERGING THE V-Dem DATA TO CSES IMD
********************************************************************************

** // LOADING THE SAVED CSES IMD DATASET INCLUDING IDENTIFIERS:
use cses_imd_formerging.dta, clear

** // MERGE V-Dem DATASET TO CSES IMD DATASET:
**    (Note: merging procedure specified as "many to one" (m:1) as all
**    respondents from the same election study are merged to exactly one
**    corresponding V-Dem Polity - Election Year).
**    Merge by polity (IMD1006_VDem) and by election year (IMD1008_YEAR):
merge m:1 IMD1006_VDem IMD1008_YEAR using "VDem10_formerging.dta"

** // INVESTIGATING THE MERGING PROCESS WITH STATA _merge VARIABLE:
tabulate _merge

* Note: all cases in CSES IMD have been merged successfully to V-Dem, as
* category "master only (1)" is empty.

********************************************************************************
********************************************************************************
* #>>> 5: FINALIZING THE BRIDGED DATASET
********************************************************************************
********************************************************************************

* After conducting the merge, the expanded dataset now includes all variables
* included in the V-Dem Core Dataset, Version 10.
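
* // OPTIONAL ILLUSTRATION (NOT PART OF THE ORIGINAL CSES SYNTAX):
*    A minimal sketch of how the m:1 merge above and the cleaning step that
*    follows behave, using small made-up data. All variable names (polity,
*    year, resp_id, score), the tempfile name toy_using and all values are
*    hypothetical and chosen for illustration only. preserve/restore leaves
*    the bridged dataset in memory untouched. Run these lines together as
*    part of the do-file, since they rely on local macros.
preserve

* Hypothetical "using" data: one row per polity-year
clear
input polity year score
1 2015 0.8
1 2019 0.7
2 2016 0.5
end
tempfile toy_using
save `toy_using'

* Hypothetical "master" data: several respondents per polity-year
clear
input resp_id polity year
1 1 2015
2 1 2015
3 2 2016
end

* Many-to-one merge: each respondent receives the score of its polity-year
merge m:1 polity year using `toy_using'
tabulate _merge

* Every respondent should find a polity-year, so "master only (1)" is empty
assert _merge != 1

* Polity-years without respondents appear as "using only (2)"
count if _merge == 2
local n_using_only = r(N)

* Same cleaning logic as in this section: drop unmatched polity-years
drop if _merge == 2
drop _merge
local n_final = _N
display "using-only rows: `n_using_only', rows kept: `n_final'"

restore
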
* In what follows, the data is cleaned in such a way that all polity-years
* not represented in CSES are removed from the final dataset:

** // DROP CASES ONLY PRESENT IN V-Dem DATA (NO CSES EQUIVALENT):
drop if _merge == 2

** // DROP MERGE CONTROL VARIABLE:
drop _merge

** // SAVING THE FINAL DATASET:
save cses_imd_vDem10_core.dta, replace

**************************************************
*END OF DO-FILE