
With the introduction of Demscore v3, you can merge datasets in a dyadic format with datasets in a country-year format.

Let’s say you want to run an analysis using variables from the UCDP Dyadic and the V-Dem Country-Year datasets. These variables are difficult to combine because the datasets record data at different levels of analysis: the former at the dyad-year level, and the latter at the country-year level. See the examples below:

[Table 1]

Table 1 shows an example of the UCDP Dyadic dataset in dyad-year format. It has three columns, 'dyad_id', 'year', and 'location', with values in four rows.

[Table 2]

Table 2 shows an example of the V-Dem Country-Year dataset. It has two columns, 'country' and 'year', with values in six rows.

Demscore v3 addresses this with the new dyad-location-year Output Format. It defines an Output Unit that can accommodate both datasets, as it includes one observation per location, dyad, and year.

[Table 3]

To create this unit, we expand the UCDP Dyadic dataset using the comma-separated values in the location column, creating one row per location, dyad, and year.

[Table 4]
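The expansion step can be sketched in a few lines of plain Python. This is only an illustration, not the Demscore implementation: the function name `expand_locations` and the rows are invented for the example, and only the column names 'dyad_id', 'year', and 'location' come from the tables above.

```python
# Hypothetical dyad-year rows; the values are invented for illustration.
dyad_year_rows = [
    {"dyad_id": 1, "year": 2000, "location": "Country A, Country B"},
    {"dyad_id": 2, "year": 2000, "location": "Country C"},
]


def expand_locations(rows):
    """Return one row per (dyad_id, location, year).

    Each comma-separated entry in 'location' becomes its own row;
    all other columns are copied unchanged.
    """
    expanded = []
    for row in rows:
        for location in row["location"].split(","):
            expanded.append({**row, "location": location.strip()})
    return expanded


dyad_location_year_rows = expand_locations(dyad_year_rows)
for row in dyad_location_year_rows:
    print(row)
```

In this sketch the first dyad lists two locations and therefore yields two rows, while the second yields one, so two dyad-year rows become three dyad-location-year rows.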

For variables coming from the V-Dem Country-Year dataset, we can now match years to years and countries to locations.

If certain observations of a variable have no match in the final Output Unit, they are assigned the value -11111 (“missing from merge”).
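The matching step might look roughly like the following sketch. It assumes one reading of the rule above, namely that a row in the Output Unit with no matching country-year value receives -11111. The variable name `example_index`, the function name, and all values are hypothetical, and matching on identical name strings is a simplification.

```python
MISSING_FROM_MERGE = -11111  # code for "missing from merge"

# Hypothetical rows already in dyad-location-year format.
dyad_location_year_rows = [
    {"dyad_id": 1, "location": "Country A", "year": 2000},
    {"dyad_id": 1, "location": "Country B", "year": 2000},
    {"dyad_id": 2, "location": "Country C", "year": 2000},
]

# Hypothetical country-year variable, keyed by (country, year).
# "Country C" is deliberately absent to show the missing code.
country_year_values = {
    ("Country A", 2000): 0.8,
    ("Country B", 2000): 0.3,
}


def attach_country_year_variable(rows, values, name):
    """Match countries to locations and years to years.

    Rows without a matching (country, year) entry receive
    MISSING_FROM_MERGE for the new variable.
    """
    merged = []
    for row in rows:
        key = (row["location"], row["year"])
        merged.append({**row, name: values.get(key, MISSING_FROM_MERGE)})
    return merged


merged_rows = attach_country_year_variable(
    dyad_location_year_rows, country_year_values, "example_index"
)
for row in merged_rows:
    print(row)
```

Here the rows for Country A and Country B pick up their values, and the row for Country C, which has no country-year counterpart, is filled with -11111.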

Disclaimer: Please note that we merge V-Dem countries to UCDP locations! A country in V-Dem is a political unit enjoying at least some degree of functional and/or formal sovereignty, while a location in UCDP is either the location in which an event takes place, or the country of the incompatibility/actor.
