Lost Observations

What gets lost in translation?

Demscore aims to improve transparency about which observations are lost during a translation at the level of output units. Let’s assume that we want to merge a V-Dem variable from the V-Dem Country-Year output unit with a QoG variable from the QoG Country-Year output unit.

While Algeria 1991 and 1992 have missing values in the original V-Dem variable and the variable has no data on France 2000 and 2001, the V-Dem Country-Year output unit covers Country-Year observations beyond those available in the QoG Country-Year output unit. These additional identifier combinations get “lost” in the translation. We can create a merge report file that lists which identifier combinations did not have a match among the identifier combinations in the chosen end output unit. For this specific example, the file would contain a table looking as follows:

For the first full version of Demscore, this page provides files listing which identifier combinations from the original output unit do not have a match in a chosen end output unit. This comparison is not as fine-grained as we aim to be in the future, but comparing output unit identifiers is still suitable for indicating the overarching differences in country definitions across modules. For this version of Demscore, you can find the lost identifier combinations for merges between country-year output units below.
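
As a rough illustration of the idea behind such a merge report, the following Python sketch computes which identifier combinations of a starting output unit have no match in an end output unit. It is not Demscore's actual implementation: the function name `lost_identifiers`, the representation of identifiers as (country, year) tuples, and the toy data are all assumptions made for this example.

```python
from typing import Iterable, List, Tuple

# Assumption: an identifier combination is a (country, year) pair.
Identifier = Tuple[str, int]


def lost_identifiers(
    start_unit: Iterable[Identifier],
    end_unit: Iterable[Identifier],
) -> List[Identifier]:
    """Return the identifier combinations of start_unit that have no
    match in end_unit, sorted by country and then year."""
    end_ids = set(end_unit)
    # Keep only identifiers that are absent from the end output unit.
    return sorted({ident for ident in start_unit if ident not in end_ids})


# Invented toy identifiers (not real V-Dem or QoG coverage).
start_unit = [("A", 2000), ("A", 2001), ("B", 2000), ("B", 2001)]
end_unit = [("A", 2000), ("A", 2001), ("B", 2001), ("C", 2001)]

report = lost_identifiers(start_unit, end_unit)
for country, year in report:
    print(country, year)
```

Note that the result depends on the direction of the translation: swapping the two arguments reports the identifier combinations that exist only in the other output unit.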