Data analysis

All statistics were computed using sampling weights estimated to represent the urban and/or rural population of the countries surveyed.

Ten plausible values were used for estimating measurement variance.

The following replication methods were used to estimate sampling variance: jackknife repeated replication, JK2, and Fay’s variant of balanced repeated replication (FAY variant).
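
As an illustration of how plausible values and replicate weights are typically combined, the following is a minimal sketch in Python. It is not the STEP production code. The function names, the default scaling factor, and the choice to average the sampling variance over all plausible values are assumptions made for this example; the replicate estimates are assumed to have been computed beforehand by re-running the statistic with each set of replicate weights.

```python
from statistics import mean


def weighted_mean(values, weights):
    """Weighted mean of a variable, using the survey sampling weights."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def replicate_variance(full_estimate, replicate_estimates, factor=1.0):
    """Sampling variance from replicate estimates.

    Sums the squared deviations of each replicate estimate from the
    full-sample estimate and multiplies by a scaling factor.
    Assumption: factor = 1 for a paired jackknife (JK2); for Fay's
    method with G replicates and Fay coefficient k, the usual factor
    is 1 / (G * (1 - k) ** 2).
    """
    return factor * sum((r - full_estimate) ** 2 for r in replicate_estimates)


def combine_plausible_values(pv_estimates, pv_sampling_variances):
    """Combine per-plausible-value results with Rubin's rules.

    pv_estimates: the statistic computed once per plausible value.
    pv_sampling_variances: the sampling variance of each of those estimates.
    Returns (point_estimate, total_variance).
    """
    m = len(pv_estimates)
    point = mean(pv_estimates)
    # Assumption: sampling variance is averaged over all plausible values.
    sampling_var = mean(pv_sampling_variances)
    # Between-plausible-value variance captures measurement uncertainty.
    measurement_var = sum((e - point) ** 2 for e in pv_estimates) / (m - 1)
    total_var = sampling_var + (1 + 1 / m) * measurement_var
    return point, total_var
```

In use, the statistic would be computed once per plausible value (ten times here), each with its own replicate-based sampling variance, and the results passed to combine_plausible_values; the standard error is the square root of the returned total variance.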

Secondary analysis

To measure the three types of skills accurately, the STEP skills surveys measure most of the cognitive, behavioral, personality-trait, and job-relevant sub-domains through multiple items. The first step of any analysis of these data is therefore to combine this information into an indicator for each sub-domain. Two principles guide this step:

  • The first principle is that researchers should not innovate when the sub-domain has been empirically validated.
  • The second principle is that simple scales should be preferred.
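
As an example of the kind of simple scale the second principle favors, the sketch below averages the items that belong to one sub-domain. The item names, the list of reverse-coded items, and the 1 to 4 response range are hypothetical and chosen only for illustration; the actual item wording and coding should be taken from the STEP questionnaires.

```python
def scale_score(responses, reverse_items=(), min_value=1, max_value=4):
    """Average the answered items of one sub-domain into a single score.

    responses: dict mapping item name -> numeric answer, or None if missing.
    reverse_items: names of items worded in the opposite direction.
    min_value, max_value: assumed bounds of the response scale.
    Returns None when no item was answered.
    """
    scored = []
    for item, value in responses.items():
        if value is None:
            continue  # skip missing answers instead of treating them as zero
        if item in reverse_items:
            # Flip the answer so that a high value always means "more".
            value = min_value + max_value - value
        scored.append(value)
    if not scored:
        return None
    return sum(scored) / len(scored)


# Hypothetical respondent with three items, one reverse-coded, one missing.
example = {"item_a": 4, "item_b": 1, "item_c": None}
example_score = scale_score(example, reverse_items={"item_b"})
```

A plain average keeps the score on the same range as the original items, which makes it easy to interpret and to compare across sub-domains.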

Rather than simply aggregating the sub-domains into a single measure, researchers might want to identify a limited set of relevant sub-domains from the full battery based on a particular (scientific) criterion. Here we can identify three different empirical strategies:

  • Selection based on association
  • Selection based on malleability
  • Selection based on predictability

Scores obtained for the various questions measuring an individual’s skills can be transformed into interpretable objects.

For secondary analyses, the following tools have been designed to help SAS, Stata, or SPSS users explore the data and compute statistics:

IEA IDB Analyzer

WesVar (Wesvar User Guide, 2007)

International Data Explorer

Types of data files

One data file for each country, with household roster, individual background questionnaires, all derived variables and scales, achievement responses to assessment items (and timing), and scores

Format(s) of data files

Item release policy

All information is publicly available and can be found on our STEP website.

The STEP collection currently hosts data collected between March 2012 and July 2014 in Armenia, Bolivia, Colombia, Georgia, Azerbaijan, Ghana, Lao PDR, Macedonia, Sri Lanka, Ukraine, Vietnam, and the Yunnan Province in China. More countries will be added as data become available.