Historical development
While international large-scale assessments (ILSAs) are now regarded by many as a regular feature of the educational assessment landscape, they remain a relatively recent phenomenon. Their origins can be traced back to the IEA pilot survey, an early investigation into student performance conducted in the 1960s by the International Association for the Evaluation of Educational Achievement (IEA). Since that time there have been significant developments (this text is partly based on Wagemaker, 2014, pp. 17):
Number of organizations
The number of organizations responsible for the development, management, and conduct of ILSAs has grown from one to nine major players in the field. Their studies are presented on the Gateway, and they include CONFEMEN, the IEA, the IDB, the OECD, SEACMEQ, SEAMEO, UNICEF, UNESCO, and the World Bank.
Numbers of participating entities
Numbers of participating entities (e.g., countries, economies, regions) have increased considerably from 12 in the IEA’s First International Mathematics Study (FIMS) in 1964 up to 57 in PIRLS 2021, 66 in TIMSS 2023, and 81 in PISA 2022. At present, about 70% of the countries in the world are estimated to participate in ILSAs (Lietz, Cresswell, Rust, & Adams, 2017, p. 1).
Target populations
The populations investigated have been expanded so that ILSAs cover a broad range from preprimary to postsecondary. A large number of studies investigate students enrolled in primary and/or secondary schools and some also extend the investigation to include teachers, principals, and parents of the target age group (e.g., in ERCE, ICCS, ICILS, PASEC, PIRLS, PISA, REDS, SACMEQ, SEA-PLM, and TIMSS). Other studies focus on teachers in their own right, e.g., current primary teachers and their principals (TALIS) or future primary and lower secondary mathematics teachers and their educators (TEDS-M). ILSAs undertaken with individuals outside the school context can focus, for example, on young children (e.g., IELS, PRIDI, and TALIS Starting Strong) or adults (PIAAC, STEP).
Study domains
The domains that are the focus of investigation, regardless of whether the study includes an assessment or not, have evolved from mathematics in FIMS to cover a broad range of areas: mathematics or numeracy (ERCE, IELS, PASEC, PIAAC, PISA, SACMEQ, SEA-PLM, TEDS-M, TIMSS), but also science (ERCE, PISA, TIMSS), various aspects related to reading and language (ERCE, IELS, PASEC, PIAAC, PIRLS, PISA, PRIDI, SACMEQ, SEA-PLM, STEP), civic education and global citizenship (ICCS, SEA-PLM), HIV/AIDS knowledge (SACMEQ), and, more recently, computer and information literacy and computational thinking (ICILS), creative thinking (PISA), problem-solving (PIAAC, PISA), and financial literacy (PISA). Some studies deal with socio-emotional and motor skills of young children (IELS, PRIDI), teachers, their teaching and teaching environments (PASEC, TALIS), or teachers’ pedagogical content knowledge (PASEC, TEDS-M).
From input toward output
A shift in focus from an input toward an output/outcomes orientation has occurred: Early reform efforts in education focused on concerns related to inputs and the challenges of ensuring equity in terms of school enrollments. However, with greater recognition of the effects of globalization and economic competitiveness and the greater concerns for equity of learning outcomes (e.g., in terms of what students know and can do), ILSAs are increasingly seen as a necessary condition for monitoring and understanding the outcomes of the significant investments that all nations make in education.
ILSAs also play a crucial role in monitoring Sustainable Development Goal (SDG) 4, aimed at enhancing global education quality under the UN Education 2030 Agenda. They provide essential cross-national data on educational achievements and contextual factors, helping to track progress, particularly toward SDG Target 4.1, which seeks to ensure that by 2030, all children complete free, equitable, and quality primary and secondary education that leads to relevant and effective learning outcomes.
Measurement methodologies
Measurement methodologies have evolved considerably: ILSAs such as PISA, PIRLS, and TIMSS have their roots in the methodologies of the long-term trend assessment NAEP (National Assessment of Educational Progress) in the USA. Its methodologies have been adapted and extended to meet the challenges of these other studies which assess educational achievement beyond national boundaries, ensuring, for example, comparability of test items while operating with a growing number of participating countries and languages.
ILSAs have also made important progress in other methodological areas, such as sampling, instrument development and validation, and scaling (Lietz et al., 2017, p. 16). As far as analytical procedures are concerned, early studies employed classical item analysis for score calculation, whereas more recent studies use item response theory (IRT) to analyze cognitive outcome data.
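The difference between the two analytical approaches can be illustrated with a minimal sketch. It assumes a tiny, made-up matrix of scored responses and uses the Rasch (one-parameter) model as one common example of an IRT model; the function names and data are hypothetical, and the sketch does not reproduce the scaling procedure of any particular study.

```python
import math

def classical_difficulty(responses):
    """Classical item analysis: proportion of correct answers per item."""
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_students for i in range(n_items)]

def rasch_probability(theta, b):
    """Rasch (1PL) model: probability that a person with ability `theta`
    answers an item of difficulty `b` correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Hypothetical scored responses: rows are students, columns are items (1 = correct).
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]

# Classical view: difficulty depends on the particular sample that took the test.
p_values = classical_difficulty(responses)

# IRT view: ability and difficulty sit on one common scale, so the same item
# parameter yields different success probabilities for different abilities.
p_average = rasch_probability(theta=0.0, b=0.0)
p_stronger = rasch_probability(theta=1.0, b=0.0)
```

In the classical approach, the item statistics are tied to the group of test takers, whereas an IRT model separates person ability from item difficulty, which is one reason it is suited to linking results across booklets, countries, and study cycles.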
Technical standards
Over the years, the organizations conducting ILSAs have developed technical standards describing minimum requirements, and the reports of results for each study cycle are typically accompanied by comprehensive technical documentation, which provides critical guidance for data interpretation and the implementation of secondary analyses. The provision of technical documents, labeled, for example, “Methods and Procedures” (in PIRLS and TIMSS), “Technical Report” (e.g., in ICCS, ICILS, PIAAC, PISA, and TALIS), or directly “Technical Standards” (SEA-PLM), is one of the selection criteria for inclusion of an ILSA here.
Future challenges
Regardless of the high level of excellence that ILSAs have achieved, there is still room for improvement. Important challenges identified at the start of the Gateway in 2017 (Lietz et al., 2017, pp. 17) have been addressed over time. While computer-based assessment (CBA) remained optional in some recent study cycles (e.g., ICCS 2022, TALIS 2018), other investigations have now transitioned to (almost) full CBA or digital formats (e.g., PISA 2022, TIMSS 2023, PIRLS 2026). To meet the needs and facilitate the participation of middle- and low-income countries, the IEA initiated the LaNA study, while the OECD developed PISA for Development (PISA-D), which has since been incorporated into the regular PISA cycle, as well as TALIS +, which uses partially adapted instruments from the TALIS 2024 cycle.
Additionally, a few new studies have been launched to explore new assessment domains. For instance, in response to the immediate need, the IEA rapidly developed REDS to investigate how disruptions from the COVID-19 pandemic had affected teaching and learning. Meanwhile, the OECD has initiated PISA VET to assess the skills of learners completing initial vocational education and training programs.
Purposes
Major objectives of ILSAs, especially those undertaken in the school context, include improving education quality and equity, as well as serving the increasing demand worldwide for greater accountability for the investments made in educational provision. In general, ILSAs share common objectives that either explicitly or implicitly include one or more of the following elements:
- Provision of high-quality data to improve policymakers’ understanding of key school-based and non-school-based factors influencing teaching and learning
- Provision of high-quality data as a resource for identifying areas of concern and action and for preparing and evaluating educational reforms
- Development and improvement of the capacity of educational systems to engage in national strategies for educational monitoring and improvement (Wagemaker, 2014, pp. 13)
ILSAs are typically organized as cross-national and cross-sectional studies, providing information about a population and area of interest for a specific point in time in the participating countries or regions so that the participating entities can learn from each other. However, comparisons across countries, and often across cultures, involve considerable challenges (Lietz et al., 2017). Therefore, measuring trends within countries is an even more important objective for many ILSAs; these are conducted at regular intervals (e.g., PISA every three years; SEA-PLM and TIMSS every four years; ICCS, ICILS, IELS, PASEC, PIRLS, and TALIS every five years) so that regularly participating countries can compare their own results over time and make informed decisions for improving their education systems. Discussions are also ongoing about longitudinal investigations, which have been conducted, for example, as optional specifications, such as in ICILS 2018 with the "ICILS Teacher Panel 2020".
International large-scale assessments have also made major contributions to education research. They have contributed to educational theory, for example, in terms of model building and testing, and helped to build and strengthen a worldwide community of researchers in educational evaluation. The notion that educational reform and improvement, rather than assessing and testing as such, are the goals of ILSAs creates the imperative to ensure that the data gathered are readily accessible and used (Wagemaker, 2014, p. 14). This Gateway supports these endeavors by facilitating the location of ILSA resources.
Dealing with differences
ILSAs share common objectives and features, but there are also differences. In addition to the above-mentioned differences in terms of the number of participating entities, domains under investigation, or target populations, the studies may vary in the following ways:
- Some ILSAs (e.g., ICCS, ICILS, IELS, PASEC, PIRLS, PISA, SEA-PLM, TALIS, TIMSS) adopt a cyclical, trend approach, repeating the study at more or less regular intervals with revised and improved versions of the previous data collection instruments and partly the same and partly different participating countries. Other ILSAs (e.g., PIAAC, STEP) operate in waves or rounds, applying the same set of instruments to different groups of countries at different points in time.
- Some studies (e.g., ERCE, PASEC, PIRLS, SACMEQ, SEA-PLM, TIMSS) use a curriculum-based approach, assessing student learning after a fixed period of schooling, whereas others (e.g., IELS, PISA, PIAAC, STEP) use an age- and skills-based approach (Wagemaker, 2014, p. 14).
- The majority are conducted as international studies with participants from around the world. But there are also transnational studies with a more regional focus, such as ERCE (Latin America and the Caribbean), PASEC (Francophone Africa), SACMEQ (Anglophone Africa), PRIDI (Latin America), and SEA-PLM (Southeast Asia).
Numerous countries participate in more than one of these ILSAs, either in parallel or alternately, considering them complementary approaches to the investigation of learning outcomes and educational provision (Wagemaker, 2014, p. 19).
For the first time, these studies are gathered on a single platform: the ILSA Gateway. As the overarching service for educational ILSAs, the Gateway offers comprehensive information on all ILSAs in a standardized format while maintaining the characteristics of each individual study. These informational texts are complemented by hyperlinks, allowing for fast and easy access to documents, data, and other resources on the external study websites. The Gateway services are intended to encourage the exchange of knowledge and materials, to inspire future research, and to contribute to the further development of the ILSAs themselves.
Authors
Hans Wagemaker, IEA Executive Director (1997–2014)
Nathalie Mertes, ILSA Gateway Production Manager and Project Lead, IEA Hamburg
References
Lietz, P., Cresswell, J. C., Rust, K. F., & Adams, R. J. (2017). Implementation of large-scale education assessments. In P. Lietz, J. C. Cresswell, K. F. Rust, & R. J. Adams (Eds.), Wiley series in survey methodology. Implementation of large-scale education assessments (pp. 1–25). Chichester, United Kingdom: John Wiley & Sons.
Wagemaker, H. (2014). International large-scale assessments: From research to policy. In L. Rutkowski, M. von Davier, & D. Rutkowski (Eds.), Statistics in the social and behavioral sciences series. Handbook of international large-scale assessment. Background, technical issues, and methods of data analysis (pp. 11–36). Boca Raton: CRC Press.