INTERNATIONAL LARGE-SCALE ASSESSMENTS IN EDUCATION:
A BRIEF GUIDE
COMPASS
BRIEFS IN EDUCATION
NUMBER 10 SEPTEMBER 2020
SUMMARY
Internaonal large-scale assessments (ILSAs) are one
of the most important tools policymakers and other
educaonal stakeholders have to inform evidence-
based decision making for educaonal reform. Despite
this, and the widespread use of ILSA data, results are
somemes misunderstood or misinterpreted. Here, we
oer a brief guide to ILSAs and illuminate some of the
important dierences and commonalies within and
across studies, limitaons, and why they remain one
of our most signicant tools for educaon evaluaon
and reform. We focus on and compare the key studies,
approaches, and structure of our own organizaon,
the Internaonal Associaon for the Evaluaon of
Educaonal Achievement (IEA), with other ILSAs.
IMPLICATIONS
IEA conducted the first ILSA study in 1958, but the mandates
for ILSAs have changed over their history and new mandates
are constantly developing. Most recently, focus has shifted
to educational outcomes rather than inputs. ILSA results
should be understood in the context of their changing remit.
There are substantial differences between organizations
and approaches to assessments. From the review process to
decision-making and fees, fundamental differences should
be accounted for when interpreting results.
Key ILSAs do share a common methodology marked by high-
quality standards carefully defined for each step in the ILSA
process; further, the data are accompanied by detailed
supporting technical reports to assist in interpreting and
reporting results.
Limitaons of ILSAs need to be acknowledged when
understanding and reporng results. In parcular, care must
be taken when considering results from ILSAs as a holisc
quality measure of the educaon system.
Despite limitaons, the assessments are unique, monitoring
systems over me within a robust internaonal framework,
being largely independent of any single polical system,
and with data freely available to the public. When properly
understood and analyzed, the data from ILSAs provide
valuable opportunies to help inform policy decisions and
research.
Internaonal Associaon for the Evaluaon of Educaonal
Achievement (IEA), Amsterdam.
Website: www.iea.nl
Follow us:
@iea_educaon
IEAResearchInEducaon
IEA
The most well-known ILSAs are the core studies of the International Association for the Evaluation of
Educational Achievement (IEA) and the Organisation for Economic Co-operation and Development (OECD).
INTRODUCTION
Internaonal large-scale assessments (ILSAs) of educaon
are empirical studies that assess educaonal abilies around
the world. The data are used in various ways to help inform
policymakers, educaonal researchers, and the general public.
However, despite the widespread use of ILSA data, how to
interpret and report results is oen misunderstood. Dierent
study results are somemes reported or, interpreted to mean
the same thing, yet there exists important dierences that need
to be accounted for when using results. As leaders in the eld of
internaonal large-scale assessment, our intenon for this brief
is to provide context for a beer understanding of ILSA results
and; how they should be interpreted and reported—we discuss
what ILSAs are, the history of their development, dierences
as well as commonalies in approaches, organizaon, and
methodology, and important limitaons—and to express why we
believe that ILSAs are unique, important, and relevant tools for
understanding educaonal systems and student achievement
around the world, and informing evidence-based change.
WHAT ARE ILSAs?
ILSAs assess student achievement in specific disciplines and
provide context for the results by collecting additional data
at the student level. Further contextual details at the teacher,
principal, and/or system levels may also be collected. To
provide statistically valid results, a representative sample of
schools (usually around 150 to 200 schools) is drawn from
each participating country or education system, and a group of
students is randomly drawn from within each of the sampled
schools, either by sampling entire classrooms or by sampling
students across classrooms.
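This two-stage design (schools first, then students within each sampled school) can be sketched in a few lines of code. The population, school names, and sample sizes below are invented for illustration; operational ILSA sampling is more elaborate (for example, stratified, probability-proportional-to-size school selection), so treat this only as a conceptual sketch.

```python
import random

def two_stage_sample(schools, n_schools=150, students_per_school=30, seed=42):
    """Illustrative two-stage sample: first draw schools, then draw
    students within each sampled school.

    `schools` maps a school id to its list of student ids. This is a
    simple random draw at both stages, not an operational ILSA design.
    """
    rng = random.Random(seed)
    sampled_schools = rng.sample(sorted(schools), min(n_schools, len(schools)))
    sample = {}
    for school in sampled_schools:
        students = schools[school]
        k = min(students_per_school, len(students))
        sample[school] = rng.sample(students, k)  # students across classrooms
    return sample

# Toy population: 500 schools with 40 to 120 students each
population = {f"school_{i}": [f"s{i}_{j}" for j in range(40 + i % 81)]
              for i in range(500)}
selected = two_stage_sample(population)
```

Sampling entire classrooms instead (as TIMSS does at the school level) would replace the second stage with a random draw over classroom lists rather than over individual students.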
BRIEF HISTORY
The rst ILSA, IEA’s Pilot Twelve-Country Study (Foshay et al.
1962), was launched by a group of researchers in 1958 (Husén
1983; IEA 2018). Scholars from various disciplines met at the
UNESCO Instute for Educaon in Hamburg (Germany) and
decided to launch a then exploratory study to test whether it
was possible to compare learning outcomes across a range of
dierent countries and cultures. They chose to assess student
achievement in mathemacs, assuming that it would be easiest
to translate into dierent languages and was thus, more likely
to result in valid comparisons across countries. Their aim
was to nd out what could be learned through internaonal
assessment, with the hope that countries could learn from
each other. As Torsten Husén phrased it: “In general terms,
internaonal studies such as this one can enable educaonalists
(and ulmately those responsible for educaonal planning and
policy making) to benet from the educaonal experiences
of other countries. It helps educaonalists to view their own
system of educaon more objecvely because for the rst me
many of the variables related to educaonal achievement had to
be quaned in a standardized way” (Husén 1967, pp. 13-14).
While the rst ILSAs were conducted by researchers with
quite minimal resourcing to sasfy academic interests in
invesgang educaon, by 1990 educaonal policymakers
had begun to realize that ILSAs could potenally provide useful
evidence-based data. However, their nancial support for ILSAs
demanded rapid outcomes. Where the rst academic study
reports were somemes launched up to eight years aer the
study was conducted (Anderson et al. 1989), this wider interest
led to a new pressure to publish results as soon as possible. An
addional consequence was an interest in measuring trends
in educaon systems, a challenge that IEA rose to meet with
TIMSS, rst conducted in 1995 and followed up by a second
Source: OECD 2019; IEA 2019
IEA Internaonal Computer and
Informaon Literacy Study (ICILS)
OECD Programme for Internaonal
Student Assessment (PISA)
IEA Trends in Internaonal Mathemacs
and Science Study (TIMSS)
IEA Progress in Internaonal
Reading Literacy Study (PIRLS)
IEA Internaonal Civic and
Cizenship Educaon Study (ICCS)
3
trend assessment four years later in 1999. The four year gap
was chosen because TIMSS assessed grade four and grade
eight students in 1995; in 1999, only grade eight students were
assessed, the aim being to compare two cohorts of grade eight
students’ results, and assess the same cohort of grade four
students four years later.
In the 1990s there was also an emerging international interest
in strong educational data to better understand economic
growth. As such, the OECD, one of the leading international
economic organizations, placed a greater emphasis on
education and the measurement of educational outcomes, and
launched its own assessment study, PISA, in 2000. Previously,
the OECD had only used other sources (including IEA studies
and its annual publication Education at a Glance) for its
educational work. However, the organization decided to focus
on the skills needed to operate in a modern economy rather
than on assessing what schools were teaching. Specifically, PISA
claims to assess what the OECD believes 15-year-old students
enrolled in school should know in reading, mathematics, and
science literacy; the assessment is conducted every three
years. Originally, the study focused on OECD countries, but an
increasing number of non-OECD countries now take part in the
assessment.
Toward the end of the 1990s and into the first decade of
the 2000s, regional ILSAs were initiated, including LLECE
(Laboratorio Latinoamericano de Evaluación de la Calidad de la
Educación) in Latin America, PASEC (Programme d'Analyse
des Systèmes Educatifs de la CONFEMEN), and SACMEQ
(the Southern and Eastern Africa Consortium for Monitoring
Educational Quality).
More recently, since 2015 when the Sustainable Development
Goals (SDGs) were declared by the UN, a new emphasis for
assessments has emerged. In contrast to the Millennium
Development Goals, the SDGs focus on the educational
outcomes of education systems rather than on inputs like
expenditure on education. As a result, all countries are urged
to report on the percentage of students reaching minimum
proficiency levels, which clearly constitutes a new challenge for
ILSAs. In this regard, IEA is playing an active role through the
implementation of the Rosetta Stone Project¹, in collaboration
with the UNESCO Institute for Statistics, LLECE, and PASEC.
The objective is to develop a concordance table that translates
scores from regional mathematics and reading assessments
onto the TIMSS and PIRLS scales.
DIFFERENT APPROACHES =
DIFFERENT ASSESSMENTS
Although the best-known ILSAs feature a number of similarities,
there are also some substantial differences that need to be
considered when comparing the results for different educational
systems. In the following table we have compiled some notable
differences that test consumers need to be aware of when
comparing the results of studies conducted by IEA and the OECD.
1. hps://www.iea.nl/studies/addionalstudies/rosea
Study philosophy
  IEA studies: Seeks to measure what is taught in schools and the
  contexts of learning.
  PISA: Seeks to measure selected acquired skills of students towards
  the end of their compulsory education.

Content selection
  IEA studies: The curricula of the participating countries are
  analyzed. Participating countries then jointly develop the assessment
  framework and test materials to ensure national interests are
  acknowledged.
  PISA: OECD-selected experts determine the skills that they think
  students should have mastered for use later in life and assemble the
  study instruments accordingly.

Cohort selection
  IEA studies: Samples are grade based (grades four, eight, or twelve)
  to reflect the structure of curricula and to establish a direct link
  to the subject teachers.
  PISA: Samples are age based (15-year-olds for PISA) to assess a
  generation, independently of school pathways or grade distribution.

Table 1: Differences in approaches between IEA studies and PISA
COMMON METHODOLOGIES
Despite dierences in organizaon and approach, ILSAs
share a quite similar set of procedures and methods for
implementaon that have been developed and rened over
the last 60 years, and have contributed to methodological
advances in regional ILSAs and naonal assessment programs
in many countries. Drawing from experse from around
the world, all major ILSAs mandate high-quality standards
carefully dened to achieve each step in the ILSA process
including: sampling rules, translaon processes, eld trial
procedures, and psychometric modelling. These methods
may appear quite complex to those unfamiliar with ILSAs,
but, for transparency, the data are accompanied by detailed
supporng technical reports and all ILSA data is made freely
available online for researchers to download. Further, both
IEA and OECD produce a large number of resources to help
promote the secondary analyses of the data. These secondary
analyses are possible and made rich by the presence of
contextual data. A recent example is ICILS where contextual
data helped to inform student digital literacy.
DIFFERENT ORGANIZATION
Beyond the studies themselves, the two organizations,
IEA and the OECD, differ in many regards (see Table 2).
Most fundamentally, their missions are not equivalent when it
comes to the relationship between results and policymaking.
The scope of the OECD is larger than the field of education,
being originally grounded in the economic sector, and one OECD
mission is to give its members recommendations on the
policies to be implemented. IEA, as an association grounded
in academic research in education, has no mandate to draft
recommendations to its members but rather provides
evidence on which each individual country can build adequate
policies for its own context. The following table
describes some of the organizational differences between
IEA and the OECD.
Type of organization
  IEA: Non-governmental association (with national members).
  OECD: Governmental organization.

Method of carrying out ILSAs
  IEA: Conducts studies cooperatively with partner organizations from
  its scientific network.
  OECD: Initiates studies and tenders the studies for each cycle.
  Separates out different tasks (e.g., framework development, sampling,
  test platform).

Review process
  IEA: All study publications are rigorously peer-reviewed.
  OECD: Written and reviewed in-house and reviewed by board members of
  participating countries.

Participation
  IEA: Open to all countries. No requirement to take part in studies.
  OECD: The flagship PISA study was originally targeted at OECD member
  countries and is now open to all economies.

Decision making
  IEA: Final content decided by the National Research Coordinators
  (NRCs) of participating countries, each of which has an equal voice.
  OECD: The organization structure reflects the original centrality of
  the OECD membership, and voting is restricted to OECD members and
  select partner economies.

Fees
  IEA: Each country makes an equal contribution to the international
  coordination costs.
  OECD: Member countries pay different contributions based on their
  GDP. Non-member participants pay a flat fee.

Table 2: Organizational differences between IEA and OECD
LIMITATIONS
There are some important limitations to consider when
understanding and reporting ILSA results. Firstly, all ILSAs
are cross-sectional studies: they measure educational
outcomes at one point in time for a specific population.
Further, although some ILSAs such as TIMSS, PIRLS, and PISA
report trends over time, they are not studies that follow
individual students. This makes it challenging, if not unfeasible,
to draw causal conclusions about student achievement, and it
is a clear limit of the data for the many policymakers seeking
specific recommendations for change to help improve their
educational systems. Although some researchers seek to use
advanced models to establish causal relationships, we
maintain that such models rest on strong assumptions
that are difficult, if not impossible, to satisfy with ILSA data.
A second limitaon of ILSAs is that the studies are not
designed to measure individual students’ achievement nor
the results of individual schools but to reect the educaonal
results and relaonship with background informaon within
educaon systems. As such, the assessments are considered
low stakes for schools, teachers, and students. Nevertheless,
the low stakes aspect of the assessment has the advantage
of lowering tesng stress on students and schools. In fact,
ILSAs require a fairly short tesng me when compared to
high stakes assessments and are only administered to a small
representave poron of the populaon in the countries.
Finally, the content domains covered by ILSAs are not an
exhaustive list of what is taught in schools. For example,
IEA's TIMSS focuses on mathematics and science curriculum
attainment at grades four and eight and, while these are critical
subjects, strong attainment in them alone cannot be considered
a reliable measure of the overall health of an education
system. Some have criticized ILSAs as exerting undue influence
on education policies, with national approaches perceived as
being displaced by a tendency to target national curricula
toward better achievement in the subjects assessed by ILSAs.
Consequently, care must be taken when considering results
from ILSAs as a holistic quality measure of an education system;
rather, the focus should be on the results as important indicators
of what students know and can do in specific subjects, and of
how students compare nationally to their peers internationally.
WHY ILSAs ARE IMPORTANT
Despite limitaons, ILSAs are vital tools for educaon
system improvement. The assessments are unique in
that the informaon provided by the data can be used to
monitor systems over me within a robust internaonal
framework, and they fall outside the governance of any
one country thus being largely independent of any single
polical system. Further, the data is available to the public
allowing researchers from around the world to explore the
data and make use of it for their own research quesons.
In many countries, research resulng from ILSAs has improved
our understanding of how educaonal systems operate,
informing policy decisions that are based on strong and reliable
evidence. Various impact studies have shown that ILSA results
have been used to support policymaking (e.g., Breakspear 2012;
Schwippert & Lenkeit 2012; Wagemaker 2013) and as reported
in the TIMSS and PIRLS Encyclopedias various educaonal
improvements have been smulated by evidence from ILSAs.
When properly understood and analyzed, the data from ILSAs
provide valuable opportunies to help inform policy decisions
as well as research into educaon system improvement.
ABOUT THE AUTHORS

DR DIRK HASTEDT
Dr Dirk Hastedt is the Executive Director of
IEA. He oversees IEA's operations, studies,
and services, and drives IEA's overall strategic
vision. Moreover, he develops and maintains
strong relationships with member countries,
researchers, policymakers, and other key
players in the education sector. Dr Hastedt
also serves as co-editor-in-chief of the IEA-
ETS Research Institute (IERI) journal Large-
scale Assessments in Education.

DR THIERRY ROCHER
Dr Thierry Rocher was elected to the IEA Chair
position at the 59th General Assembly meeting
in October 2018. Dr Rocher previously served
as a Standing Committee member and General
Assembly representative for France. He is the
Head of the Office for Student Assessment
(DEPP) at the Ministère de l'éducation
nationale in France.
WHAT NEXT?
As ILSAs connuously modernize—most recently with a move
toward computer-based assessments—new methodological
opportunies and challenges await. This is why organizaons
conducng ILSAs are acvely promong research in the
eld of internaonal assessment. For example, IEA sponsors
academic journals, conferences, and themac reports for
outside researchers as well as employing its own research
team to help further developments in the eld of assessment.
More generally, in its renewed strategy, IEA is placing a
strong emphasis on research and innovaon, including,
for example, the promising topic of “process data”(i.e.,
digital traces le by students when passing an assessment).
Another focus of ILSA development is the exploraon of larger
and more complex dimensions, so-called “21st century skills.
IEA has recently launched a curriculum study (21CS MAP) which
aims to map these skills. This fundamental study will be based
on what is taught in schools (the intended and implemented
curricula) before developing a concrete assessment program.
CONCLUSION
In an increasingly interconnected world, we believe ILSAs can
help us learn from others and, through comparison, better
understand ourselves. IEA works diligently to assist the
policy and research community by training researchers and
policymakers in how to interpret and analyze data, by writing
and commissioning in-depth reports on study findings,
and by publishing quarterly briefs that are intended to be
short, digestible summaries of interesting study results. Such
activities are undertaken in support of IEA's mission "to better
understand education practices, processes, and policies in
order to improve the quality of teaching and learning within
and across systems of education." All IEA studies and reports
follow the highest academic standards for social science
research, including complete transparency about the testing
process and robust peer review for studies and reports.
REFERENCES

Anderson, L.W., Ryan, D.W., & Shapiro, B.J. (1989). The IEA classroom environment study. Oxford, UK: Pergamon Press.

Breakspear, S. (2012). The policy impact of PISA: An exploration of the normative effects of international benchmarking in school system performance. OECD Education Working Papers No. 71. Paris, France: OECD Publishing.

Foshay, A.W., Thorndike, R.L., Hotyat, F., Pidgeon, D.A., & Walker, D.A. (1962). Educational achievements of thirteen-year-olds in twelve countries: Results of an international research project, 1959–1961. Hamburg, Germany: UNESCO Institute for Education. https://unesdoc.unesco.org/ark:/48223/pf0000131437

Husén, T. (1967). International study of achievement in mathematics: A comparison of twelve countries. Vol. I–II. Stockholm, Sweden and New York, NY: Almqvist & Wiksell and John Wiley.

Husén, T. (1983). An incurable academic: Memoirs of a professor. Oxford, UK: Pergamon Press.

IEA. (2018). 60 years of IEA 1958–2018. Amsterdam, the Netherlands: Author. https://www.iea.nl/index.php/publications/60-years-iea-1958-2018

IEA. (2019). IEA studies [webpage]. https://www.iea.nl/studies/ieastudies

Lockheed, M. (2011). Reflections on IEA from the perspective of a World Bank official. In C. Papanastasiou, T. Plomp, & E. Papanastasiou (Eds.), IEA 1958–2008: 50 years of experiences and memories, Vol. 2. Nicosia, Cyprus: Cultural Center of the Kykkos Monastery.

Lockheed, M., Prokic-Bruer, T., & Shadrova, A. (2015). The experience of middle-income countries participating in PISA 2000–2015. Paris, France: PISA/OECD Publishing.

OECD. (2019). PISA: Programme for International Student Assessment. http://www.oecd.org/pisa/

Schwippert, K., & Lenkeit, J. (Eds.). (2012). Progress in reading literacy in national and international context: The impact of PIRLS 2006 in 12 countries. Münster, Germany: Waxmann Verlag GmbH.

Travers, K., & Westbury, I. (1989). The IEA study of mathematics I: Analysis of the mathematics curricula. Amsterdam, the Netherlands: International Association for the Evaluation of Educational Achievement (IEA).

Wagemaker, H. (2013). International large-scale assessments: From research to policy. In L. Rutkowski, M. von Davier, & D. Rutkowski (Eds.), Handbook of international large-scale assessment: Background, technical issues, and methods of data analysis (p. 11). Florence, KY: Chapman and Hall/CRC.
ADDITIONAL RESOURCES
Hastedt, D. (2016, November 7). The history and development of international assessments [Podcast]. FreshEd Podcast 49. http://www.freshedpodcast.com/dirkhastedt/

Lockheed, M., & Wagemaker, H. (2013). International large-scale assessments: Thermometers, whips or useful policy tools? Research in Comparative and International Education, 8(3), 296–306.

Singer, J.D., Braun, H.I., & Chudowsky, N. (2018). International education assessments: Cautions, conundrums, and common sense. Washington, DC: National Academy of Education.
ABOUT IEA
The Internaonal Associaon for the Evaluaon of Educaonal Achievement,
known as IEA, is an independent, internaonal consorum of naonal
research instuons and governmental agencies, with headquarters in
Amsterdam. Its primary purpose is to conduct large-scale comparave
studies of educaonal achievement with the aim of gaining more in-depth
understanding of the eects of policies and pracces within and across
systems of educaon.
Thierry Rocher
IEA Chair
Dirk Hastedt
IEA Executive Director
Andrea Netten
Director of IEA Amsterdam
Gina Lamprell
IEA Publications Officer
Compass Editor
David Rutkowski
Indiana University
Please cite this publication as:
Rocher, T., & Hastedt, D. (2020, September). International large-scale assessments in education: A brief guide.
IEA Compass: Briefs in Education No. 10. Amsterdam, the Netherlands: IEA.
Copyright © 2020 International
Association for the Evaluation of
Educational Achievement (IEA)
All rights reserved. No part of this
publication may be reproduced,
stored in a retrieval system or
transmitted in any form or by any
means, electronic, electrostatic,
magnetic tape, mechanical,
photocopying, recording or
otherwise without permission in
writing from the copyright holder.
ISSN: 2589-70396
Copies of this publicaon can be
obtained from:
IEA Amsterdam
Keizersgracht 311
1016 EE Amsterdam
The Netherlands
Website: www.iea.nl