A comparison of kernel equating methods based on NEAT design

Periodical: Eurasian Journal of Educational Research
Year: 2019
Issue number: 82
Page range: 27-44
Access date: November 20, 2019
Relates to study/studies: TIMSS 2015

Abstract

Problem Statement: Equating is a statistical process that adjusts for differences between test forms of similar content and difficulty so that the scores obtained from these forms can be used interchangeably. The literature offers many equating methods, one of which is Kernel equating. The Trends in International Mathematics and Science Study (TIMSS) assesses the knowledge and skills of fourth- and eighth-grade students in mathematics and science. TIMSS uses different test forms, and these forms are equated through common items.

Purpose of the Study: This research aimed to compare the equated scores produced by Kernel equating (KE) methods, namely the chained and post-stratification equipercentile and linear equating methods, under the non-equivalent groups with anchor test (NEAT) design.

Methodology: TIMSS Science data were used in this study. The sample consisted of 865 eighth-grade examinees who were given Booklets 1 and 14 during the TIMSS administration in Turkey. Booklet 1 contained 39 items and Booklet 14 contained 38 items. First, descriptive statistics were calculated, and the two booklets were equated under the NEAT design using the Kernel chained and Kernel post-stratification equipercentile and linear equating methods. Second, the equating methods were evaluated against criteria such as the difference that matters (DTM), percent relative error (PRE), standard error of equating (SEE), standard error of equating difference (SEED), and root mean squared difference (RMSD).
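
To make the core idea of Kernel equating concrete, the following is a minimal sketch of its two central steps: continuizing a discrete score distribution with a Gaussian kernel, and then mapping a score on form X to form Y through the equipercentile relation e_Y(x) = G^{-1}(F(x)). It is not the procedure used in the study. The score values, probabilities, and bandwidths are invented toy numbers, the function names are hypothetical, and the sketch deliberately omits log-linear presmoothing and the anchor-based chained and post-stratification steps that the NEAT design requires.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mean_and_variance(scores, probs):
    """Mean and variance of a discrete score distribution."""
    mean = sum(x * p for x, p in zip(scores, probs))
    var = sum((x - mean) ** 2 * p for x, p in zip(scores, probs))
    return mean, var


def kernel_cdf(x, scores, probs, h):
    """Gaussian-kernel continuized CDF with bandwidth h.

    The factor a rescales the kernel mixture so that the continuized
    distribution keeps the mean and variance of the discrete one.
    """
    mean, var = mean_and_variance(scores, probs)
    a = math.sqrt(var / (var + h * h))
    return sum(
        p * normal_cdf((x - a * xj - (1.0 - a) * mean) / (a * h))
        for xj, p in zip(scores, probs)
    )


def kernel_equate(x, x_scores, r, y_scores, s, h_x, h_y, tol=1e-9):
    """Equate score x on form X to the scale of form Y by bisection."""
    target = kernel_cdf(x, x_scores, r, h_x)
    lo, hi = min(y_scores) - 1.0, max(y_scores) + 1.0
    # Widen the bracket until it contains the target percentile.
    while kernel_cdf(lo, y_scores, s, h_y) > target:
        lo -= hi - lo
    while kernel_cdf(hi, y_scores, s, h_y) < target:
        hi += hi - lo
    # Bisection on the monotone continuized CDF of Y.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, y_scores, s, h_y) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy data (assumed): form Y has the same shape as form X, shifted by 2.
x_scores = [0, 1, 2, 3, 4]
r = [0.1, 0.2, 0.4, 0.2, 0.1]
y_scores = [2, 3, 4, 5, 6]
s = [0.1, 0.2, 0.4, 0.2, 0.1]
equated = [kernel_equate(x, x_scores, r, y_scores, s, 0.6, 0.6) for x in x_scores]
```

Because the toy form Y is just form X shifted by two points, each equated score should land two points above its X score, which gives a quick sanity check. With a very large bandwidth the same function approaches linear equating, which is how KE covers both the equipercentile and linear variants.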

Findings and Results: The results of the equipercentile and linear equating methods were consistent with each other except at the high end of the score scale. PRE values showed that the KE equipercentile equating methods better matched the discrete target distribution Y, and the distribution of SEED showed that, judged against the DTM, the KE equipercentile and linear methods did not differ significantly from each other.
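
The evaluation criteria named above can be illustrated with a short sketch. The abstract does not give the exact formulas or thresholds used in the study, so the following are assumptions: PRE is taken as the percent difference between the p-th moment of the equated scores and that of the target distribution Y, RMSD as an optionally weighted root mean squared difference between two equating functions, and the DTM as half a raw-score point, a commonly used value. All function names are hypothetical.

```python
import math


def pth_moment(values, probs, p):
    """p-th raw moment of a discrete distribution."""
    return sum((v ** p) * w for v, w in zip(values, probs))


def percent_relative_error(equated_x, r, y_scores, s, p):
    """PRE for moment p: how far the equated-score moment is from Y's."""
    mu_eq = pth_moment(equated_x, r, p)
    mu_y = pth_moment(y_scores, s, p)
    return 100.0 * (mu_eq - mu_y) / mu_y


def rmsd(e1, e2, weights=None):
    """Root mean squared difference between two equating functions.

    Unweighted by default; pass score probabilities to weight it.
    """
    if weights is None:
        weights = [1.0] * len(e1)
    total = sum(w * (a - b) ** 2 for a, b, w in zip(e1, e2, weights))
    return math.sqrt(total / sum(weights))


def exceeds_dtm(e1, e2, dtm=0.5):
    """Flag score points where two methods differ by more than the DTM."""
    return [abs(a - b) > dtm for a, b in zip(e1, e2)]


# Toy equated scores from two hypothetical methods (assumed values).
method_a = [1.0, 2.0, 3.0]
method_b = [1.2, 2.5, 3.6]
flags = exceeds_dtm(method_a, method_b)
```

In this toy case only the highest score point is flagged, which mirrors the kind of pattern reported above, where the methods agree except at the high end of the scale.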