Detecting differential item functioning across age groups of children on the Turkish receptive language test

Erol Korkmaz, Habibe Tuğba
Stark, Stephan
Kazak Berument, Sibel
Güven, Ayşe Gül
This study investigated the use of differential item functioning (DIF) methods for examining the measurement invariance of items in the Turkish Receptive Language Test for children. Two groups of children differing in age were compared. DIF analyses were conducted using Lord’s chi-square test, the likelihood ratio test, and the differential functioning of items and tests (DFIT) method. Overall, 5 out of 38 items were consistently identified as having DIF by the various parametric DIF detection methods. Because the directions of DIF varied across items, the net effect of DIF at the scale level was small, as indicated by comparisons of test characteristic curves, as well as item response theory-based effect size statistics, which showed that the observed mean differences across age groups were due mainly to impact. Potential explanations for the consistencies and variations in items detected across DIF methods, as well as the statistical versus practical significance of DIF results, are discussed.
International Journal of Educational and Psychological Measurement


A micro-analytic investigation into EFL teachers' language test item reviewing interactions
Can, Hümeyra; Hatipoğlu, Çiler; Department of English Language Teaching (2020-9)
This study brings an interactional perspective to the construction of syllabus-based language tests and the stage of item reviewing (IR) in particular by using Conversation Analysis (CA). Drawing on a corpus of video-recordings of IR sessions (25 hours) in an English preparatory school at a state university in Turkey, it investigates how EFL teachers review language test items prepared for their students in and through interaction with the item writer who is one of the teachers assigned in the testing offic...
The Effects of Test Length and Sample Size on Item Parameters in Item Response Theory
Sahin, Alper; ANIL, DUYGU (2017-02-01)
This study investigates the effects of sample size and test length on item-parameter estimation in test development utilizing three unidimensional dichotomous models of item response theory (IRT). For this purpose, a real language test comprised of 50 items was administered to 6,288 students. Data from this test was used to obtain data sets of three test lengths (10, 20, and 30 items) and nine different sample sizes (150, 250, 350, 500, 750, 1,000, 2,000, 3,000 and 5,000 examinees). These data sets were the...
A study of the predictive validity of the Başkent University English proficiency exam through the use of the two-parameter IRT model's ability estimates
Yapar, Taner; Berberoğlu, Halil Giray; Department of Educational Sciences (2003)
The purpose of this study is to analyze the predictive power of the ability estimates obtained through the two-parameter IRT model on the English Proficiency Exam administered at Baskent University in September 2001 (BUSPE 2001). As prerequisite analyses the fit of one- and two-parameter models of IRT were investigated. The data used for this study were the test data of all students (727) who took BUSPE 2001 and the departmental English course grades of the passing students. At the first stage, whether the ...
Validity of science items in the student selection test in Turkey
Uygun, Nazlı; Berberoğlu, Halil Giray; Department of Elementary Science and Mathematics Education (2008)
This thesis presents content-related and construct-related validity evidence for science sub-tests within the Student Selection Test (SST) in Turkey by examining the content, cognitive processes, item characteristics, factorial structure, and group differences based on high school type. A total of 126,245 students from six types of school were included in the SST 2006 data. Reliability Analysis, Item Analysis, Principal Component Analysis (PCA) and one-way ANOVA have been carried out t...
The predictive validity of Başkent University proficiency exam (BUEPE) through the use of the three-parameter IRT model's ability estimates
Yeğin, Oya Perim; Berberoğlu, Halil Giray; Department of Educational Sciences (2003)
The purpose of the present study is to investigate the predictive validity of the BUEPE through the use of the three-parameter IRT model's ability estimates. The study made use of the BUEPE September 2000 data which included the responses of 699 students. The predictive validity was established by using the departmental English courses (DEC) passing grades of a total number of 371 students. As for the prerequisite analysis the best fitted model of IRT was determined by first, checking the assumptions of IRT...
Citation Formats
H. T. Erol Korkmaz, S. Stark, S. Kazak Berument, and A. G. Güven, “Detecting differential item functioning across age groups of children on the Turkish receptive language test,” International Journal of Educational and Psychological Measurement, pp. 81–94, 2012.