The relative importance of persons, items, subtests and languages to TOEFL test variance

James Dean Brown

University of Hawaii at Manoa, brownj@hawaii.edu

The purpose of this project was to explore how various numbers of persons, items, subtests and languages, together with their interactions, contribute to TOEFL score dependability (which is analogous to classical theory reliability). To this end, three research questions were formulated: (1) What are the characteristics of the score distributions, and how high are the classical theory reliability estimates for the whole test and its subtests? (2) For each of the 15 languages, what are the relative contributions to test variance of persons, items, subtests and their interactions? (3) Across all 15 languages, what are the relative contributions to test variance of persons, items, subtests and languages, as well as their various interactions?

The study sampled 15 000 test takers, 1000 from each of 15 different language backgrounds, out of the 24 500 participants in the TOEFL generic data set, which was itself a sample from the May 1991 worldwide administration of the TOEFL. The test was administered under normal operational conditions and included all three subtests: (1) Listening Comprehension, (2) Structure and Written Expression, and (3) Vocabulary and Reading Comprehension.

The analyses included descriptive statistics, classical theory reliability estimates, and a series of generalizability studies conducted to isolate the variance components due to persons, items, subtests and languages, and their effects on the dependability of the test. In contrast to previous research, the results here indicate that, when considered in concert with other important sources of variance (persons, items and subtests), language differences alone account for only a very small proportion of TOEFL test variance. These results should prove useful to test developers and researchers interested in the relative effects of such factors on test design.
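
As an illustration of what isolating variance components involves, the following is a minimal sketch of a generalizability study for the simplest case: a single-facet, fully crossed persons x items design estimated from ANOVA mean squares. It is not the study's own analysis, which also modeled subtests and languages. The function names and the tiny score matrix are hypothetical and made up for illustration only. The sketch also computes Cronbach's alpha as a classical theory reliability estimate; for this particular design, alpha coincides with the relative generalizability coefficient.

```python
from statistics import mean, variance


def g_study_p_by_i(scores):
    """Variance components for a crossed persons x items design.

    `scores` is a list of rows (persons), each a list of item scores.
    Returns estimated components for persons ("p"), items ("i") and the
    residual ("pi,e": person-by-item interaction confounded with error).
    """
    n_p, n_i = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    person_means = [mean(row) for row in scores]
    item_means = [mean(col) for col in zip(*scores)]

    # Sums of squares for the two main effects and the residual.
    ss_p = n_i * sum((m - grand) ** 2 for m in person_means)
    ss_i = n_p * sum((m - grand) ** 2 for m in item_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_res = ss_total - ss_p - ss_i

    ms_p = ss_p / (n_p - 1)
    ms_i = ss_i / (n_i - 1)
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))

    # Solve the expected-mean-square equations; negative estimates are
    # set to zero here, which is one common convention.
    return {
        "p": max((ms_p - ms_res) / n_i, 0.0),
        "i": max((ms_i - ms_res) / n_p, 0.0),
        "pi,e": ms_res,
    }


def dependability(components, n_items):
    """Relative (G) and absolute (phi) coefficients for n_items items."""
    rel_error = components["pi,e"] / n_items
    abs_error = (components["i"] + components["pi,e"]) / n_items
    g_coef = components["p"] / (components["p"] + rel_error)
    phi = components["p"] / (components["p"] + abs_error)
    return g_coef, phi


def cronbach_alpha(scores):
    """Classical internal-consistency estimate from item and total variances."""
    k = len(scores[0])
    item_vars = [variance(col) for col in zip(*scores)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 5 persons x 4 dichotomously scored items.
toy_scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

components = g_study_p_by_i(toy_scores)
g_coef, phi = dependability(components, n_items=4)
alpha = cronbach_alpha(toy_scores)

total = sum(components.values())
for name, value in components.items():
    print(f"{name}: {value:.4f} ({100 * value / total:.1f}% of total)")
print(f"G = {g_coef:.3f}, phi = {phi:.3f}, alpha = {alpha:.3f}")
```

Expressing each component as a percentage of the total is what allows the relative importance of the sources of variance to be compared. Calling `dependability` with a different `n_items` gives a rough picture of how dependability would change with test length, under the assumption that the estimated components stay fixed.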

Language Testing, Vol. 16, No. 2, 217-238 (1999)
DOI: 10.1177/026553229901600205


This article has been cited by other articles:

Xiaoming Xi
Evaluating analytic scoring for the TOEFL® Academic Speaking Test (TAST) for operational use
Language Testing, April 1, 2007; 24(2): 251-286.

M. L. Abbott
A confirmatory approach to differential item functioning on an ESL reading assessment
Language Testing, January 1, 2007; 24(1): 7-36.

S. Zhang
Investigating the relative effects of persons, items, sections, and languages on TOEIC score dependability
Language Testing, July 1, 2006; 23(3): 351-369.

M. Kim
Detecting DIF across the different language groups in a speaking test
Language Testing, January 1, 2001; 18(1): 89-114.

L. F. Bachman
Modern language testing at the turn of the century: assuring that what we count counts
Language Testing, January 1, 2000; 17(1): 1-42.