vocd: A theoretical and empirical evaluation

Philip M. McCarthy

University of Memphis, USA, pmmccrth{at}memphis.edu

Scott Jarvis

Ohio University, USA

A reliable index of lexical diversity (LD) has remained stubbornly elusive for over 60 years. Meanwhile, researchers in fields as varied as stylistics, neuropathology, language acquisition, and even forensics continue to use flawed LD indices — often ignorant that their results are questionable and in some cases potentially dangerous. Recently, an LD measurement instrument known as vocd has become the virtual tool of the LD trade. In this paper, we report both theoretical and empirical evidence that calls into question the rationale for vocd and also indicates that its reliability is not optimal. Although our evidence shows that vocd's output (D) is a relatively robust indicator of the aggregate probabilities of word occurrences in a text, we show that these probabilities — and thus also D — are affected by text length. Malvern, Richards, Chipere and Durán (2004) acknowledge that D (as calculated by vocd's default method) can be affected by text length, but claim that the effects are not significant for the ranges of text lengths with which they are concerned. In this paper, we explain why D is affected by text length, and demonstrate with an extensive empirical analysis that the effects of text length are significant over certain ranges, which we identify.
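The text-length problem described above can be illustrated with a minimal sketch. It does not implement vocd or D. It uses the simple type-token ratio (distinct words divided by total words) as a stand-in LD index, which is an assumption made here for illustration and is not named in the abstract. The toy text and the function name `type_token_ratio` are likewise hypothetical.

```python
def type_token_ratio(tokens):
    """Return distinct word types divided by total tokens (a simple LD index)."""
    if not tokens:
        raise ValueError("tokens must be non-empty")
    return len(set(tokens)) / len(tokens)


# Hypothetical toy text: a fixed 10-word vocabulary repeated 20 times (200 tokens).
vocabulary = "the cat sat on a mat and saw one dog".split()
text = vocabulary * 20

# Measure the same text at increasing lengths; the vocabulary never changes,
# yet the index keeps falling as more tokens are included.
ttr_by_length = {n: type_token_ratio(text[:n]) for n in (10, 50, 100, 200)}

for n, ttr in ttr_by_length.items():
    print(f"{n:>4} tokens: TTR = {ttr:.2f}")
```

Because the vocabulary is fixed while the token count grows, the ratio shrinks with every longer sample. This is the kind of length sensitivity that any LD index, including D, has to be checked against.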

Language Testing, Vol. 24, No. 4, 459-488 (2007)
DOI: 10.1177/0265532207080767



