Perhaps no other word better illustrates the extent to which questions of usage are often largely a matter of fashion. In Latin, data is of course a plural, and until fairly recent times virtually all authorities insisted, often quite strenuously, that it be treated as such in English. Thus "The data was fed into a computer program known as SLOSH" (New Yorker) should be "The data were fed..."

The problem is that etymology doesn't always count for much in English. If it did, we would have to write, "My stamina aren't what they used to be" or "I've just paid two insurance premia." For centuries we have been adapting Latin words to fit the needs and patterns of English. Museums, agendas, stadiums, premiums, and many others are freely, and usually unexceptionably, inflected on the English model, not the Latin one.

Indeed, many users of English show an increasing tendency to treat all Latin plurals as singulars, even those that have traditionally been treated as plural, most notably criteria, media, phenomena, strata, and data. With the first four of these the impulse is probably better resisted, partly as a concession to convention, but also because a clear and useful distinction can be made between the singular and plural forms. In stratified rock, for instance, each stratum is clearly delineated. In any list of criteria, each criterion is distinguishable from every other. Media suggests --- or ought to suggest --- one medium and another medium and another. In each case the elements that make up the whole are invariably distinct and separable.

But with data such distinctions are much less evident. This may be because, as Professor Randolph Quirk has suggested, we have a natural inclination to regard data as an aggregate; that is, as a word in which we perceive the whole more immediately than the parts. Just as we see a bowlful of sugar as a distinct entity rather than as a collection of granules (which is why we don't say, "Sugar are sweet"), we tend to see data as a complete whole rather than one datum and another datum and another. In this regard it is similar to news (which some nineteenth-century users actually treated as a plural) and information.

The shift is clearly in the direction of treating data as a singular, as The New Yorker and several other publications have decided to do. Personally, and no doubt perversely, I find that I have grown more attached to data as a plural with the passage of time. I think there is more elegance and precision in "More data are needed to provide a fuller picture of the DNA markers" (Nature) than in "The data by itself is vacuous" (New York Times). But that is no more than my opinion.

Whichever side you come down on, it is worth observing that the sense of data is generally best confined to the idea of raw, uncollated bits of information, the sort of stuff churned out by computers, and not extended to provide a simple synonym for facts or reports or information, as it was in this New York Times headline: "Austria magazine reports new data on Waldheim and Nazis." The "data" on inspection proved to be evidence and allegations, words that would have more comfortably fit the context, if not the headline space.

--- From Bryson's Dictionary of Troublesome Words

Bill Bryson
©2002, Broadway Books