A study based on the use of the word data in scholarly articles

Data is one of the most used terms in scientific vocabulary. This article by Frédérique Bordignon (École des Ponts, Marne-la-Vallée) and Marion Maisonobe (Géographie-cités, CNRS), published in Quantitative Science Studies, examines the relationship between data and research by analyzing the contexts in which the word data occurs in a corpus of 72,471 research articles (1980–2012) from two distinct fields (social sciences and physical sciences).

The aim is to shed light on the issues raised by research on data: the difficulty of defining what counts as data, the transformations that data undergo during the research process, and how data gain value for the researchers who hold them. Relying on the distribution of occurrences throughout the texts and over time, the study shows that the word data mostly occurs at the beginning and end of research articles. Adjectives and verbs accompanying the noun data turn out to be even more important than data itself in specifying what data are at stake. The increased use of possessive pronouns toward the end of articles reveals that authors tend to claim ownership of their data at the very end of the research process. The study also shows that even though data-handling operations are increasingly frequent, they are still described with imprecise verbs that do not reflect the complexity of these transformations.
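To make the idea of a "distribution of occurrences throughout the texts" concrete, here is a minimal Python sketch. It is not the authors' pipeline: the function names `position_histogram` and `possessive_before`, the naive letter-only tokenizer, and the short list of possessive forms are all assumptions chosen for illustration. The first function counts how often a word falls in each equal-sized slice of a text; the second counts how often the word is immediately preceded by a possessive form.

```python
import re

# Assumed, non-exhaustive list of English possessive forms (illustrative only).
POSSESSIVES = {"my", "our", "your", "his", "her", "its", "their"}


def tokenize(text):
    """Naive tokenizer: lowercase the text and keep runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())


def position_histogram(text, word="data", bins=10):
    """Count occurrences of `word` in each of `bins` equal slices of the text."""
    tokens = tokenize(text)
    counts = [0] * bins
    for i, tok in enumerate(tokens):
        if tok == word:
            # Map the token index to a slice index between 0 and bins - 1.
            counts[min(i * bins // len(tokens), bins - 1)] += 1
    return counts


def possessive_before(text, word="data"):
    """Count occurrences of `word` directly preceded by a possessive form."""
    tokens = tokenize(text)
    return sum(
        1
        for prev, tok in zip(tokens, tokens[1:])
        if tok == word and prev in POSSESSIVES
    )


if __name__ == "__main__":
    sample = (
        "Our data are new. We clean the sample and fit a model. "
        "Results follow from our data."
    )
    print(position_histogram(sample, bins=2))
    print(possessive_before(sample))
```

Applied to every article in a corpus and summed per slice, a histogram of this kind would show whether a word clusters at the beginning and end of texts. A real analysis would need proper tokenization and part-of-speech tagging, which this sketch deliberately leaves out.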

The word data serves as a rhetorical base and draws on the context for its meaning, relying on the properties conveyed by the adjectives and verbs associated with it. To echo Gitelman (2013): while data can never be raw, the word data is. It serves only as a rhetorical basis until the context, and mainly the adjectives, realize its potential through the properties they convey.

Reference: Frédérique Bordignon, Marion Maisonobe. Researchers and their data: A study based on the use of the word data in scholarly articles. Quantitative Science Studies 2022; 3 (4): 1156–1178.