Opinion: Mutations of citations

Just like genetic information, citations can accumulate heritable mutations

Few scientific studies have attracted as much attention as "Cleavage of structural proteins during the assembly of the head of bacteriophage T4", published 40 years ago by Uli Laemmli (http://www.nature.com/nature/journal/v227/n5259/abs/227680a0.html). With an estimated 2 x 10^5 references to it (about 15 citations a day), it is unavoidable that the article is often cited incorrectly. Indeed, database searches reveal more than 600 variations of the correct reference (ISI database, http://wok.mimas.ac.uk/).
Figure 1A. Sequence alignment of the correct citation (#1) and a selection of
citation variants (#2-10), comprising the author's name, journal, volume, first
page number and year of publication. Sequence identity is indicated in grey.
Wrong citations (WCs) contain errors in the sequence of letters and numbers that make up the correct citation, including the name of the author or journal, the volume and page numbers or the year of publication (see examples listed in Fig. 1A). The omission, addition or replacement of one character on the keyboard by another leads to variations that can be described in genetic terms and classified as deletions, insertions, point mutations and inversions of characters, or as complete nonsense mutations.
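The classification above can be illustrated with a small sketch. This is not a method from the article: the function name `classify_variant` is hypothetical, it only recognises single-character events, and treating everything else as a "nonsense" mutation is an assumption made for illustration. The page numbers 680, 608, 630 and 681 are the examples given in Fig. 1B; "68" and "6800" are invented inputs.

```python
def classify_variant(correct: str, variant: str) -> str:
    """Label how `variant` differs from `correct`, using the article's genetic terms.

    Hypothetical helper: only single-character events are recognised;
    anything more complex is labelled "nonsense" (an assumption).
    """
    if variant == correct:
        return "correct"

    # Deletion: the variant equals the correct string with one character removed.
    if len(variant) == len(correct) - 1:
        if any(correct[:i] + correct[i + 1:] == variant for i in range(len(correct))):
            return "deletion"

    # Insertion: removing one character from the variant restores the correct string.
    if len(variant) == len(correct) + 1:
        if any(variant[:i] + variant[i + 1:] == correct for i in range(len(variant))):
            return "insertion"

    if len(variant) == len(correct):
        diffs = [i for i, (a, b) in enumerate(zip(correct, variant)) if a != b]
        # Point mutation: exactly one position carries a different character.
        if len(diffs) == 1:
            return "point mutation"
        # Inversion: two positions have swapped their characters.
        if len(diffs) == 2:
            i, j = diffs
            if correct[i] == variant[j] and correct[j] == variant[i]:
                return "inversion"

    return "nonsense"


if __name__ == "__main__":
    # 680 is the correct first page; "68" and "6800" are invented examples.
    for page in ["680", "608", "630", "681", "68", "6800"]:
        print(page, "->", classify_variant("680", page))
```

Applied to the page number alone, the sketch labels 608 as an inversion and both 630 and 681 as point mutations. Distinguishing a replacement by similar shape (630) from one by similar value (681) would need extra information that this sketch does not model.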
Figure 1B. Incidence of spontaneous WCs in which
the page number is incorrect (Laemmli, U.K. (1970)
Nature 227, 600 through 700). The most common
errors are inversions (680 to 608) or the replacement
of a number with one of similar shape (680 to 630) or
value (680 to 681). Note that the number of correct
citations (estimated at 2 x 10^5) exceeds the
capacity of the ISI database (2^16 = 65536 'cytes').
While many citation variants are unique, others are found hundreds of times (see ISI database and examples in Fig. 1A). Which, then, are the principles that govern the distribution of WCs? The incidence of a WC can be explained by the likelihood that a certain character is mixed up with another character. For example, the shape of the number 8 is more similar to a 3 than to a 2; hence these spontaneous events happen at different rates (Fig. 1B). Nonetheless, when searching for incorrect references of Laemmli's article on ISI, citations in which the page number deviates by one are much more common than those with a similar alteration of the year (more than 10-fold, Fig. 1A). In this case, WCs do not occur in a purely stochastic fashion: the year bears significance to the typing scientist, which increases their proofreading activity. Other WCs, on the other hand, are more frequent than one might expect from their unusual sequence (e.g. #10 in Fig. 1A, which has appeared 11 times since 1983). Since these are often found in publications that cite one another, it seems safe to assume that they represent inherited WCs (Fig. 1C).
Figure 1C. Tracing of WCs to an ancestor from 1983
(#10 from Fig. 1A, occurrences 1-9 are identified
by research location and year). Inherited WCs are
generally transmitted between overlapping groups of
scientists within the same institution (boxes) or with
shared research interests (dashed lines). Lineages
are easily identified in articles that cite a previous paper
containing the WC (black lines), although this may involve a
missing link that does not contain the WC itself (e.g. 4 to 7).
In summary, citation variants arise through a variety of mechanisms similar to those described by molecular genetics. They are heritable between scientists and offer exciting insights into the transfer of knowledge. The high incidence of wrong citations reflects the fact that the contained information is to a certain extent redundant and may thus tolerate many mutations. However, it is possible that in the future the number of wrong citations can be minimised by using reference software tools, provided that the database entries are correct in the first place.
Christian G. Specht is a neurobiologist working on learning & memory and currently based at the ENS in Paris.
Editor's note (October 20): This article generated some online discussion, prompting a response from the author (http://www.the-scientist.com/news/display/57698/).
Related stories:
* Online access = more citations (http://www.the-scientist.com/blog/display/55437/) [19th February 2009]
* More articles, fewer citations (http://www.the-scientist.com/blog/display/54839/) [18th July 2008]
* A new proposal for citation data (http://www.the-scientist.com/blog/display/54402/) [4th March 2008]