Citation Inadequacy Via Databanks

By | March 23, 1987

Any research paper that contained a reference list consisting only of the titles of the journals consulted and not their years of publication, volume and page numbers, and the names of the authors would surely be rejected out of hand by editor and referees alike. Right?

Not so. Increasing numbers of papers are being submitted with references in precisely this form, and they are being accepted without question. The authors of the papers, the editors of the journals and the referees all seem unaware that this is taking place. And the probable explanation for this is that the data being cited are contained not in the pages of learned journals, but in the files of certain scientific databanks. Authors are acknowledging the databanks, but not citing the workers who originally deposited the data. Unless this problem is resolved soon, those currently depositing the data may become reluctant to do so.

It could be argued that scientists funded by the public purse should make their findings publicly available and should not expect any reward beyond that of knowing that they are serving science. Unfortunately, science is not structured to encourage this altruistic attitude. A convoluted mechanism has evolved whereby published research findings can be transformed into citations in reference lists of subsequent papers and thence, through peer review and funding bodies, into grant renewals to support the generation of further findings. In this context, publication has a value not only to receivers of the information, but also to its providers, and it is this value to the providers which current practices in the use of some databanks fail to recognize.

It can be assumed that no scientist sets out deliberately to ignore the work of colleagues, so why has this practice developed?

The Role of Databases

Scientific databases sometimes serve merely as adjuncts to the traditional scientific media: depositories for information, such as seismic data from geophysical surveys, three-dimensional structures of organic chemicals or nucleotide sequences of DNA, that could not usefully be communicated through the printed page. Much of the data is complex and costly to produce. Databanks contain both data and bibliographic files, and their compilers intend that the original work be cited in any subsequent publication. Unfortunately, this does not always happen, perhaps because the two parts of a file are easily separated so that software can handle the data.

Furthermore, data are often passed by the original recipient to colleagues who have little idea of (and may care less about) the origin of the data. As laboratory microcomputers become more powerful and software more sophisticated, computerized data will become useful to a growing number of scientists. The chances for unwitting citation obliteration will be increased accordingly.

Another complicating factor is that the databanks are often run on a not-for-profit basis and depend on public funds. To try to ensure that its grant is renewed, the Protein Data Bank at Brookhaven National Laboratory in Upton, N.Y., for instance, asks all users of its services to reference the paper of Bernstein et al. in the Journal of Molecular Biology (vol. 112, pp. 535-542, 1977), which describes the databank. This is justifiable, as it allows the utility of the databank to be documented, but citing that paper alone does not, as some researchers evidently believe, suffice as a reference for all the information the databank contains.

The solution ultimately lies in the hands of individual researchers who must ensure that they properly acknowledge the work of their colleagues. Initially, however, the databank administrators must take the lead. Even at the risk of causing offense to those applying for data, administrators must make clear that full citation is essential in all publications using the data, including those produced by colleagues of the applicant. The system could be policed if the databanks stipulated that they should receive reprints of all papers that draw on the data.

As electronic data collection and processing become more common, and as the learned journals become increasingly reluctant to fill their pages with repetitive data, it is likely that databanks will play an ever-larger part in the dissemination of scientific information. But their administrators and users must ensure, as the learned journals have traditionally done, that credit is given where credit is due.

Hodgson is editor of Trends in Biotechnology, 68 Hills Rd., Cambridge CB2 1LA, UK.
