Obsolescence in the CS literature

Sep 07 2010 Published under bibliometrics, scholarly communication

John Dupuis has pointed to a series of articles in various ACM venues about research in CS (computer science) and about how its literature is structured. Conference papers are sometimes more prestigious than journal articles, there’s been a proliferation of conferences, journals are too slow, conferences have too many papers from industry (I certainly do not agree), and so forth.

In the current Communications of the ACM, there’s another take on publications in CS, and one that might interest librarians.

Sjoeberg, D.I.K. (2010) Confronting the myth of rapid obsolescence in computing research. Communications of the ACM 53(9), 62-67. DOI: 10.1145/1810891.1810911

This is really one of a category of bibliometric articles: ones that study obsolescence. They ask how far back researchers go when citing, the idea being that articles older than that are getting dated and are less useful. The Journal Citation Reports (JCR) from Thomson Reuters reports the cited half-life of journals: how far back you have to go to get 50% of the references from that year of the journal. Math has a notoriously long half-life (around 10 years) and areas of the life sciences have notoriously short ones (immunology is under 6 years). If you’ll recall, I briefly mentioned a study by Nan Butkovich* that looked at essentially the same thing, but at a local level.
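To make the half-life idea concrete, here is a minimal Python sketch. It is not the JCR's actual calculation (which, as far as I know, interpolates to a fraction of a year) and it is not the method from the article. The function name `half_life` and the list of reference years are made up for illustration. It simply finds the smallest age, in whole years, that covers at least half of the references.

```python
from collections import Counter


def half_life(citing_year, cited_years):
    """Smallest age in whole years that covers at least half of the references.

    Hypothetical helper for illustration only; a whole-year approximation,
    not the JCR's interpolated figure.
    """
    if not cited_years:
        raise ValueError("need at least one reference")
    # Count how many references fall at each age (citing year minus cited year).
    ages = Counter(citing_year - year for year in cited_years)
    total = sum(ages.values())
    running = 0
    # Walk from the newest references to the oldest, accumulating counts.
    for age in sorted(ages):
        running += ages[age]
        if 2 * running >= total:
            return age


# Made-up publication years for the references in one 2010 volume.
refs = [2009, 2008, 2008, 2006, 2005, 2003, 2001, 1998, 1995, 1987]
print(half_life(2010, refs))  # prints 5
```

In this invented example, half of the ten references are five years old or newer, so the half-life comes out at 5 years. A field that leans on older literature would push that number up.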

Anyhow, the author used the JCR but also the ACM Digital Library and the citations extracted there. That data is not perfect, but he got publication years for all but 2.7% of them. The values varied quite a bit by subdiscipline, with the theoretical ones higher, but all were in the middle of the road compared to other fields: a slightly shorter half-life than engineering (which also does conferences, but only the journal articles were looked at) and a slightly longer half-life than physics.

The article goes into a lot more detail. I recommend you take a peek!

* Hmm, I wonder whether how much various institutions differ from the average would vary only with subdiscipline, or whether it would be some indicator of the quality of access to older literature?

7 responses so far

  • lylebot says:

    Interesting, and not what I would have expected.

    May I ask why you don't agree that there are "too many papers from industry"? I am not sure where I stand on that, but I do think there are some fields in which industry papers with irreproducible results (due to data being proprietary) have become very common, and I am certainly willing to entertain the idea that they are /too/ common.

    • Christina Pikas says:

      In Information Retrieval - the area of CS I follow most closely - a lot of the best research is done in industry. I'm sure it depends on the area.

      • lylebot says:

        Information retrieval is actually the area I was thinking of. Those of us in academia can't replicate research based on thousands of queries and millions of user clicks. Why should we accept those results independently of the possibility that they've been reviewed by someone else in industry that finds them plausible? Also consider that papers from industry often have renormalized measurements in order to hide certain information (I know this from my own stints in industry).

  • Charles Early says:

    This reminds me of another question I have about the CS literature. When I was at Stanford back in the 1980s, technical reports seemed to be the primary medium of communication in computer science. They functioned like preprints in high energy physics, with all of the major research institutions (including industry) on one another's distribution lists. (Actually some researchers said that by the time a tech report came out it was already old news and archival, since they already knew about it through the invisible college, but that probably only works if you're part of the inner circle.) A few months ago I was working on a guide to CS/IT information resources, and I discovered that CS technical reports had all but vanished sometime while my back was turned. NCSTRL, which was sort of an arXiv for CS, seems to have gone out of business several years ago without anybody noticing. (There's a historical collection at http://historical.ncstrl.org/, which seems to be down at the moment.) arXiv has a CS section now, but I don't know how big a role it plays. I got the impression talking to one of our CS people just now that the field has become more fragmented and there's nothing that has really taken the place of tech reports. I'd be interested in thoughts from people who are closer to CS than I am.

    • Christina Pikas says:

      It does seem like technical reports have all but disappeared from CS. What technical reports and preprints are out there seem to be on individual websites, aggregated (if at all) by citeseer. It doesn't seem like there is really any effort to centralize these like in Economics. Maybe some other CS librarians could chime in.