Archive for the '[Information&Communication]' category

I wouldn't have caught the JSTOR issues

Aug 31 2010 Published by under [Information&Communication], publishing

It's an ongoing theme around here that our vendors need to test their products more and take input from librarians and end users. An update JSTOR made very recently is an example of why they need to ask a diverse set of users.

My biases: I come from a research institution with a large collections budget and I feel very strongly that users should start with research databases that are indexed and topical for subject based searches, not start with digital libraries. Digital libraries house the full text - now once you're there if you get recommendations or what not, fine.

So JSTOR - and they are very, very quickly fixing this - made it so a search done on their site would bring back results not necessarily available or accessible to the user. In other words, it doesn't default to showing only subscribed items. Now, if this were a research database, it would pop up an OpenURL resolver link so you could look for an owned/licensed copy. JSTOR didn't offer that either, so users at smaller institutions (or ones that don't license all of the collections) learned of interesting articles, but then were offered an opportunity to purchase them with a credit card, not assistance in finding a copy their library had already purchased. Meredith Farkas describes this.
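
For readers who haven't seen one, here is a rough sketch of what a resolver link looks like under the hood. It assumes the OpenURL 1.0 key/encoded-value format for journal articles; the resolver address and the `build_openurl` helper are made up for illustration, since every library has its own base URL. The citation used is reference [1] from a later post on this page.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; each library runs its own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(atitle, jtitle, volume, issue, spage, year, doi=None):
    """Return an OpenURL 1.0 (key/encoded-value) link for a journal article."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.atitle", atitle),
        ("rft.jtitle", jtitle),
        ("rft.volume", volume),
        ("rft.issue", issue),
        ("rft.spage", spage),
        ("rft.date", year),
    ]
    if doi:
        # DOIs travel as an identifier in the info:doi/ namespace.
        params.append(("rft_id", "info:doi/" + doi))
    return RESOLVER_BASE + "?" + urlencode(params)


link = build_openurl(
    atitle="1966 and all that-when is a literature search done?",
    jtitle="Lancet",
    volume="358",
    issue="9282",
    spage="646",
    year="2001",
    doi="10.1016/S0140-6736(01)05826-3",
)
print(link)
```

The point is that the link carries only the citation; the library's resolver decides where an owned or licensed copy lives, which is exactly the help the users described above didn't get.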

If they had asked me? Neither of these things would have come up. First, we own most things that come back in the search (particularly in STEM fields). Second, I wouldn't search on JSTOR anyway; I search in Inspec, Compendex, Web of Science, Aerospace & High Tech, MathSciNet, etc., and then I link out using our OpenURL resolver to get to JSTOR. From time to time for my own interests, I'll browse the TOC there, but that's maybe once every few months.

So it's not just getting a librarian to give you feedback, it's getting a diverse group of librarians to give you feedback.

Oh, and the problems Wiley caused themselves?  Well... apparently they didn't remember when they fixed this exact issue before. So you have to listen, respond appropriately, and then remember not to un-do!


Citation Practices - see DrugMonkey

DrugMonkey has a fascinating post on citation practices. It's fascinating because all of the comments have come up in some research article or other that I've read, but many of the commenters don't believe the other commenters and many readers may be shocked/surprised at the whole thing.

You see, we talk a good game about the Mertonian norms, but what we might do in practice is sometimes quite different. Anyway, the thread is worth a look. Here are some links to things I've written on citing practices in science:


Print collections in math

Ever heard the library is the mathematician’s laboratory? (cited many places including here, pdf) Mathematicians do use the library and they do use older literature more than some fields. Specialized math librarians often work quite closely with the researchers to develop the collection. They are also a dying breed, as branch libraries are being closed to save money and math collections are being migrated to big general science libraries. Also in most research collections, there’s a huge push to go electronic only and to move the print collections off site (or to weed them) to provide more space for group work and studying.

So how do you balance the needs of this special group of users with the push from administration?  I actually don’t know*, but there has been a fascinating thread on the mailing list of the Physics-Astronomy-Math division of SLA.

It started with Debra asking if anyone had committed to maintaining a set number of linear feet of math collection. Here are some points pulled from the answers:

  • no way- we’re trying to go online all the way!
  • younger math researchers are actually ok with electronic access, and the things we have off site we’ll scan for them and deliver, so it’s actually quite convenient
  • math needs more monographs than other fields and it’s very common to chain using citations so a big browsing collection is important
  • one institution doesn’t send any math offsite, but this was part of an agreement when the math branch library was closed
  • Nan from Penn State did a study so they could keep 90% of what their mathematicians cited. The first time she did the study she needed to keep 40 years and the second time she needed to keep 45 years of the collection on site. (there’s more to it, I’m looking forward to seeing her article whenever the next issue of Issues in Science and Technology Librarianship comes out.)
  • it’s only pure math that is particular about print – other math areas are not as concerned. There’s a lot of serendipity and a lot of browsing.
  • some departments want to approve what items are sent off or weeded on an item by item basis.
  • current library catalogs are not adequate replacement for browsing full shelves, so that’s one reason to keep the print on site.
  • don’t forget that information needs are cyclical, so if you set the “used in x years” window too short, you’ll have big problems. Also don’t forget the grad students and outliers.
  • keep early, classic textbooks that have good explanations
  • if the only equipment the mathematicians get is pencil and paper, give them some slack for wanting books!
  • no one reads math on the computer, they might want it online, but then they print to read
  • requesting something from another location or offsite adds a delay and slows the whole process
  • if things really aren’t being used when they are close by, then they won’t be missed off site!
  • some users are fine with electronic access, it really might depend on your users!
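
The "keep 90% of what they cited" idea from the list above can be sketched in a few lines. This is only an illustration of the arithmetic, not the method used in the Penn State study; the function name and the sample ages are invented.

```python
import math


def years_to_keep(citation_ages, coverage=0.90):
    """Smallest age (in years) such that `coverage` of citations are that age or younger."""
    if not citation_ages:
        raise ValueError("need at least one citation age")
    ages = sorted(citation_ages)
    # Position of the citation that brings us up to the requested share.
    needed = math.ceil(coverage * len(ages))
    return ages[needed - 1]


# Made-up ages (citing year minus publication year), for illustration only.
sample_ages = [1, 2, 2, 3, 5, 8, 12, 20, 35, 60]
print(years_to_keep(sample_ages))
print(years_to_keep(sample_ages, 0.5))
```

With real data the ages would come from the reference lists of local authors' papers, and a long tail of old citations is what pushes the on-site window out to decades.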

I wonder if checkouts are ever a good metric since a lot of this stuff might be used within the library, some photocopies made, and then returned to the shelf. We had a very helpful mathematician who always just looked stuff up standing in the reference section. Consequently, no circulation and no proof of usage!**

These math librarians are great mentors with lots of awesome advice. I highly recommend this list for any librarians in physics, astro, math or cs.

Update: Nan's article is out (open access)
Butkovich, N.J. (2010). How Much Space Does a Library Need? Justifying Collections Space in an Electronic Age. Issues in Science and Technology Librarianship, 62.

* Our mathematicians are not into “pure math” – but applied and statistics. They often publish in SIAM, IEEE, and public health publications and do use online tools.

** No, I don’t blame him for us losing our entire print collection… always a sore point.


Let’s Help ScienceBlogging: What design features are useful in a science blog aggregator?

First, the great news: Bora, Dave, and Anton got together and developed a website to aggregate science blog postings. It’s at . This is really still at its first stages and they plan to continue to add to it and refine it as they go.

Here’s a screenshot (I’m guessing the page will look different over time so this way you can see it as I see it).

It's got three columns, and the top five stories from each source. The title links to the source as do the story titles. There’s also a blogroll – an alphabetically arranged list of the sources.

The sources are a combination of blog networks like this one, Discover’s, Nature’s, etc., and some news feeds. Some of the sources are in other languages (Brazilian Portuguese, German, Chinese, French).

It’s clear from the design (and the delighted reactions) that this is meant as a place to go to read a diverse collection of science posts – to get a sampling. It doesn’t link to independent blogs – except when they are aggregated by “All-geo”. It also doesn’t have any way to export the contents or really explore the contents besides browsing the titles on the front page. If you mouse over the article titles you do get a snippet of information.

What features could help the current setup?

  • some way to expand and read a snippet without mousing over. People with twitchy hands might not do well with that
  • some indication of the blog name where the article comes from
  • separately, a page providing information about each source – I know some of these, but I’m assuming a lot of people don’t
  • an opml file or some way to export the rss feeds to your reader (you could, of course, visit the original site or just keep coming back)
  • I’m not sure what order things are on the page. Maybe they should be in some categories? Some explicit organization? (Blake makes that comment here)
  • Blake also makes the comment that these various aggregators have different rates, so 5 posts might stay there for a while or there might be 5 posts an hour – it’s hard to see how to deal with that.
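
The OPML suggestion in the list above is cheap to implement. Here is a minimal sketch, assuming the site keeps a simple list of source names and feed addresses; the `build_opml` helper and the feed URLs are placeholders, not the aggregator's actual sources.

```python
import xml.etree.ElementTree as ET


def build_opml(title, feeds):
    """Return an OPML 2.0 document (as a string) listing the given feeds."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, xml_url in feeds:
        # One <outline> per feed; feed readers look for type="rss" and xmlUrl.
        ET.SubElement(body, "outline", type="rss", text=name, xmlUrl=xml_url)
    return ET.tostring(opml, encoding="unicode")


# Placeholder sources; the real feed addresses are not given in this post.
feeds = [
    ("Example network A", "https://a.example.org/feed.xml"),
    ("Example network B", "https://b.example.org/feed.xml"),
]
print(build_opml("Science blog sources", feeds))
```

Most feed readers can import a file like this in one step, so a visitor could subscribe to every source at once instead of visiting each site.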

Could independent blogs be added, and how?

This post from Dave puts forth some ideas for adding “science blogs”. The first problem is defining what’s a science blog. I faced this in both of my previous studies, and I solved it two different ways. In one I was very strict: self-identified scientists posting mostly on scientific topics. In the other I was broader – the above plus scientists posting on life in science, plus everyone else blogging about science.

What no one mentions on that post is: what is science? Are social sciences included? Librarianship? Areas of the humanities like anthropology, archaeology, communication, history? It’s really hard. Science librarians yes, others no? Well, then we’d lose Dorothea. So academic librarians? Then I’d drop off 🙂

First, selection and maintenance

  • Nature Blogs takes nominations and then requires two members to confirm. They require:
      1. composed mostly of original material - no press releases or lists of links
      2. primarily concerned with scientific research
      3. updated (on average) at least once a fortnight
  • Other suggestions – like from Jonathan Eisen on twitter – were to take nominations and have a curator say yes or no. This could be way, way too overwhelming and there could easily be hurt feelings if someone didn’t get included and they thought they should.
  • A variation on that is to have one or a few committees. Maybe for each subject area.
  • Maintenance is also an issue – keep dead blogs? Use an automated link checker? Manually go back and check if the person is still blogging and still blogging about science? How often? Have a way for visitors to report. (Oh and for heaven’s sake, Nature won’t let me change my url from blogspot – let the bloggers update their urls).
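
Part of the maintenance question could be automated. The sketch below flags blogs for a human to look at, using the "at least once a fortnight on average" rule from the list above. It assumes the post dates are already available from each blog's feed; the function names and the four-week grace period are guesses for illustration, not anything Nature Blogs actually does.

```python
from datetime import date, timedelta

FORTNIGHT = timedelta(days=14)


def average_gap(post_dates):
    """Mean time between consecutive posts; None if fewer than two posts."""
    ordered = sorted(post_dates)
    if len(ordered) < 2:
        return None
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def needs_review(post_dates, today, max_gap=FORTNIGHT):
    """Flag a blog for a human look if it posts too rarely or has gone quiet."""
    if not post_dates:
        return True
    gap = average_gap(post_dates)
    # Grace period of two fortnights is an arbitrary choice for this sketch.
    gone_quiet = today - max(post_dates) > 2 * max_gap
    too_sparse = gap is not None and gap > max_gap
    return gone_quiet or too_sparse


active = [date(2010, 8, 1), date(2010, 8, 8), date(2010, 8, 15)]
dormant = [date(2010, 1, 1), date(2010, 1, 5)]
print(needs_review(active, today=date(2010, 8, 20)))
print(needs_review(dormant, today=date(2010, 8, 20)))
```

A check like this only produces a list of candidates; whether the person is still blogging about science is still a judgment call for a curator or committee.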

I sort of think the Nature way pretty much works. It’s crowd sourced, so less load. But the maintenance stuff needs to be added.

Second, organization

  • There needs to be some organization scheme. It might go deeper (with sub categories) in areas where there are a lot of bloggers
  • The organization scheme could have a couple of different facets (topical/subject – chemistry, gender, work setting – industry)
  • Should be able to look at an aggregation on each subject category, and export rss feeds from that category
  • Some of the others aggregate around what journal article or molecule is being discussed – this might be too hard and there might not be enough content to do that.
  • There could be some organization around links. See who links to this blog, see who has commented on this blog – but that would also take a lot of work.

Personally, I’m not so much interested in links to press releases and mainstream media – the bloggers pick up things like that that are interesting (I pick up some from the information industry). I’ve already spent way too long on this for incremental help to the founders – they have already done an amazing job. Maybe some information architect-y or user experience person might weigh in?


The danger of using only sources with recent coverage

… is well documented. Consider, for example, the tragic case of the JHU researcher who only searched Medline 1966 forward and so missed an association between the intervention in a study and lung toxicity that had been reported in earlier literature [1-2]. In biomedicine, there is a huge emphasis on recency – and for good reason, science moves fast. The cited half life (a measure of how far back citations generally go) is way shorter in medical fields than in say, math (which is always >10 years). The engineering databases that I use most frequently go back to 1898 and 1884. I also use Web of Science and Chem Abstracts which go back to the very early 1900s (~1908).
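
To make the cited half life idea concrete, here is a simplified sketch: it takes the median age of the references cited in a given year. The official journal-level figure is calculated with interpolation across citation years, so treat this as an approximation; the function name and both reference lists are invented for illustration.

```python
from statistics import median


def cited_half_life(cited_years, citing_year):
    """Median age of the references: half are newer than this, half are older."""
    ages = [citing_year - year for year in cited_years]
    return median(ages)


# Invented reference lists, purely for illustration.
medical_refs = [2009, 2008, 2008, 2007, 2006, 2005, 2003]
math_refs = [2008, 2001, 1995, 1990, 1984, 1972, 1955]
print(cited_half_life(medical_refs, 2010))
print(cited_half_life(math_refs, 2010))
```

A database that starts in 1966 covers nearly everything in the first list and misses a real share of the second, which is the danger in a nutshell.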

But anyhow, Biochembelle, re-tweeted by Scicurious, pointed to an editorial from Nature Reviews Microbiology [3] that says youngsters today aren’t getting the proper baseline literature because they’re relying on PubMed and Google Scholar. They cite the subject area of bacteriophage biology – developed well before the Medline era. Some researchers in this area have created their own bibliography of articles prior to PubMed, but they are concerned about losing access to the publications as they are moved out of the library to storage.

There are like a ton of things wrong with these statements. First, have they tried Biological Abstracts? As far as I can tell it goes back to at least 1917 (my parent institution has it stored because we have the online version, BIOSIS, and we have the backfile). Second, libraries typically don’t move journal runs off site unless they have the electronic equivalent or at least until they’ve shown that there’s very little if any usage. Many scholars wish more were moved off site – they get free scanning and electronic delivery on those articles instead of having to photocopy themselves! Libraries are also buying electronic backfiles – don’t assume that just because it’s old, we don’t have it online! In fact, some pre-1923 biology texts are freely available in the Biodiversity Heritage Library.

My points in a nutshell:

  • yes, it is very dangerous to rely on incomplete resources like GoogleScholar
  • yes, it is very dangerous to only use recent information
  • if you’re at a research institution, you don’t HAVE to rely on PubMed and GoogleScholar, you have access to other resources and it’s no one's fault but your own if you don’t ask your librarian what to use

[1] McLellan, F. (2001) 1966 and all that-when is a literature search done? Lancet 358(9282), 646. doi:10.1016/S0140-6736(01)05826-3

[2] Ramsay, S. (2001) Johns Hopkins takes responsibility for volunteer's death. Lancet 358(9277), 213. doi:10.1016/S0140-6736(01)05449-6

[3] Raiders of the lost articles. Nature Reviews Microbiology 8, 610. doi:10.1038/nrmicro2435


More questions about supplemental materials

Dorothea posted about this, too, and I posted earlier. Also an interesting comment from Claudia on friendfeed. DrugMonkey's comment on my post and my re-reading of the editorial (readability helps and it appears to be freely available) bring up more questions than they answer. Specifically, I'm thinking that the disciplinary differences in what supplemental materials contain and how they're treated might be important.

Here are some questions:

  • What is in the supplemental material? Just data?  More calculations or derivations of equations? Multimedia (which will actually be moved into the text for the Journal of Neuroscience - a pdf with a video in it, security holes, anyone? preservation concerns anyone? maybe)
  • To what extent are these materials peer reviewed?
  • If they are peer reviewed, are the reviewers given separate criteria or are they to use the criteria set out for the text?
  • According to the comments and the editorial, reviewers required supplemental material (and additions thereto). Is that right? Typical? Good?

It seems like I've been considering the problem as if we were talking about data tables or calculations/derivations, and that these things weren't reviewed the same way.

Oh and other random points occurring to me now:

  • if you get the article via interlibrary loan, you don't get the supplemental materials, right?
  • if you get the article via aggregator, you don't get the supplemental materials, right?
  • what about videos - they won't come through ILLiad - will the journal allow for some way to lend them? Will they come through in aggregators?
  • as Dorothea mentions, will this encourage authors to archive in their local repositories or to just chuck the files under the desk and lose them? (the editorial hopes for more disciplinary repositories - so that could be a net win)


Supplemental materials or no?

I was surprised when I read this DrugMonkey post on J Neuroscience's ending supplemental materials. In fields without significant open data repositories with required deposit prior to publication, supplemental materials may be the only way to get the data to check the work or to build upon it (authors aren't very good at replying to requests to share data - studies show).

I really don't know anything about neuroscience. Is this field different, or is this coming in other fields? I know that astro and optics journals have been expanding their ability to take supplemental materials such that they are preserved and accessible. Here are some of the reasons found in DrugMonkey's post:

  • they were representing the data as peer reviewed, but it isn't reviewed to the extent the text is and what does peer review of data mean anyhow
  • there's an arms race among authors and reviewers to throw in everything but the kitchen sink proactively to not be criticized and to request that more data and more experiments be included in the supplement
  • the text should stand on its own merits

Anyway, I hadn't heard this view and I didn't know this is the way it was working in this field. I kind of thought the paper was reviewed on its own merits and the supplemental data was like a bonus track added later. Once again, I'm thinking of astro and optics journals. So is this view common or does it work differently in different fields? (Is there a paper on the view of supplemental data in diverse areas of science?)


Delivering the results of a literature search

My primary job at work is to be the point person for in-depth literature searching. I don’t do all of it, but for science and technology needs I get first dibs and then I share out work that can be better done by another librarian (bio goes to A.C. who has a bio degree) or if I am too busy. In-depth literature searching is typically anywhere from 4-40 hours worth of work, pulling information from external information resources and arranging it so that it’s useful for the end user. Once I do a reference interview and then go back and forth to make sure I understand what’s needed, I do the searching, I analyze the results, and then I deliver the results of the search. This often results in an in-person meeting, but there’s always some text aspect.

I most often deliver my results in a Word document. I start by encapsulating what they requested and providing a brief summary of the most salient points. In this summary I also mention if there are promising areas that turned up in the search or if there was a notable lack of information in an area. I then have a clickable list of headings which jump you down to citations that fit each heading. Sometimes there will be a discussion under a heading describing what’s going on in this part of the literature or anything interesting. I deliver the citations with an abstract and sometimes I’ll highlight things in the abstract. Recently, I’ve been including a section all the way at the bottom with search methods and resources used. A couple of times my work has been turned over to an external sponsor who was surprised/impressed that I found so much and has demanded to know how I did it. I track this stuff anyway (the scientist in me) so now I’m adding it more proactively to the report. My boss is big on branding so I might go back to putting a logo at the top, but I’ve been leaving that off recently.

Other times I’ve added things to a wiki or SharePoint site, I’ve delivered a database of citations, I’ve created a spreadsheet of data, and I’ve delivered a kml file to be used in Google Earth. Sometimes I’ll just deliver the results in an e-mail, it just depends.

Mary Ellen Bates talks at conferences about how to best package the results of your search. I highly recommend attending one of these sessions. I’m pretty sure she’s written this up, too, so check it out.

So what’s brought this up now is a ResearchBlogging overview of an article [*] on delivering results using 2.0 technologies.  I can’t cover the article better than Jacqueline does, so I’ll refer you to her blog post. I’ll offer here just some general thoughts.

  • access to the full text of identified articles – I’ve used RefWorks’ OpenURL output format to allow recipients to locate full text using our OpenURL resolver, I’ve attached particularly relevant articles to the e-mail, I’ve provided direct links… but what happens most often is I’ll get a highlighted report back or a set of item numbers back and I’ll e-mail the pdfs.
  • the authors had problems with e-mails getting lost – I don’t know that that has happened, but sometimes my report won’t be viewed for a couple of weeks, and then I’ll hear back about it
  • they ruled out RefWorks because it required two sets of logins/passwords – hmm, why not RefWorks with RefShare? Why two sets of passwords?
  • SharePoint wikis suck. I would probably use some other type of web part – even a discussion board entry for each article.
  • they really didn’t use the 2.0 aspects of the 2.0 tools – particularly in the case of the wiki. The most valued aspects were access without a lot of logins and then access to the full text without a lot of clicks.

I would be interested in hearing other approaches – particularly using newer tools.

[*] Damani S, & Fulton S (2010). Collaborating and delivering literature search results to clinical teams using web 2.0 tools. Medical reference services quarterly, 29 (3), 207-17 PMID: 20677061


Rundown of the new interfaces this summer

Aug 06 2010 Published by under [Information&Communication], libraries

I've been a librarian for a little bit, and I can't remember a time when so many interfaces changed in such a short period of time. I really feel for the academic librarians who have to update all of their training materials. I'm going to run down some here, and then add to it as I hear of more. Some of these are major (RefWorks) and others are more cosmetic (ChemNetBase).

Already done

  • PubMed - but that was a bit ago
  • CRCnetBase - what a kerfuffle, that was this spring but ChemNetBase was just this past week
  • IEEE Xplore
  • AccessEngineering
  • Embase
  • EbscoHost - this just happened today for my place of work
  • Royal Society of Chemistry journals
  • Sage journals (they were moving a few at a time, not sure if this is complete)
  • Economist Intelligence Unit (EIU)
  • added Human Kinetics (journal pages)
  • moved from coming Safari ebooks (they hope the "vast majority" of books will still be there after the re-org, uh-oh!)
  • moved from coming Books 24x7 (basically the same, new colors)
  • moved from coming SpringerLink
  • moved from coming Wiley Interscience > Wiley Online Library

Coming

  • Lexis Nexis Academic (cough - lipstick on a pig - cough), due any time now
  • Science Direct & Scopus > SciVerse, due August 28
  • RefWorks > RefWorks 2.0, due Fall 2010
  • EngineeringVillage (adding citing information to Compendex and Inspec from Scopus)
  • moved from future Faculty of 1000 > combined bio, medicine, & The Scientist, due October 1 (+/- 2 days)

Announced for the future

  • Web of Science, due early 2011
  • ProQuest, CSA Illumina > new ProQuest platform (this is a big, big deal)
  • ACM has a beta of their abstract page - not sure when this is coming

What am I missing?

Updated yet again 8/24 - totally missed JSTOR, but Meredith Farkas sheds a little light there.


Disciplined tagging or how Stack Overflow plans to control their vocabulary

Carol H tweeted this blog post today from Stack Overflow, the wildly popular question and answer site for IT, CS, software dev, etc. Essentially, if you get stuck, you submit a question and you provide subject tags to help people find it. Answering questions gets you reputation points.

A collection of user-generated tags becomes a “folksonomy” (to use a worn out term), but typically in social software sites, the choice of the tag is completely up to the user so you get multiple versions of the same term (US, United States, USA, usa, U.S.A., etc), you have meta terms (to-do, to-read), and sometimes some unpleasant stuff. LIS researchers in information organization have done a ton of papers on these things and people who do taxonomies for a living sometimes use them to help determine “preferred” terms.

So according to this blog post, SO seeds new sites with a few sample terms, and they started by letting everyone add new terms. Then they allowed moderators to merge terms. Then they required higher and higher reputation scores to be able to add new terms. But the terms were getting out of control. So this is cool: they now have wiki scope notes and synonyms for terms.
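
A tag synonym list is easy to picture in code. This is a toy sketch of the general idea, not Stack Overflow's implementation; the table, the preferred term, and both function names are made up, using the country-name variants from the paragraph above.

```python
# Hypothetical synonym table: variant -> preferred tag ("USE" in thesaurus terms).
SYNONYMS = {
    "us": "united-states",
    "usa": "united-states",
    "u.s.a.": "united-states",
    "united states": "united-states",
}


def normalize_tag(raw):
    """Map a user-entered tag to its preferred form, if one is defined."""
    # Lowercase and collapse whitespace so "USA" and " usa " look the same.
    key = " ".join(raw.strip().lower().split())
    return SYNONYMS.get(key, key.replace(" ", "-"))


def used_for(preferred):
    """List the variants that point to a preferred tag ("UF" in thesaurus terms)."""
    return sorted(v for v, p in SYNONYMS.items() if p == preferred)


print(normalize_tag("USA"))
print(normalize_tag("U.S.A."))
print(used_for("united-states"))
```

Anything not in the table passes through unchanged apart from tidying, which is where moderators or high-reputation users would step in to merge a new variant.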

My CS colleague from work (hi Jack!) gives me a hard time – generically as a librarian – that I think all of the vocabularies should be determined in advance and human assigned, etc. He thinks these things should be emergent and machine assigned where possible. Obviously neither of us entirely subscribes to either of these views. If you have the luxury of the funding and time to have a good controlled vocabulary and human machine-aided indexing, your information system will be easier and better to search (better recall, better precision, more user satisfaction). However, it’s hardly ever the case that you have all of these things, and even if you do, user suggested terms are important to add to and maintain your CV.

Ok, one of my ongoing jokes is how CS keeps reinventing LIS (well indeed they’ve taken over the term “information science” in some places) – so now Stack Overflow has reinvented taxonomy (not quite a thesaurus though, right, because no BT or NT just UF and U, lol)

Edit 8/7: Promoting this from the comments. Joe Hourclé tells us that they've addressed some of the issues discussed here (I doubt they read this though 🙂 ) see:

