An ephemeral platform, used for other than ephemeral, and the death of Storify

(by Christina Pikas) Dec 15 2017

As I say in my dissertation and elsewhere, informal scholarly communication in social media is both ephemeral and archival. Maybe this is new because some online traces intended to be for a limited number of recipients for immediate use have longer life and wide reach. Some utterances in social media live on well after the originator intended (for good and bad). But maybe it's not entirely new as certainly letters among scientists have been preserved (some of these were no doubt sent specifically for preservation purposes).

I've long been a fan of blogs for personal knowledge management, that is, thinking through readings, partial results, tutorials for how to do things. Blogs are easily searched, archived, migrated, shared, and don't enforce an artificial or proprietary structure found in other tools. However, I also know that long-term bloggers who have established a readership through careful, well-edited posts impose new barriers on themselves for using their blogs for this purpose. I found in my studies that some superstar bloggers almost entirely stopped blogging because they didn't want to post anything incomplete or partial and there were too many other things to do.

I think this has been one of the motivating factors for the use of Twitter for long threads of stories and analysis. Twitter has great reach and immediacy, and interactivity... but at the expense of search (although it is certainly better than it was) and preservation. Who of us hasn't dug through our likes and RTs to try to find something interesting we saw ages ago?

We're using a platform specifically built for ephemeral communication for communication that should be saved and preserved.

So individuals who value this knowledge management function, or who appreciate careful analysis or good storytelling serialized over tens of tweets, have adopted Storify to gather and order and preserve and contextualize the pieces. Storify added tools to make it a bit easier. Instead of Storify, you could embed individual tweets (this embedding function also calls back to Twitter so really doesn't preserve). You could <eek> screenshot. You could even just write it up and quote the text.

And Storify is going away this Spring. We do have notice, luckily, but we still have a problem. We need to back our stuff up - we need to back other people's stuff up. Not everything is of the same value to the originator as it is to someone else.

My plea - and it will go unheard - is to put things back into blogs which you then tweet. Or back your useful tweets up to a blog?
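For what it's worth, backing a thread up as plain quoted text doesn't take much. Here's a minimal sketch, assuming you've already gotten the tweets out somehow into a list of dicts. The field names (`author`, `date`, `text`, `url`), the handle, and the example.org links are all made up for illustration; none of this is Twitter's API. It writes each tweet as a static blockquote with attribution, so the blog post doesn't depend on an embed calling back to Twitter.

```python
from html import escape


def thread_to_html(tweets):
    """Render a list of tweet dicts as static HTML blockquotes, oldest first.

    Each dict is assumed (hypothetically) to have the keys
    'author', 'date' (ISO 8601 string), 'text', and 'url'.
    """
    parts = []
    # ISO 8601 date strings sort correctly as plain strings.
    for t in sorted(tweets, key=lambda t: t["date"]):
        parts.append(
            "<blockquote>\n"
            f"<p>{escape(t['text'])}</p>\n"
            f"<cite>{escape(t['author'])}, {escape(t['date'])}, "
            f"<a href=\"{escape(t['url'])}\">original</a></cite>\n"
            "</blockquote>"
        )
    return "\n".join(parts)


# Hypothetical two-tweet thread, deliberately out of order.
example = [
    {"author": "@example_user", "date": "2017-12-15T10:02:00",
     "text": "2/ second point & more", "url": "https://example.org/status/2"},
    {"author": "@example_user", "date": "2017-12-15T10:00:00",
     "text": "1/ first point", "url": "https://example.org/status/1"},
]

print(thread_to_html(example))
```

The text itself is what gets saved here; the link back to the original is a courtesy and can rot without taking the content with it.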

FWIW, I'm trying to capture AGU meeting tweets and I'll load them into FigShare ... but the odds of some researcher capturing and saving your stuff are actually quite slim.

This post was inspired by a tweet that has a thread and interesting points by her interlocutors :



More evidence for the discovery layer as pile of crap metaphor

(by Christina Pikas) Dec 04 2017

this Cambridge University Report (pdf) via Aaron Tay

page 16:

The key insight was the expectation from users that the simple iDiscover search function would automatically return a list of results as sophisticated and relevant as they would expect from other, more powerful search platforms. This led to frustration when, for example, a search for a journal title returned a number of articles and other results before the link to the journal holdings and links to online access. At this point, when asked what they would do next, many of our participants answered by saying that they would start using another search tool.


Some of the problems were a mismatch with the user's perception of the tool (as a catalog):

page 18

“Book reviews above books just don’t make sense!” (Archaeology PhD student)
“When looking for a book, you’ll end up with a random science article.” (English undergraduate student)
“If you search for a title that only has a few words in it, even if you type it in correctly, other less relevant titles will come up first.” (Education MEd student)


page 22

When asked what was most important to them in terms of platforms used to search for information resources, the words ‘relevance’ and ‘relevant’ were used by a large number of our participants. This was directly linked to a desire for seamless, efficient searches which yielded appropriate and useful results, without the need to use pre- or post-search options to limit or refine them. People were often frustrated at the lack of percieved [sic] relevancy in the initial results list, after having used the main iDiscover search function

[lol, we had a vendor here to help us get our enterprise search going many moons ago... they said "relevance is dead!" I was like "nope!"]


No, vendor, we don't want a pile of crap actually

(by Christina Pikas) Dec 02 2017

Large Copper Dung Beetle (Kheper nigroaeneus) on top of its dung ball

Yes, I have posted about this a number of times, and no, this will probably not be too different. Our vendors have swept up the little competition and then redone their boutique databases to make them - generally - work like piles of crap.

So there are two massive 3rd party aggregators that sell massive piles of crap. Don't get me wrong, these are super attractive to libraries who can then say: look at all these titles we cover! Look at how much content we have! The problem is that with our current state of information abundance, with lots of big package deals, with more and more open access, and with informal scholarly sharing < cough >, getting the full text of recent articles from big name journals really isn't a thing.

The thing is efficient, precise, thorough, appropriate information at the right time and place. I say: I need exactly information on this thing! The aggregators go: here's a massive pile of crap!  I'm like, well I don't need a pile of crap, I need exactly this thing. System returns: here's another pile of crap!

Look at the Aerospace database, for example. Used to be the only real database that covered hypersonics and was thorough at all at covering AIAA and NASA technical reports. It was CSA when I got to know it. Compendex, in comparison, is just adding AIAA stuff this year and isn't going back to the 60s. CSA databases got sold to ProQuest. I have no idea what the hell they've done with it because every time I do a search I end up with trade pubs and press releases - even when I go through the facets to try to get rid of them.

CSA used to have a computer science database, too. The current computer collection in ProQuest doesn't even allow affiliation searching. Also, a search I did there yesterday - for a fairly large topic - didn't return *any* conference papers. For CS. Really.

This is not to pick on PQ, ok maybe it is, but their competitors really aren't any better.


At the same time, we keep having people tell us at my larger organization, that we *must* get/have a discovery layer. Let me just tell you again, that we did a lot of testing, and they did not provide us *any* value over the no additional cost search of a 3rd party aggregator. They are super expensive, and really just give you - guess what - all your stuff in a huge pile of crap. I hear nothing but complaints from my colleagues who have to deal with these. The supposition was that we wanted a Google interface. Ok, maybe a sensible quick search is fine, but that only works when you, like Google, have extremely sophisticated information retrieval engines under the hood. Saying - hey we cover the same journals as your fancy well-indexed database but without the pesky indexing and also lumped together with things like newspapers, press releases, and trade pubs... is not really effective. It's a pile of crap.

You may say, "But think of the children!" The poor freshman dears who can't search to save their lives and who just need 3-5 random articles after they've already written their paper just to fill in their bibliography due in the morning....

Is that really who and what we're supporting? Should we rather train them in scholarly research and how to get the best information? And anyway, for my larger institution, we hardly have any freshmen at all.

No, vendors, we do not want a large pile of crap, but thanks for offering!


Welcoming Confessions of a Science Librarian to Scientopia!

(by Christina Pikas) Nov 28 2017

I'm pleased to point to John's new home here: 

His first post rounding up best science books is live already. We'll get him linked from the home page - but check it out!


Providing real, useful intellectual access to reference materials from current library pages

(by Christina Pikas) Nov 13 2017

Those of us who study or teach about scientific information have this model of how it goes:

(this image is lifted from my dissertation, fwiw, and it more or less reproduces Garvey & Griffith, 1967; Garvey & Griffith, 1972)

Conference papers (in many fields) are supposed to be more cutting edge - really understandable to people in the field with a deep understanding but who need that icing on the cake of what's new. Journal articles are for more or less after substantive parts of the work are complete and take a while for review and publication (letters journals are supposed to be much faster), and then monographs and textbooks are more for when the information is more stable. More recently, there's a category of shorter books that are sort of like extended reviews but are faster than monographs. Morgan & Claypool, Foundations and Trends, and the new series coming from Cambridge University Press (no endorsement here) are examples. (Note the model omits things like protocols, videos, and datasets).

Reference books are even slower moving. They are used to look up fairly stable information. Here are some examples:

  • encyclopedias (and not just World Book, but Kirk-Othmer, Ullmann's, and technical encyclopedias)
  • dictionaries
  • handbooks (not just for engineers!)
  • directories
  • gazetteers (well, maybe less so for the sciences), maps
  • guidebooks (like in geology, biology)
  • sometimes things like catalogs...

You may think, hey, all I really need are the journal articles and Google and maybe Wikipedia. Or at least publishers and librarians think you're thinking that. And reference books are sort of disappearing. It doesn't make any sense to devote precious real estate to the print versions and the online versions are super expensive and also often not used.

The thing is that these tools are really still needed and they have condensed very useful information down into small(er) packages. If you're concerned about efficiency and authority then starting with a reference book is probably a good idea if you want an overview or to look up a detail.

The publishers don't want to lose our money so they're taking a few different approaches. Some are making large topical digital libraries that combine journal articles, book chapters, and reference materials. This can be really good - you can look up information on a topic when you're reading a journal article or look up a definition, etc. You can start with an overview from an encyclopedia and then dive deeper to learn what's new. The problem from a librarian and user point of view is that the best information may come from multiple different publishers and you just won't get that. You won't get a recommendation for someone else's product.

Another thing publishers are doing is to make reference materials more dynamic. First, they can charge you more and more frequently. Second, even if the updates are quite small, it makes the resource more attractive to potential users to have a recent date updated. One publisher in particular has commissioned sort of a portal approach that gathers materials from various places and has commissioned new overviews.

There's a tool to sort of search across more traditional reference materials, but... meh.

Of course if you have a well-developed model of what type of reference tool will have your needed information, then you can use the catalog (subjects like engineering - handbooks, engineering - encyclopedias). Back in the day, I wrote about how senior engineers gathered and created their own handbooks from pieces they'd found useful over time.

So here's where librarians come in. I've never taught the basic undergrad science welcome-to-the-library class (I attended one <cough> years ago), so I really don't know if they go over these distinctions or not. So that leaves our guides to try to get people to the best source of information. Guides that are merely laundry lists of tools by format/type are frowned upon because they are generally not useful. That's what we used to do though: here's a list of dictionaries, here's a list of encyclopedias... etc. What we try to do more now is make them problem based. Somewhat easier in, like, business: need to understand an industry? need to look up a company? Also maybe in materials science and/or chemistry (although SciFinder and Reaxys' way of doing properties may be supplanting).

Ok, so beyond the difficulty of expressing the value of each of these tools and in which situations they are useful, we have the affordances of our websites and the tools that produce them. Most are database driven now, which makes sense because you don't want to have to go a million places to update a url. Except... one reference might be useful for one purpose in one guide, and another in another, and then how do you get that to display? How do you balance chatty to educate when needed versus quick links for when not?

Also, do you list a digital library collection of handbooks or, more commonly, monographs mixed with handbooks, as a database? As what?

The reviews and overviews and encyclopedias... do you call them out separately? By series?

Users sometimes happen upon reference books from web searches - but that's mostly things like encyclopedias. If they need an equation or a property... well, if they're an engineer they probably know exactly what handbook... so then, I guess, if they don't have their own copy, they would use the catalog and get to the current edition which we may have online. Getting a phase diagram or other property of a material - I'm guessing users would probably start online but for some materials we have entire references (like titanium, aluminum... and then things like hydrazine).

I'm thinking we could have, on an engineering guide, a feed from the catalog with engineering - handbooks? Likewise a feed of physics - handbooks? What about things like an encyclopedia of optics? Call out "major reference works" and then a catalog feed of [subject] - handbooks|encyclopedias|etc....

OR.. hey... what about the shelf display model:

But, instead of all books, just the books for that guide that match [guide name] -- encyclopedia|dictionary|handbook, etc.
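To make the matching idea concrete, here's a rough sketch, assuming the catalog can hand you records with a title and a list of subject headings shaped like "Subject -- Form". The record layout, the function name `reference_works`, and the placeholder titles are all hypothetical; a real catalog feed or API will look different and would need its own parsing.

```python
import re

# Reference "forms" we want to pull out of the subject heading subdivision.
FORM_PATTERN = re.compile(
    r"(encyclopedias?|dictionaries|dictionary|handbooks?)", re.IGNORECASE
)


def reference_works(records, guide_subject):
    """Return titles with a heading like '<guide_subject> -- <reference form>'."""
    wanted = []
    for rec in records:
        for heading in rec["subjects"]:
            # Split 'Engineering -- Handbooks, manuals, etc.' at the first '--'.
            main, _, form = heading.partition("--")
            same_subject = main.strip().lower() == guide_subject.lower()
            if same_subject and FORM_PATTERN.search(form):
                wanted.append(rec["title"])
                break  # one matching heading is enough for this record
    return wanted


# Placeholder records, not real catalog data.
catalog = [
    {"title": "Example engineering handbook",
     "subjects": ["Engineering -- Handbooks, manuals, etc."]},
    {"title": "Example engineering monograph",
     "subjects": ["Engineering -- History"]},
    {"title": "Example physics handbook",
     "subjects": ["Physics -- Handbooks, manuals, etc."]},
    {"title": "Example engineering encyclopedia",
     "subjects": ["Engineering -- Encyclopedias"]},
]

print(reference_works(catalog, "Engineering"))
```

The same function called with "Physics" would feed the physics guide, so one rule could drive the shelf display for every guide, assuming the subject headings are applied consistently.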

What other methods can we use?


#AcWriMo: Late start

(by Christina Pikas) Nov 04 2017

I'm going to try to get re-invigorated with writing this month. For me, it's of course not academic writing but scholarly writing.

Here's what I'm working on - and this is all outside of work stuff:

  • dissertation > first journal article describing the framework and testing it
  • dissertation > second journal article or maybe conference article looking specifically at longitudinal changes/evolution of use of twitter at science conferences (also want to eventually update and publish a longitudinal look at the continuing role of blogs, and of scholarly informal communication in general)
  • short article on using k-means longitudinal clustering for citation trajectory modeling (this I really want to do - more motivated by this than the others!)
  • article based on the work that generated the METRICS17 poster - bibliometric institutional profiles.

Goal: work every day, and at least 2 hours on Saturdays. Really, just get writing!

Yes, I should pick one of the above articles instead of doing all these at once. Going to try to meet up with advisor around turkey day so there's a short goal - we're working only on the first one so that *should* be a priority even if less fun.


ASIST2017: Information Use Papers

(by Christina Pikas) Nov 01 2017

Ma Cui-Chang and Cao Shu-Jin - Identifying structural genre conventions across academic web document for information use

Swales model for research articles

Move 1 Establishing a territory
Move 2 Establishing a niche
Move 3 Occupying the niche

rhetorical organization patterns - disciplines, different information uses

sources for development: rhetorical objectives of the genres > linguistic clues > move analysis, writing rules genre research

academic blog post, online encyclopedia, research articles

corpus - 81 documents, 2015, Chinese documents with kw "citation analysis"

raters - interrater reliability 80-100%

Taxonomy identified and validated

q: how will you use this? will you use machine or automated clustering based on this.

q: can you elaborate on information units you found on the web or in web documents vs. formal publications.

a: main difference in how organized. also Swales is developed from written English articles.

q: mentioned Swales was developed to help train junior users, could your taxonomy help further with teaching


Devendra Dilip Potnis (speaker), Kanchan Deosthali, Janine Pino - Investigating barriers to using information in electronic resources: a study with e-book users

Motivation: spend money on electronic resources, but they're underused. Goal: to investigate barriers to using information in ebooks.

Key findings - 60 barriers. Categories:

  • ereaders (16)
  • features of ebooks (20)
  • psychological (7), somatic (3), cognitive status (6)
  • cost
  • policies

different actors - things about the users and things about the environment, system, vendors

uses Wilson (2000)'s definition of using information - both physically accessing, as well as mental schemas and emotional responses

4 broad stages of information use- searching, managing, processing, applying information

Lots of previous studies - their main difference is how they look at use of information instead of "value".

They did a survey of LIS students (n=25) [sigh... this is a real and important topic, but sigh]

These participants also might have more insight into use of information, what's going on in libraries, etc.

Great quotes - flipping pages waiting for a page to load - breaks concentration. Not immersive. Policies don't let download. Poor text quality.

Mapped barriers to information use stages. For example, psychological barriers prevent information processing. Technical barriers prevent use of information.

"due to a series of unavoidable barriers, respondents who originally intended to use ebooks for utilitarian purposes end up using this electronic resource mostly for hedonistic reasons" (pleasure reading, but not reference)

contributions - insight into adoption, why a negative perception. also if hiring a new librarian, will they have a negative attitude toward ebooks.

q: plans to go bigger with this

a: not really - so disheartening [welcome to my world] - but is planning a bigger hci study

q/comment: need to really differentiate between scholarly and leisure reading and even within scholarly, engaged with as monograph vs no drm pdf per chapter engaged on a per-chapter basis almost as a journal article

q/c: some have advanced annotation and highlighting features of which users may be unaware

Ayoung Yoon - Role of Communication in Data Reuse

Secondary use of data - not for the original purpose, and generally not by original collector of data

not a simple one-step process, transfer of knowledge, "social process" interactions and communications with other relevant parties (Martin, 2014)

(who is involved, why, and how)

past studies - transferring information about context of data, difficult to know what contextual information is important for unknown possible reusers, level of skills and tacit knowledge of reuser

strategies - documentation (inherently insufficient, not everything can be transferred), communication with producers (formal or informal)

38 - quantitative data reusers in social work and public health. Identified from scholarly databases using "secondary data" or "secondary analysis"

not a linear process - discovery, selecting, understanding, analyses, manuscripts

purpose of interaction communication - searching, interacting, problem solving

search is complicated - no one place to look, data may be dated, rely on established network, have a "data talk"

interaction/communication - learning process, collaboration and mentoring process, "not just access to the data but more importantly, access to people", "how to get around challenges"

problem solving - "knowing other people who were closely working with the data" "talking among ourselves" give reusers "confidence" about solving issues. Also working with data professionals and statisticians "if the problem was really me or the data"

Limitation of communication around data - have to be part of the network to have information needed to access data - peripheral and junior researchers. Unsuccessful interaction with data producers (no answers, partial answers, busy, contact person may be project manager and may not know)

communication is not always necessary for reusers - if it's well documented, known, and the reuser is experienced.

important to support this communication around data - most libraries do not deal with this but deal with mandates and sharing.

q: communication around data among reusers - not with producers - role for platform to support?

a: extended (great) - she did see that in her work. lots of discussion at conferences and within networks among reusers. OTOH, some participants hit a wall when they didn't get a response from producer and didn't have anyone else to ask next. Library is not seen in facilitating this but would be helpful if they could. Platform facilitating could be useful, too.



ASIST2017: Top Ranked Papers

(by Christina Pikas) Oct 31 2017

Re(a)d Wedding: A Case Study Exploring Everyday Information Behaviors of the Transmedia Fan
Eric Forcier

Information behavior lens to study how transmedia fans negotiate... . Fans  - production and consumption. De Certeau - everyday life practice + Floridi - infosphere, "information is our environment" = postdigital everyday life practice

Fandom "regular, emotionally involved consumption of a given popular narrative" Sandvoss 2007

transmedia - media-hopping network of intertextualities.

Studied Game of Thrones - Red Wedding

engagement - Nahl 2007

para-active engagement Evans 2016 (like para text)

updating the "iceberg theory" Hemingway

hyperdiegesis "creation of a vast and detailed narrative space only a fraction of which is ever directly seen or encountered within the text" (Hills 2002)

postdigital reading

takeaway - everyday information behavior model of transmedia fan

Before Information Literacy [or, Who Am I, as a Subject-Of-(Information)-Need?]
Ron Day

Lots yesterday about fake news - his is on that topic.


Affordances and Constraints in the Online Identity Work of LGBTQ+ Individuals
Vanessa Kitzie

Technologies enable and constrain - profiles on social media, search results - important for identity work.

No utopian view - online reflects structural disadvantages of

internet isn't necessarily emancipatory for LGBTQ+

sociomateriality - theorizes imbrication of technologies and users

affordance - example - OKCupid's body type selections

30 interviews with LGBTQ+ individuals. Critical incident. Emic/etic coding (Miles & Huberman 1994). Pseudonyms and preferred pronouns.

identity expression - ways to play with that they can't do offline. Example: Jamie  - viewed as female offline, but presents as male online, images, profile details "catfishing". Can't do offline. Even online can be seen as deceptive. Authenticity is required by things like "real name policies" but

visibility - natural language queries and search box aesthetic. Lacked language to express identity got some results to help. OTOH - results often stigmatize. Difficult to get a search results set without someone being shot, etc. (do not want these reports hidden, but just ...) Because ranking depends in part on popularity, reflects larger society - the search engine has agency and does things unintended by engineers

anonymity - desired due to heteronormative contexts. Craigslist vs OKCupid - scary to post picture with gay identity but craigslist has its own scary. 4chan board - anon but use special codes in names to create pseudonyms and trust.

Implications - design for stress cases.

Limitations - qualitative, assumptions about access to technology.


Grouped questions:

for Eric:

how common are these transmedia fans?

a: transmedia fans vs. super fans - different.



ASIST2017: Technology as Humanism (plenary)

(by Christina Pikas) Oct 31 2017

Technology as Humanism: Rebooting the Digital Revolution

William Powers, MIT Media Lab

Border of science/humanities is where innovation happens and that's where he sees us.

Library - he thinks of accessibility, order, quiet, control

Researched the "death" of the book.

Shakespeare - Hamlet - "tables" - like dry erase notebook. Lots of discussion at the time about information overload with all the books newly available. "Hamlet's Blackberry: Why Paper is Eternal" (Shorenstein Center, 2006) led to the book on building a good life in the digital age. Digital maximalism - the more digitally connected you are, the better.

Philosophers of Screens (7 dead white males)

  • Socrates - the alphabet will rot your mind! if it's fixed on the page
  • Seneca - "restless energy of a hunted mind" Internal mental exercises
  • Gutenberg - sort of relic-looking selfie sticks with mirrors so pilgrims in crowds could experience the relics
  • Shakespeare - "this distracted globe" (speaking of his own head and those of the audience)
  • Franklin - "all new tools require some practice before we can become expert in the use of them"
  • Thoreau - failed inner life reflected in too much mail (like if you're doing it right, in his opinion, you don't need to talk to anyone else if you have a good inner life)
  • McLuhan - "how are we to get out of the maelstrom created by our own ingenuity"

Internet sabbath - useful

Books since his - Carr Shallows, Schulte Overwhelmed, Newport Deep Work, Turkle Conversation....

Digital detox.

Turns out print books are doing fine and ebook sales are slipping. Millennials like paper diaries and notebooks.

Leo Marx - Machine in the garden - "mechanistic habits of the mind"

Working now with people interested in bias in algorithms and in getting stories

Q: social machines, really?

A: not robots, but humans and machines and aspects of humans with machines

.. see great tweets from Kaitlin Costello


also great question - you're saying some utopia or ideal, but before there were a lot of have-nots and disadvantaged...




ASIST2017: Making A Case for Open Research: Implications for Reproducibility and Transparency

(by Christina Pikas) Oct 31 2017

Came in super late (darn traffic - left home, 30 mi away, 2 hours before getting here)

Caught the end of Erik Mitchell and Edward M. Corrado - they did a survey of JASIST authors and the responses were bleak. Surprisingly, only like 25% or so had an IRB? Few shared data or had data management plans. Few shared code. Few really did that much about open access.

John M. Budd: A retraction walks into the bar. Bartender says: what will you have? Retraction says: Nevermind. and doesn't leave the bar.

Retractions - lots. And lots of things that had been cited, the citations were substantive. Marking of retractions is poor. Work to be done and presented next year

Audience discussion:

Q: Is anxiety an issue? some researchers have been attacked for sharing data.

A: Well, in qualitative, it isn't appropriate to really talk about reproducibility

A: We didn't see this anxiety in our work, but maybe a qualitative study would

Q: Question about the result that said lack of consent was a reason not to share. Audience member was member of a project that had to go through specific consent forms to see if data could be used for new protocols.

(fwiw, I did actually reveal names of my participants in my dissertation research but I went back and re-asked consent giving examples of how it would be done)

Q: IRBs or research sites requiring destruction of data

Q: Works at a DOE national lab - and they have strict requirements for DMPs. Isn't that going to be more the norm now with funder requirements?

A: Not evenly held accountable. Different agencies coming online at different points.

A: Someone from DOT - we're just now having funding calls that have this requirement. There are new requirements for PII data and DMP. They are part of the compliance chain at the National Transportation Library. They haven't gotten any data back in for it yet. It will be that you will be ineligible for future funding if you do not provide identifiers (this might be part of one contract broken into blocks). Many if not most large funders of science - Gates, Wellcome, other funders requiring.

Q: if I do a qualitative study of how people like riding buses, would the interview transcripts be deposited and available?

A: Yes - but probably some sort of de-identification, compilation, anonymization, etc. (I added this part).


