
ASIST2017: Information Use Papers

Ma Cui-Chang and Cao Shu-Jin - Identifying structural genre conventions across academic web documents for information use

Swales model for research articles

Move 1 Establishing a territory
Move 2 Establishing a niche
Move 3 Occupying the niche

rhetorical organization patterns - disciplines, different information uses

sources for development: rhetorical objectives of the genres > linguistic clues > move analysis, writing rules genre research

academic blog post, online encyclopedia, research articles

corpus - 81 documents, 2015, Chinese documents with kw "citation analysis"

raters - interrater reliability 80-100%

Taxonomy identified and validated

q: how will you use this? will you use machine or automated clustering based on this.

q: can you elaborate on information units you found on the web or in web documents vs. formal publications.

a: main difference in how organized. also Swales is developed from written English articles.

q: mentioned Swales was developed to help train junior users, could your taxonomy help further with teaching


Devendra Dilip Potnis (speaker), Kanchan Deosthali, Janine Pino - Investigating barriers to using information in electronic resources: a study with e-book users

Motivation: libraries spend money on electronic resources, but they're underused. Goal: to investigate barriers to using information in ebooks

Key findings - 60 barriers. Categories:

  • ereaders (16)
  • features of ebooks (20)
  • psychological (7), somatic (3), cognitive status (6)
  • cost
  • policies

different actors - things about the users and things about the environment, system, vendors

uses Wilson (2000)'s definition of using information - both physically accessing, as well as mental schemas and emotional responses

4 broad stages of information use- searching, managing, processing, applying information

Lots of previous studies - their main difference is how they look at use of information instead of "value".

They did a survey of LIS students (n=25) [sigh... this is a real and important topic, but sigh]

These participants also might have more insight into use of information, what's going on in libraries, etc.

Great quotes - flipping pages waiting for a page to load - breaks concentration. Not immersive. Policies don't let download. Poor text quality.

Mapped barriers to information use stages.  For example psychological barriers prevent information processing. Technical barriers prevent use of information

"due to a series of unavoidable barriers, respondents who originally intended to use ebooks for utilitarian purposes end up using this electronic resource mostly for hedonistic reasons" (pleasure reading, but not reference)

contributions - insight into adoption, why a negative perception. also if hiring a new librarian, will they have a negative attitude toward ebooks.

q: plans to go bigger with this

a: not really - so disheartening [welcome to my world] - but is planning a bigger hci study

q/comment: need to really differentiate between scholarly and leisure reading and even within scholarly, engaged with as monograph vs no drm pdf per chapter engaged on a per-chapter basis almost as a journal article

q/c: some have advanced annotation and highlighting features of which users may be unaware

Ayoung Yoon - Role of Communication in Data Reuse

Secondary use of data - not for the original purpose, and generally not by original collector of data

not a simple one-step process, transfer of knowledge, "social process" interactions and communications with other relevant parties (Martin, 2014)

(who is involved, why and how)

past studies - transferring information about context of data, difficult to know what contextual information is important for unknown possible reusers, level of skills and tacit knowledge of reuser

strategies - documentation (inherently insufficient, not everything can be transferred), communication with producers (formal or informal)

38 - quantitative data reusers in social work and public health. Identified from scholarly databases using "secondary data" or "secondary analysis"

not a linear process - discovery, selecting, understanding, analyses, manuscripts

purpose of interaction communication - searching, interacting, problem solving

search is complicated - no one place to look, data may be dated, rely on established network, have a "data talk"

interaction/communication - learning process, collaboration and mentoring process, "not just access to the data but more importantly, access to people", "how to get around challenges"

problem solving - "knowing other people who were closely working with the data" "talking among ourselves" give reusers "confidence" about solving issues. Also working with data professionals and statisticians "if the problem was really me or the data"

Limitation of communication around data - have to be part of the network to have information needed to access data - peripheral and junior researchers. Unsuccessful interaction with data producers (no answers, partial answers, busy, contact person may be project manager and may not know)

communication is not always necessary for reusers - if it's well documented, known, and the reuser is experienced.

important to support this communication around data - most libraries do not deal with this but deal with mandates and sharing.

q: communication around data among reusers - not with producers - role for platform to support?

a: extended (great) - she did see that in her work. lots of discussion at conferences and within networks among reusers. OTOH, some participants hit a wall when they didn't get a response from producer and didn't have anyone else to ask next. Library is not seen in facilitating this but would be helpful if they could. Platform facilitating could be useful, too.



ASIST2017: Top Ranked Papers

Oct 31 2017 Published by under Conferences

Re(a)d Wedding: A Case Study Exploring Everyday Information Behaviors of the Transmedia Fan
Eric Forcier

Information behavior lens to study how transmedia fans negotiate... . Fans  - production and consumption. De Certeau - everyday life practice + Floridi - infosphere, "information is our environment" = postdigital everyday life practice

Fandom "regular, emotionally involved consumption of a given popular narrative" Sandvoss 2007

transmedia - media-hopping network of intertextualities.

Studied Game of Thrones - Red Wedding

engagement - Nahl 2007

para-active engagement Evans 2016 (like para text)

updating the "iceberg theory" Hemingway

hyperdiegesis "creation of a vast and detailed narrative space only a fraction of which is ever directly seen or encountered within the text" (Hills 2002)

postdigital reading

takeaway - everyday information behavior model of transmedia fan

Before Information Literacy [or, Who Am I, as a Subject-Of-(Information)-Need?]
Ron Day

Lots yesterday about fake news - his is on that topic.


Affordances and Constraints in the Online Identity Work of LGBTQ+ Individuals
Vanessa Kitzie

Technologies enable and constrain - profiles on social media, search results - important for identity work.

No utopian view - online reflects structural disadvantages of

internet isn't necessarily emancipatory for LGBTQ+

sociomateriality - theorizes imbrication of technologies and users

affordance - example - OKCupid's body type selections

30 interviews with LGBTQ+ individuals. Critical incident. Emic/etic coding (Miles & Huberman 1994). Pseudonyms and preferred pronouns

identity expression - ways to play with that they can't do offline. Example: Jamie  - viewed as female offline, but presents as male online, images, profile details "catfishing". Can't do offline. Even online can be seen as deceptive. Authenticity is required by things like "real name policies" but

visibility - natural language queries and search box aesthetic. Lacked language to express identity got some results to help. OTOH - results often stigmatize. Difficult to get a search results set without someone being shot, etc. (do not want these reports hidden, but just ...) Because ranking depends in part on popularity, reflects larger society - the search engine has agency and does things unintended by engineers

anonymity - desired due to heteronormative contexts. Craigslist vs OKCupid - scary to post picture with gay identity but craigslist has its own scary. 4chan board - anon but use special codes in names to create pseudonyms and trust.

Implications - design for stress cases.

Limitations - qualitative, assumptions about access to technology.


Grouped questions:

for Eric:

how common are these transmedia fans?

a: transmedia fans vs. super fans - different.



ASIST2017: Technology as Humanism (plenary)

Oct 31 2017 Published by under Conferences

Technology as Humanism: Rebooting the Digital Revolution

William Powers, MIT Media Lab

Border of science/humanities is where innovation happens and that's where he sees us.

Library - he thinks of accessibility, order, quiet, control

Researched the "death" of the book.

Shakespeare - Hamlet - "tables" - like dry erase notebook. Lots of discussion at the time about information overload with all the books newly available. "Hamlet's Blackberry: Why Paper is Eternal" (Shorenstein Center, 2006) led to the book Hamlet's BlackBerry, on building a good life in the digital age. Digital maximalism - the more digitally connected you are, the better.

Philosophers of Screens (7 dead white males)

  • Socrates - the alphabet will rot your mind! if it's fixed on the page
  • Seneca - "restless energy of a hunted mind" Internal mental exercises
  • Gutenberg - sort of relic-looking selfie sticks with mirrors so pilgrims in crowds could experience the relics
  • Shakespeare - "this distracted globe" (speaking of his own head and those of the audience)
  • Franklin - "all new tools require some practice before we can become expert in the use of them"
  • Thoreau - failed inner life reflected in too much mail (like if you're doing it right, in his opinion, you don't need to talk to anyone else if you have a good inner life)
  • McLuhan - "how are we to get out of the maelstrom created by our own ingenuity"

Internet sabbath - useful

Books since his - Carr Shallows, Schulte Overwhelmed, Newport Deep Work, Turkle Conversation....

Digital detox.

Turns out print books are doing fine and ebook sales are slipping. Millennials like paper diaries and notebooks.

Leo Marx - Machine in the garden - "mechanistic habits of the mind"

Working now with people interested in bias in algorithms and in getting stories

Q: social machines, really?

A: not robots, but humans and machines and aspects of humans with machines

.. see great tweets from Kaitlin Costello


also great question - you're saying some utopia or ideal, but before there were a lot of have-nots and disadvantaged...




ASIST2017: Making A Case for Open Research: Implications for Reproducibility and Transparency

Came in super late (darn traffic - left home, 30 mi away, 2 hours before getting here)

Caught the end of Erik Mitchell and Edward M. Corrado - they did a survey of JASIST authors and the responses were bleak. Surprisingly, only like 25% or so had an IRB? Few shared data or had data management plans. Few shared code. Few really did that much about open access.

John M. Budd: A retraction walks into the bar. Bartender says: what will you have? Retraction says: Nevermind. and doesn't leave the bar.

Retractions - lots. And lots of things that had been cited, the citations were substantive. Marking of retractions is poor. Work to be done and presented next year

Audience discussion:

Q: Is anxiety an issue? some researchers have been attacked for sharing data.

A: Well in qualitative, it isn't appropriate to really talk about reproducible

A: We didn't see this anxiety in our work, but maybe a qualitative study would

Q: Question about the result that said lack of consent was a reason not to share. Audience member was member of a project that had to go through specific consent forms to see if data could be used for new protocols.

(fwiw, I did actually reveal names of my participants in my dissertation research but I went back and re-asked consent giving examples of how it would be done)

Q: IRBs or research sites requiring destruction of data

Q: Works at a DOE national lab -and they have strict requirements for DMPs. Isn't that going to be more the norm now with funder requirements

A: Not evenly held accountable. Different agencies coming online at different points.

A: Someone from DOT - we're just now having funding calls that have this requirement. There are new requirements for PII data and DMP. They are part of the compliance chain at the National Transportation Library. They haven't gotten any data back in for it yet. It will be that you will be ineligible for future funding if you do not provide identifiers (this might be part of one contract broken into blocks). Many if not most large funders of science - Gates, Wellcome, other funders requiring.

Q: if I do a qualitative study of how people like riding buses, would the interview transcripts be deposited and available?

A: Yes - but probably some sort of de-identification, compilation, anonymization, etc. (I added this part).



ASIST2017: Social Media Papers session

Oct 30 2017 Published by under Conferences, Information Science

Fei Shu (speaker) & Stefanie Haustein - On the Citation Advantage of Tweeted Papers at the Journal Level

Previous research - twitter exposure leads to an overall increase of citations. Correlation is weak. Low social media impact in countries where Twitter, for example, is limited or blocked.

Research questions - compare normalized citation rate of articles shared on twitter with similar papers from the same year. 22% of WoS papers are tweeted? (talking fast!) This causes problems - so look at journal level, control for journal, discipline, author's country of origin. Data Web of Science and . Use the DOI to search both. In Altmetric can see where the tweets originate. They used thresholds to deal with outliers. Used tweets and citations from 2012 to 2015. Since there were some journals with very few tweeted papers, these would be difficult to compare. Used journals with at least 10 tweeted and 10 non-tweeted papers. ... also did threshold with 100 and 100 - in this case of 308 journals, 36% papers tweeted.

Tweeted papers receive 68.4% more citations on average than non-tweeted (not corrected). Corrected by journal 30% citation advantage (significant at p<0.05). By discipline - varies - significant in 9 disciplines - not significant in chem, engr, human, math due to sample size. Source countries (based on author institution) - threshold level. Country with top tweeted - Netherlands. Sweden 91% citation advantage.
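
A rough sketch of the journal-level comparison described above, in Python. The function name, the sample citation counts, and the simple ratio-of-means formula are my assumptions for illustration; the talk's actual normalization and correction steps are not captured in these notes.

```python
from statistics import mean


def citation_advantage(tweeted, non_tweeted, min_papers=10):
    """Percent citation advantage of tweeted over non-tweeted papers
    within ONE journal (hypothetical helper, not the authors' code).

    tweeted / non_tweeted: lists of citation counts per paper.
    Journals with fewer than min_papers in either group are skipped,
    mirroring the "at least 10 tweeted and 10 non-tweeted" threshold.
    """
    if len(tweeted) < min_papers or len(non_tweeted) < min_papers:
        return None  # too few papers to compare
    baseline = mean(non_tweeted)
    if baseline == 0:
        return None  # ratio undefined when the baseline has no citations
    return (mean(tweeted) / baseline - 1) * 100


# Made-up journal: tweeted papers average 3 citations, others average 2.
print(citation_advantage([3] * 10, [2] * 10))  # 50.0
```

Comparing within each journal first and only then summarizing across journals is what keeps a few heavily tweeted, heavily cited journals from driving the overall number.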

Citation advantage 30%, in all disciplines but extent varies.

Most tweets are from 6 months after publication

Chris Hubbles (speaker), David W. McDonald, & Jin Ha Lee - F#%@ That Noise: SoundCloud As (A-)Social Media?

SoundCloud is used to share and communicate about music. Has timestamped commenting and allows social interaction among fans woven into the playback feature. Used to distribute music, podcasts, and even used by some government organizations. "Social Multimedia". Qualitative content analysis on these comments. Used search API (ID popular tracks) and then track API to pull all the comments for these tracks. Whole year of 2013. 100-200 tracks per day uploaded except for a weird spike. They removed spoken word from the sample. Kept 0-10 minutes, 10-500 comments. Collaboratively coded by authors. Codebook with 39 codes. 58 songs, 5,608 comments. 69% electronic music and hip-hop. Music was uploaded by artists, labels, promotion companies, fans, etc. Comments were mostly positive. Were full of profanity, caps, emoji, exclamation points. But also about features of the music, and stories of where the music was heard and what it meant. Few of the comments were part of conversation threads. One track had 77 comments with no replies. Uploader replies were almost as common as fan replies.

Similar to what Dana Rotman found with YouTube. The presence of affordances doesn't mean a community will form.

The display could be better to support participation.

"A-social party" - expression and not interaction. Broadcasting, graffiti, co-presence, mutually shared experience.

Quan Zhou, Chei Sian Lee (Speaker), & Sei-Ching Joanna Sin - Using Social Media in Formal Learning: Investigating Learning Strategies and Satisfaction

Self-regulated learning (Pintrich, 2000, p453) - "an active constructive process whereby learners set goals ... then monitor regulate... constrained by ...goals... environment". forethought, performance control, self-reflection

survey  - undergrad and grad students, if they used social media for any class, standard scales for learning strategies and satisfaction... n=270

PCA and regression. all 4 learning strategies significant. Goal setting most influential predictor of learning satisfaction. Self-evaluation second (social comparison - is a motivating force, unlike general studies of social media where comparison makes you unhappy). Keep in mind, their students are maybe more highly motivated than some other samples.

Limitations - didn't look at whether use was voluntary or mandatory. one university

q: how did you define social media? big list

q: did you ask for how the social media were used in the class? (no, not really?)


ASIST 2017: Digital Literacy in the Era of Fake News: Key Roles for Information Professionals

Oct 30 2017 Published by under Conferences, Information Science

They were having problems with the projector so started with Connaway going through studies they've done related to information literacy. Important to provost and universities - learning doesn't stop when students graduate. How do we get students to use public libraries and use information in everyday life decision-making?

  • How do people who work with the public in libraries get updated on information literacy
  • What do students know about how search engines work
  • How do people assess information on the web and in social media

Heidi Julien - engage in issues and model approaches

  • social media campaign about facts
  • express views publicly and stand up to confront misinformation
  • educate representatives at all levels of government - these issues are important and institutions like library need to be supported
  • advocate for importance of digital literacy.
  • Aldous Huxley "facts do not cease to exist because they are ignored"
  • (other international infographics and things to share)

Seadle - information professionals provide context and nuanced view.


Alex Kasprak - Science Writer at

Some things they're seeing are more like overblown - Yellowstone volcano may erupt sooner... > we're all gonna die!

More things like autism/vaccination.

Deeper expose on retired scientist who is peddling snake oil cure for cancer.

"50 studies say... " - he's never found one of these in which the studies do support the claim

Recent from B saying 400 articles saying climate change a hoax. Kasprak asked author "how long did it take to prepare" and Delingpole said "as little time as possible" (can share this because Delingpole posted that an "impertinent pup" from Snopes was fact-checking him with this comment).

questions: debunking - is it really useful or is it just giving more attention? Snopes won't necessarily solve the problem but serve as a reference and affect the financial viability of these sites. Real world implications when Snopes debunks.

is it really about believing things that are untrue or is it more taking away debate - yes

is the term "fake news" too charged now to use. yes - probably not a useful term

other terms: lazy journalism, hucksterism, pseudoscience, etc. use more precise term

more blame on producers - but can we increase the cost of being wrong (reposting these stories)

Habermas-ian - public sphere as a place for exchange of rational ideas - but with Foucault hat on if our problem is trying to maintain this notion of civil society in the face of people who are no longer interested in the ideal. The rational argument against an emotional or financial gain... beat our head against the wall.

Seadle response - like both schools of thought. but people aren't rational. behavioral economics. pure number of hits on a website gets you more money. Incentive structures to bring people back to

Julien - we are beating our heads against the wall, multiple cognitive biases, all operating in our own echo chambers - ideal

My q: influence operations by state actors vs. this

Kasprak - the state actors were taking messages already existing or making new messages modeled on existing, and then amplifying, paying to target these. So combating is actually similar, but we're not winning, and there are higher numbers.




Poster for METRICS2017: Methods for Bibliometric Institutional Profiles for the Practitioner

Oct 29 2017 Published by under bibliometrics, Conferences

The poster:

I don't know if it is actually clear enough to read? This big PDF should work: Pikas Methods for Bibliometric Institutional Profiles for the Practitioner

The submission has a little more about my motivation in the poster: Pikas Institutional Profiles MET17 (pdf)

The scripts are here:

Here's a lovely map I had to cut from the poster for size. Viewers may not appreciate that it is actually very unusual for us to collaborate outside of the US.

Affiliations of co-authors, sized by number of articles.



Oct 27 2017 Published by under bibliometrics, Conferences

Edwin Henneken, Alberto Accomazzi, Sergio Blanco-Cuaresma, August Muench, Lars Holm Nielsen Asclepias – Capturing Software Citations in Astronomy

Asclepias project. Enabling software citation & discovery workflows. To "promote scientific software into an identifiable, citable, and preservable object. " Adding DOI based software citations to ADS. Tracking events.

Collaborative Codebase (GitHub) > Repository (Zenodo) > software broker (harvests repository events, software citations)

example ( - published in JOSS, 60 regular citations to, but also deposited in Zenodo. Citations to every single version of the software and a total of 100 citations.

Journals need to be able to accept software citations (actual citation to the software and not a related article). Just slapping a doi on it isn't enough.

End to end go from original proposal through all the data, papers, software, etc. and have analytics along the way.

Q: difficult to get people doing the right thing with the repositories? yes - but astro is amenable. long history of linking data

Q2: like bigger world of citing things not papers about things? yes

Eto Masaki - Increasing Source Documents of Rough Co-citation to Expand Co-citation Networks for Scientific Paper Searches

rough co-citation is a generation back from co-citation.

a + b cited together, co-citation... a+c cited together infer relationship with b, this did increase information retrieval retrieved documents that didn't exist in the network.

Pei-Ying Chen (speaker), Erica Hayes, Stefanie Haustein, Vincent Larivière, Cassidy R. Sugimoto -  Politics of platforms: the ideological perspectives of social reference manager users on scholarly communication

Looking at Mendeley and Zotero - hypothesis that Zotero users will be more open to open data, etc., and Mendeley users will be more traditional because they are using a corporate platform.

Mendeley provided a stratified random sample of 26k users, response from about 1200. Zotero was an anonymous link advertised by Zotero at conferences.

In survey they didn't provide a category for librarians so they got a lot of "others"

From both groups: all advocate for open source software, all adopters of new technologies, most advocate for open access.

Majority of both think peer review system is broken and publishers aren't necessary for scholarly communication.

Some similarities and differences, but no real clear support for their hypothesis, as far as I could tell.

Q: try to look at the contents of the library to see if more oa or paywall journals?


Ehsan Mohammadi, Mike Thelwall, Kristi Holmes - Interpret the meaning of academic tweets: A multi-disciplinary survey

Altmetrics - who uses twitter to communicate scholarly info, does twitter play an important role in communicating scholarly info, why, does it depend on discipline

twitter users who re/tweeted academic publications at least once using 4.5m twitter accounts

looked at personal web page urls 1.7 urls

using web mining, identified email addresses

sent online survey to 57k twitter users, got 2000 responses.

most respondents tweeting scholarly information were from the social sciences and humanities

most agree:

  • change way to read and disseminate sci info
  • twitter facilitates knowledge flows
  • reflects research impact
  • share academic findings with the general public

motivations for using and type of content shared depend on discipline, occupation and employment sector

They have a paper under review in a journal so stand by.


Philippe Mongeon - Is there a Matilda effect in academic patenting?

We know men publish more papers than women and their papers are more cited

Now for patenting. Only about 15% of inventors are women. Patent-paper pairs. Same discovery published in a paper and patent

are women less likely to be inventor than men when we control for: position on the byline, discipline, reputation, contribution

Previous studies: no gender difference (Haeussler & Sauermann, 2013), female more likely excluded from inventorship (Lissoni et al 2013)

all articles with 2 or more authors in wos 1991-2016, uspto patent applications 1986-2015

paper-patent pairs: paper within -1 to 5 years of app, all inventors on the authors list

text similarity of title and abstract.

discipline - based on discipline of journals cited by the paper

attribution of gender - based on Wikipedia pages (Berube in preparation)

automatic disambiguation of authors

accumulated number of citations at time of app.

contributions - manual extraction, where there were statements coded conception, analysis, performed...

regression models...

turns out place in author list has much more impact than gender, but gender is significant for all but engineering.

When taking contribution into account (many fewer papers), conception role is important  - which makes sense.

Small effect of gender on the attribution of inventorship, gender gap occurring earlier in the research process




Oct 27 2017 Published by under bibliometrics, Conferences

This event was held Friday October 27, 2017

Kate McCain  - Undercounting the gift givers: issues when tallying acknowledgements in life sciences research

ongoing research effort - she originally worked on this 20 years ago but has come back to it recently. Background - model organisms - useful to organize research around. Community databases, stock centers, community ethos wrt sharing.

Ways to focus research - by journal is often done, but she uses this model organism. She is looking at 1980-2004 during growth phase when there is more sharing because nascent research area. And she is looking at acknowledgements.

Compared to citations - acknowledged most likely to be alive.

Personal ack vs. funding - she's interested in personal ackn. "peer interactive communication"

May be lots of different places: end note, methods section, end of text with no section label, ... No control or standardization of how people are named, what granularity they are thanked for, etc.

WoS mostly gets funding ack, and only secondarily sweeps up some personal ack (if they are in the same block, which is not always the case).

Undercounting big deal: text extraction relying on formal ack section. personal name disambiguation. Sampling or single year studies.

Check her slides to see what she found where. She also categorized types of ack - animals, software, data, editing, etc.

Top 15 individuals listed - first few time periods dominated by University of Oregon - founders and suppliers of fish early on.

She then went through profiles of some individuals with the diversity of how they appeared.

Trends - fewer examples of thanking for research materials - have their own, get from repository, or get from stock center

questions: manually - yes? learn things to help automate - yes, but lots and lots and lots of ways to trip up. Also just picking up surnames is not enough because then get some citations mixed in, named equations/methods, etc.

Reminds me of:

questions: in the lab outside of the lab. also tracking people who are frequently acknowledged and not often co-authors/cited

questions: comment - collaboration - set up something from PMC data (already coded in XML), but only using ack section and not the Materials & Methods (M&M) section.


Isabelle Dorsch - Relative Visibility

How well known. She's comparing personal publication list and information services (like WoS).

Relative visibility (IS) = (d/r)*100
d= in information services, r=publication list
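
A minimal sketch of the ratio above in Python, assuming d and r are plain counts of publications. The function name and the example numbers are mine, not from the talk.

```python
def relative_visibility(d: int, r: int) -> float:
    """Relative visibility in percent: (d / r) * 100.

    d = publications found in the information service (e.g. WoS)
    r = publications on the author's personal publication list
    """
    if r <= 0:
        raise ValueError("publication list must contain at least one item")
    return d / r * 100


# Hypothetical author: 40 items on the personal list, 30 of them indexed.
print(relative_visibility(30, 40))  # 75.0
```

With very short lists the value swings wildly: an author with two publications, both indexed, scores 100.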

Previous work - Cronin & Stock, and ISSI board study

Issues - finding the personal list, is it up to date and complete, is it structured to be used at all, what types of documents to keep (novels? newspaper articles?), keep in press?

(*discussion of this on SIGMETRICS really found that a combined edited list is probably best, but these aren't universally available - list maintained by information service but updated by author)

Which information service matters (of course)  -  visibility to one field when author publishes in multiple. Conference paper coverage, book coverage, etc.

questions: new author - only two publications - 100% (they only looked at established authors). Very dependent on the database

Judit Bar-Ilan - CiteScore vs JIF and Other Journal Indicators

Criticisms of JIF but still heavily used. Standard definition. Criticisms like lack of transparency. Things in the numerator not included as "citable items" in the denominator. Also now offer a 5year JIF

Citescore - publication window 3 years. They count all items so no numerator/denominator coverage mismatch. Transparent - can see all the citations that are covered. Freely available. Some criticism that covers too many different document types
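
To make the numerator/denominator point concrete, here is a small Python sketch of the two ratios as I understand them: a two-year JIF that divides by "citable items" only, and a three-year CiteScore that divides by all items. The function names and all the counts are invented for illustration.

```python
def impact_factor(citations_to_all_items: int, citable_items: int) -> float:
    """JIF-style ratio: citations in year Y to anything the journal
    published in Y-1 and Y-2, divided by only the citable items
    (articles, reviews) from those two years."""
    return citations_to_all_items / citable_items


def citescore(citations_to_all_items: int, all_items: int) -> float:
    """CiteScore-style ratio: citations in year Y to documents from the
    previous three years, divided by ALL documents from those years
    (notes, editorials, letters included)."""
    return citations_to_all_items / all_items


# Invented journal that publishes many notes and editorials:
# JIF window: 4000 citations, 200 citable items (out of 1000 items total).
print(impact_factor(4000, 200))  # 20.0
# CiteScore window: 6000 citations, 1500 items of every type.
print(citescore(6000, 1500))     # 4.0
```

A journal with lots of front matter gains citations in the JIF numerator without paying for them in the denominator, while CiteScore counts those items on both sides.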

EigenFactor, SJR, pagerank type indicators - more weight to more influential sources

Article Influence - normalized - average journal is 1.

She looked at correlations - for those journals that appear in most of the sources.

Quite high - CS-JIF 0.94,

HOWEVER - Lancet is 5 in JIF, 314 in CS - so huge differences and she suspects due to notes, editorials, etc.

Top 20 by CS are almost all review journals (Annual Review of... , Progress in... )

Eigenfactor doesn't include journal self-citation, and doesn't correlate as well with others.

Note also that even though high correlation, there are these big differences.

question: comment - real correlation between size of journal and JIF, Eigenfactor is the only one that corrects for this.


Student papers

Zhao, Mao, & Kun Lu (speaking, not student) - An Exploratory Study on Co-word Network Simulation

Network centrality and other network measures for co-word network. Are they correlated. Are there differences in disciplines in these measures. Looking at generative process of a co-word network.

Q: co-word can mean 3 different things: words that appear in the text, co-descriptor - uses carefully assigned things, keywords plus - is another thing separately (not controlled, but titles of articles cited). Are you simulating second hand natural language assigned things.

Antoine Archambault, Philippe Mongeon (speaking), Vincent Larivière  - The concentration of journal use in Canadian universities

Canadian universities have to cut big packages due to budgetary issues.

Evaluating:

  • download statistics from the 28 universities (~300 Excel files, 5M lines)
  • references (articles written by authors at these universities citing these journals)
  • perceived importance of journals (which journals do you perceive as important to your research, your teaching) - 23 of 28 universities, 5,500 participants (of which 3k were from their own university, so actually a disappointing response)

Cleaning important journals - title disambiguation, manual validation, classification by major disciplinary area (AH, SS, BM, NSE) - WoS, NSF, Ulrich's, Google, also verified research journal and not newsletter, etc.

47k unique journals.

Priority journals - 80/20 rule - anything in the top 80% of downloads, references, or mentions (10% of subscriptions account for 80% of any of these measures)
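A rough sketch of how a cut like this could be computed, assuming "top 80%" means the journals that together make up 80% of the total on a measure, and that a journal is a priority if it makes the cut on any of the three. The titles and counts are made up, and `core_journals` is a hypothetical helper, not the authors' code:

```python
def core_journals(usage, share=0.80):
    """Return the journals that together account for `share` of total usage.

    `usage` maps journal title -> count (downloads, references, or mentions).
    """
    total = sum(usage.values())
    core, running = [], 0
    ranked = sorted(usage.items(), key=lambda kv: kv[1], reverse=True)
    for title, count in ranked:
        if running >= share * total:
            break  # the journals already collected cover the share
        core.append(title)
        running += count
    return core


# Made-up counts for six journals at one university.
downloads = {"A": 500, "B": 250, "C": 100, "D": 80, "E": 40, "F": 30}
references = {"A": 40, "B": 5, "C": 5, "D": 45, "E": 3, "F": 2}
mentions = {"A": 60, "B": 25, "C": 5, "D": 5, "E": 3, "F": 2}

# A journal is a priority if it is in the core on any one measure.
priority = set()
for measure in (downloads, references, mentions):
    priority.update(core_journals(measure))

not_in_any_top = set(downloads) - priority
print(sorted(priority), sorted(not_in_any_top))
```

Taking the union across measures is what keeps a journal like "D" in the toy data: few downloads, but heavily cited by local authors.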

66% of the 47k journals are not in the top anywhere.

Broke out by publishers - Springer: 80% of publications were not in anyone's top. Sage: only 22% were not in anyone's.

Only 41.6% overlap of core journals between universities

Correlation of cites, downloads, mentions (cites are super lengthy for institutions to do themselves; can they just use downloads?) - answer is no. Have to use the 3 measures, as they're not completely correlated.

Q: can you do some sort of demand-driven acquisition?

Q: are there libraries of record - keep even if don't use locally?

Q: combining visibility presentation earlier with this.

Christine Meschede - Cross-Metric Compatibility of Altmetrics: Consistency of the Metrics from PlumX and

(posting before notes - battery going right now - will try to update)


Brief notes from Maryland SLA's Storytelling with Data

This one-day meeting/course/workshop/seminar (?) was held at the University of Maryland (go Terps!) on October 12, 2017. As with all events planned by my local SLA chapter, it was very well organized and run. The speakers were all excellent. Amazingly, the parking was close and pre-paid. The food was great, too.

Keith Marzullo - the dean of the iSchool - gave some welcoming remarks. He was so positive and seemed to really get the point of the day.

The opening keynote was by Ya-Ling Lu from the National Institutes of Health library (not NLM but the campus library). I have mostly heard her speak tag-teaming with Chris Belter on bibliometrics techniques but it was wonderful to have the opportunity to hear a long presentation just by her on visualization. She talked about having a low floor - starting at the beginning - and a high ceiling - keep learning and improving.

She talked about learning design and how choices convey emotion and meaning. Her example was from Picture This: How Pictures Work by Molly Bang


It was amazing to see how simple rectangles and triangles, their color, size, and location really told the story.

She also provided examples of developing information products. The first was to celebrate the life and career of someone retiring. She needed data and visualizations and a story for people, research, and leadership.

A second example was graphing how she spends her day to try to find more time for the things she wants to do.

Finally, she skipped over an example of how she successfully fought a traffic ticket using data and visualizations.

Oh, and she often uses Excel for her visualizations - even when she can make them in R or MATLAB.


Jessie Sigman from University of Maryland spoke next about using Cytoscape and Gephi to do graphs showing coverage of agricultural topics across research databases.

Vendor updates were provided by the sponsoring companies: Clarivate, Ebsco, and Cambridge University Press. CUP is doing a neat new thing that's sort of like Morgan & Claypool - it's like a monographic series, but the volumes are 40-70 pages. Peer reviewed and series are edited like journals.

David Durden and Joseph Koivisto of University of Maryland spoke next about the different stories that can be told with repository usage data. So it turns out that DSpace has separate data for the content (say PDF) and the metadata, and integrating this mess to get a real, accurate picture of how the system is being used is a bit of a bitch. It's indexed by Solr, but Solr doesn't keep the same index number for the content - it assigns its own. Google Analytics does a lot, but maybe not the right things. RAMP, a project out of Montana State University, helps with Google data but also has shortcomings. Things based on Google do the best they can to filter out bots. HOWEVER, if it's a bot a professor on campus wrote to analyze data, then that's a great use to track. Also Google doesn't capture the full text downloads.


Brynne Norton from NASA Goddard spoke of a cool visualization using interlibrary loan data. Standard statistics are just like time to get things filled and % requests filled. The data are horribly messy, with some citations lacking even an article title. She compiled the article titles using a series of regex searches and searched them through the Web of Science GUI. Yeah, the GUI. Apparently you can OR about 500 articles at a time! (as an aside: yes, there is indeed a WoS API, but you cannot use it for this purpose. You are only allowed to search for yourself. I know.) Then she loaded the results into VOSviewer and did a topic map. It was really cool and she narrated how it showed certain areas they might consider collecting in.
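A small sketch of the batching step, assuming the titles are already extracted and that each one gets wrapped in a `TI=` title field tag for pasting into the WoS advanced search box. The function name, the quote-stripping, and the exact query shape are guesses for illustration, not her actual workflow:

```python
def batched_title_queries(titles, batch_size=500):
    """Yield one OR-joined title query string per batch of titles."""
    # Drop empty entries and strip double quotes, which would break
    # the quoted phrase in the query.
    cleaned = [t.strip().replace('"', "") for t in titles if t and t.strip()]
    for start in range(0, len(cleaned), batch_size):
        batch = cleaned[start:start + batch_size]
        yield " OR ".join(f'TI=("{t}")' for t in batch)


# Made-up titles standing in for the cleaned ILL request data.
titles = [f"Title {i}" for i in range(1200)]
queries = list(batched_title_queries(titles))
print(len(queries), queries[0][:40])
```

Each string in `queries` is one search to paste into the GUI, so 1,200 titles become three searches rather than 1,200.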


Sally Gore did the closing keynote and boy is she awesome. I highly recommend librarians sign up for her webinar when SLA schedules it. She was also super encouraging. She spoke of how she figured out how to do these amazing infographics on her own - she even uses PowerPoint and sometimes draws her own icons. She recommended books by Stephanie Evergreen to learn design. I have more notes, but they're at work and I'm trying to get this published - so I'll add more if I find anything else I wanted to note.

The closing remarks were actually terrible. The guy who gave them had not actually attended any of the day or really read the descriptions of the speakers. His comments were like on research data management which is irrelevant to the day's topic. Boo.

But then we drank wine and had some more food so it was ok 🙂

