What's old is new again

(by Christina Pikas) Feb 10 2016

Everybody's back starting up an online community for their publishing platform. IEEE with Collabratec. ACS with ChemWorx. Science has one, too.

Seems like everyone did this 15 years ago. The only difference now seems to be the addition of authoring tools. We'll see.

(I posted about ChemWorx before)

Defense slides

(by Christina Pikas) Jan 30 2016

Took me a bit - I forgot to upload them to SlideShare until just now. I did pass with revisions to be approved by my advisor.

I have to tell you that it was really anticlimactic. I thought it would be a big weight off my shoulders and I would feel free and I would have minor quibbles but lots of pats on the back... but... well... I don't know.  This massive framework o' mine? The communications prof thought it was exactly the same as Shannon and Weaver (1948). Wow.

At least when I do these edits I can get on with writing up other work I've done and then prepping pieces of this for publication. So, really, no less work, but different.

I do fully intend to make this freely available with creative commons attribution and all that. The whole dissertation. I am going to do the revisions first, though, because some are pretty big.

How to get unbound by non-forward thinking users...

(by Christina Pikas) Jan 16 2016

Last post I described a system that was stuck by its own commitment to user-driven development. They're really stuck. So what are possible ways out? Particularly for a government system?

I really don't know, particularly for a government system, but that doesn't mean I can't think about it.

One thought was that maybe they need to make their case more clearly. How could they describe the projects better to make them more attractive in the rankings? This is probably impossible and maybe even insulting as they probably tried very hard to get their point across in the past. They seemed frustrated. Of course, they could hire a consultant to tell them exactly what they already knew - some people will listen to consultants.

I was wondering if acquisition rules would allow them to set aside like 20% or something to do their projects - ones that they thought were best but not necessarily voted on by the users. This would work for things that were less expensive to do or could be piloted.

Part of the problem is that the system may need to be re-architected and might need major redesign. Some of the pieces can be kept, but need to be integrated. That would have to wait for the next major version. Maybe if their key software underneath has to be upgraded, they could use that as a reason to do some things?

Sigh. I don't know. It sure is easier just to dream of a cool system.

When listening to the users may not be the best thing

(by Christina Pikas) Jan 16 2016

At work we evaluated the fitness of one large collaboration platform for use for another group. The government was already funding this one big thing and it made sense to see if it could be leveraged instead of starting from scratch even though the potential user groups are extremely different.

The system we evaluated was carefully designed with lots of input from user groups, by well meaning, competent people, using best practices from the field. GAO has fussed at them a few times over the years for the same things they always pick up on, and there are always questions about whether their system is used enough, and how, and what contracts they have let and for how much. They have a roadmap for development that is carefully developed in coordination with the users and they use agile development with frequent small releases and quarterly larger releases. There's lots of training available: ad hoc, recorded, and live, as well as in-person presentations at conferences and the like. They have a bunch of case studies in which the system has had a pivotal role in supporting collaboration and solving a difficult problem for the users.

Sounds great, right? The only thing is that the actual system is pretty ugly and not all that functional - certainly not what we had been designing with our ambitious state of the art system. We asked about things like how access control is done, how information is organized and retrieved, how content management is done, what the portal does, how it supports communication and collaboration... all fell very far short of our expectations. How could this be? We were looking at current features in products on the market - we even looked at products they have.

In my opinion (not anyone else's), it's all about their users and their governance. They have proposed many of the things we want in our system and their users de-prioritize all of them and do not choose to fund them. You see, a lot is needed for really good content discovery - there's a lot of infrastructure, which is invisible to the user (see Star's stuff on infrastructure). There's a lot of humans developing and training information organization schemes and building ways to ingest and process information such that search works. There are the policy requirements in a federated system like this to allow these various repositories to be searched. There's ongoing maintenance and user testing and ranking and boosting and troubleshooting for even a decent search to work, not to mention the full content discovery.

So the professionals propose projects to work on these things and improve them, but the users - who are expert in an ENTIRELY different area - are not getting it and are not trusting the professionals. And money is always limited. So the communication pieces aren't integrated. There's not fine role based access control. There's no way to search across various things... But their users are happy and are getting EXACTLY what they asked for.

So how do you design a governance system and development for a massive collaboration system such that it is user-based and need-based, but you still can fund infrastructure work needed to provide the functions for the users? I don't know. We laid off our taxonomist because management thought our search tool did all that itself - it doesn't. Clearly we don't know how to make the case, either.

Is there hope? If the two systems are joined, might the developers leverage our information to force some of these improvements? Dunno.

The crazy trip of one aerospace trade pub

(by Christina Pikas) Jan 08 2016

Aviation Week (& Space Technology) is celebrating 100 years in print in 2016. To celebrate (and advertise Boeing), they are making their archive freely available in 2016 (registration required for some f/t) (via Gary Price).

This is shocking really. This publication has been super important AND super expensive over my time as an engineering librarian in an aerospace organization. I've used the print archives to come up with open source documentation of various launch pad accidents, details of missile production, and other news.

We had a print archive going back to about 1962 in our library. In 2009 when we had to move out and ditch our collection I argued to keep this - of all of our bound journals - because at the time there was no affordable alternative. Well, I was out the day one of the jobbers came and apparently took it all anyway. An accident, I was told. So we did without for a while, using the embargoed access that gave us a few years through a major aggregator.

Still, just about every year we asked for a quote from McG-H and the pricing was like 5 digits for a single user with a login and password. It wasn't something we could really do.

Then a couple of years ago it was bought by Penton Media. You may know them from all of the "free to qualified recipients" trade pubs they have. They offered the database - the Intelligence Network - to us large-institution wide for less than a single login had been. We jumped on it.

Now, this year, free. I guess it won't be free after 2016? Has the quality changed? Dunno.

The value of blogging and goals for 2016

(by Christina Pikas) Jan 04 2016

As I prepare the slides and review my dissertation in preparation for the defense on the 19th, I keep coming back to the assertions I made in 2004 about the value of blogs for personal knowledge management. More recently, Pat Thomson blogged on THE about the value of blogging to scholarly writing. I think the value of blogs for communicating with the public is probably oversold. Seems like a lot of the scientists, social scientists, and other scholars who go into blogging with that goal find that they are instead communicating with like minds - scientists in other research areas, teachers, hobbyists/enthusiasts/citizen scientists - instead of changing minds and informing the uninformed.

It's not that there aren't cases in which that's true, it's just probably not frequent or widespread enough to sustain involvement for a new blogger.

I also don't mean to imply that there's no social in the social software. The community built through blogging can be very rich and supportive. The feedback on blogs can be very helpful.

I do miss blogging more and I don't think that blogging less has made me more productive offline. Instead, I find writing very slow and tedious and I get very frustrated that readers of my work are not able to understand me through it.

So. I'm going to try to be here more. I'm going to try to practice writing more. I'm going to try to do more research blogging. I'll also try to capture and share any neat tricks in analysis.


Some further reading:

Dennen, V. P. (2014). Becoming a blogger: Trajectories, norms, and activities in a community of practice. Computers in Human Behavior, 36(0), 350-358. doi:10.1016/j.chb.2014.03.028

Hank, C. (2013). Communications in Blogademia: An Assessment of Scholar Blogs’ Attributes and Functions. New Review of Information Networking, 18(2), 51-69. doi:10.1080/13614576.2013.802179

Mewburn, I., & Thomson, P. (2013). Why do academics blog? An analysis of audiences, purposes and challenges. Studies in Higher Education, 38(8), 1105-1119. doi:10.1080/03075079.2013.835624

Olive, R. (2013). ‘Making friends with the neighbours’: Blogging as a research method. International Journal of Cultural Studies, 16(1), 71-84. doi:10.1177/1367877912441438

2015 in Review

(by Christina Pikas) Jan 04 2016

Well, 2015 was sort of a meh year for me. Definitely on the blog.

January: Using more of the possible dimensions in a network graph - I was glad I shared this and glad I was able to make it work in the first place.

February: So... um... what if I'm still enjoying it? - about my dissertation.

March: Polar and Ellipsoid Graphs in iGraph in R

April: Which are the bestest? Top articles from a diverse organization - part 1 - never did part 2 AND still need to write this up for publication

May: ACS and Just Accepted Manuscripts

June: Notes from a presentation on library spaces by Keith Webster

July: none

August: Why special librarians should be active on their organization's intranet social media - the title of the post is not really descriptive. This is a research blogging post about the use of social media on a company's intranet.

September: The smart phone and parenting children - two articles

October: "Theory" for the immigrant to social sciences

November: Citation Manager Frustration - I actually had 3 posts on the same date, but this is the most important. I really don't like the way things are going with citation managers. As an update: the folks from RefWorks did contact me and I described a bunch of the issues. I think they'll have other ways to solve the same problems I'm encountering than what I proposed but they definitely seemed interested.

December: Bibliometrics: Getting an accurate (+/-) count of articles published by an organization in a year

I'm shocked that I posted at least something every month but July.


Bibliometrics: Getting an accurate (+/-) count of articles published by an organization in a year

(by Christina Pikas) Dec 02 2015

As part of a benchmarking activity, I'm comparing our scholarly output with a few of our peers. Since we're comparing and we're normalizing by number of technical professional staff, we're not particularly concerned with being absolutely comprehensive.

Our strategy is to use Web of Science* and Scopus* and to use their profile page for each organization (instead of developing our own searches for each, which would be more comprehensive but perhaps not evenly so). I export these records into (now) EndNote* client and de-dup from there. As previously mentioned, I need to have a more powerful way to de-dup.

Here are some strange findings that really do impact the results that should be attended to if you want to get a decent number:

  • Book chapters for books with only one author. I do get the idea of a book chapter or two from the same book counting as individual items. In this case, coming from WoS, there were >50 chapters (including "introduction to section III") all from the same author. I will count that as 1 book, not as 50+ articles.
  • Errata: should be easy to weed out yet apparently still came through. I definitely don't think these should be counted as a separate contribution!
  • The case of OSA and the 6 conference papers.

Take a look at this screenshot:

Screenshot of OSA search results

Results from OSA search (https://www.osapublishing.org/search.cfm?q=Instantaneous%20multiplex%20imaging%20in%20reacting%20flows&meta=1&cj=0&cc=1)

There are 6 identical papers showing as having been presented at 6 different conferences. Scopus exported these as 6 different contributions. When I contacted OSA, they told me that this was correct as the paper was presented at one of these conferences that were all collocated. I'm sure the author only thinks of this as one paper, but if I had accepted the number uncritically or not de-duplicated, I wouldn't have noticed.

It's worth pointing out, too, that articles with Greek characters in their titles come through to EndNote* in several different ways making them appear distinct to the algorithm.
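One way to keep those Greek-letter variants from looking distinct is to reduce each title to a comparison key before matching. This is only a minimal sketch: `title_key` and the `GREEK` map are hypothetical names, the map is deliberately partial, and I'm assuming the variants are things like the Unicode letter, an HTML entity, or the spelled-out name. Check it against what your export really contains.

```python
import html
import re
import unicodedata

# Partial, illustrative map of Greek letters to spelled-out names (an assumption;
# extend it with the letters that actually show up in your records).
GREEK = {
    "α": "alpha", "β": "beta", "γ": "gamma", "δ": "delta",
    "λ": "lambda", "μ": "mu", "π": "pi", "σ": "sigma", "ω": "omega",
}

def title_key(title: str) -> str:
    """Reduce a title to a key so differently encoded variants compare equal."""
    text = html.unescape(title)                  # "&alpha;" becomes "α"
    text = unicodedata.normalize("NFKD", text)   # micro sign becomes Greek mu, accents split off
    text = text.casefold()
    text = "".join(GREEK.get(ch, ch) for ch in text)
    # Drop everything except ASCII letters and digits, including spaces and hyphens.
    return re.sub(r"[^a-z0-9]+", "", text)
```

Titles whose keys match are candidates to review by hand, not confirmed duplicates. Dropping spaces and punctuation is aggressive on purpose, so that "TGFβ1" and "TGF-beta 1" end up with the same key.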

FWIW, I do know that typically bibliometric projects only use journal articles and one database to avoid these problems; however, a large portion of our output is in engineering fields in which conferences play an important role. I also know that I'm probably unfairly undercounting in Computer Science by using these databases. I don't think that unfairly targets any one of the individual organizations in the benchmarking study.

Another issue of concern is the conference paper that is expanded and reprinted in a journal (I count that) vs. meeting abstracts that appear in a journal (I do not count these).
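The counting rules above (drop errata and meeting abstracts, count a pile of chapters from a single-author book once) could be written down as a small filter. Again just a sketch under assumptions: the record fields `type`, `book_title`, and `single_author_book` are hypothetical and would have to be mapped from whatever your export provides.

```python
def countable_items(records):
    """Apply the counting rules described in this post to a list of record dicts."""
    excluded_types = {"erratum", "meeting abstract"}
    seen_books = set()
    kept = []
    for rec in records:
        doc_type = rec.get("type", "").casefold()
        if doc_type in excluded_types:
            continue  # not counted as a separate contribution
        if doc_type == "book chapter" and rec.get("single_author_book"):
            # Hypothetical flag: all chapters of the book share one author,
            # so the whole book counts once.
            book = rec.get("book_title", "").casefold()
            if book in seen_books:
                continue
            seen_books.add(book)
        kept.append(rec)
    return kept
```

Expanded journal versions of conference papers pass straight through, since I count those.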



*Not affiliated, not an endorsement, yadda, yadda

Citation Manager Frustration

(by Christina Pikas) Nov 20 2015

I've used most of the major citation managers: RefWorks, Flow, EndNote (web and client), Zotero, Mendeley, ProCite (remember that one?). I've looked at Papers. I've dabbled with BibTex using a couple of different tools. I've watched the videos on the new RefWorks 3 coming out. I've given training on RefWorks and EndNote... and I'm frustrated.

Almost all of these have the model that you are one person, with one field of research, who will continue to use the same somewhat limited number of references over the many years.

As a librarian, I like to compile references for people into RefWorks collections and then turn them over. I've done this by setting up new accounts for them. This will not be possible in the new version in which there's only one account per e-mail.  Sharing folders doesn't work because I don't want my dissertation and professional work database to be crammed with thousands of unrelated articles.

I have tried to collaborate with people in Zotero and I can't seem to get rid of a thousand or so articles that were relevant to a project from several years ago.

Further, they only get worse at deduplication. I use a citation manager to compile references and deduplicate them for bibliometrics. RefWorks chokes when your database gets above a few thousand. Plus, it's not configurable. You can't say ignore title, just look at year, authors, publication or something like that. Or only look at title. You can't review duplicates, find that they're part 1 and part 2 or conference presentation and journal paper, and then have them not show up every time. These are really not intended for my uses.
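To make that wish list concrete, here is roughly what I mean by configurable, as a minimal sketch and not how any of these products work. The function name, the record fields, and the sample records are all made up for illustration: you pick which fields to compare, and pairs you have already reviewed and marked as distinct stop coming back.

```python
from collections import defaultdict
from itertools import combinations

def candidate_duplicates(records, fields=("title",), reviewed_distinct=frozenset()):
    """Return pairs of record ids whose chosen fields match, ignoring case and spacing."""
    buckets = defaultdict(list)
    for rec in records:
        key = tuple(" ".join(str(rec.get(f, "")).casefold().split()) for f in fields)
        buckets[key].append(rec["id"])
    pairs = []
    for ids in buckets.values():
        for a, b in combinations(ids, 2):
            # Skip pairs a human has already looked at and kept as separate items.
            if frozenset((a, b)) not in reviewed_distinct:
                pairs.append((a, b))
    return pairs

# Hypothetical records: a two-part paper plus a differently capitalized copy of part 1.
records = [
    {"id": 1, "title": "Widget study, part 1", "authors": "Smith, J.", "year": 2014, "publication": "J. Widgets"},
    {"id": 2, "title": "Widget study, part 2", "authors": "Smith, J.", "year": 2014, "publication": "J. Widgets"},
    {"id": 3, "title": "widget study, Part 1", "authors": "Smith, J.", "year": 2014, "publication": "J. Widgets"},
]

by_title = candidate_duplicates(records, fields=("title",))
by_other = candidate_duplicates(records, fields=("year", "authors", "publication"))
after_review = candidate_duplicates(
    records,
    fields=("year", "authors", "publication"),
    reviewed_distinct={frozenset((1, 2)), frozenset((2, 3))},
)
```

Matching on title alone flags only records 1 and 3. Matching on year, authors, and publication flags all three pairs, and once the part 1 / part 2 pairs are recorded as reviewed, only the real duplicate is left.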

So I'm going to go back to trying client software. First I'll try EndNote X (an ancient copy that has been sitting here in a box still sealed). Then I'll probably go to BibTeX, maybe using SVN, and maybe using code for some of the tricks. Why should I have to?

And that's another reason to get my @##% dissertation done while RefWorks and the Word plugin still work on my home computer.

Slides from Leveraging Data to Lead

(by Christina Pikas) Nov 20 2015

This was a great conference put on by Maryland SLA. I tweeted a bit using the hashtag: #datamdsla

Here are my slides. Not awesome but I did find some nice pictures :)


