Notes from International Symposium on Science of Science 2016 (#ICSS2016) - Day 1

This conference was held at the Library of Congress March 22 and 23, 2016. The conference program is at: http://icss.ist.psu.edu/program.html

I had the hardest time remembering the hashtag so you may want to search for ones with more C or fewer or more S.

This conference was only one track but it was jam-packed and the days were pretty long. On the first day, my notes were by hand and my tweets were by phone (which was having issues). The second day I brought a power strip along and then took notes and tweeted by laptop.

One thing I want to do here is to gather the links to the demo tools and data sets that were mentioned with some short commentary where appropriate. I do wish I could have gotten myself together enough to submit something, but what with the dissertation and all. (and then I'm only a year late on a draft of a paper and then I need to write up a few articles from the dissertation and and and and...)
Maryann Feldman SciSIP Program Director

As you would expect, she talked about funding in general and the program. There are dear colleague letters. She really wants to hear from researchers in writing - send her a one-pager to start a conversation. She funded the meeting.

Katy Börner Indiana University

She talked about her Mapping Exhibit - they're working on the next iteration and are also looking for venues for the current. She is interested in information analysis/visualization literacy (hence her MOOC and all her efforts with SCI2 and all). One thing she's trying now is a weather report format. She showed an example.

She did something with the descriptive models of the global scientific food web. Where are sources and where are sinks of citations?

Something more controversial was her idea of collective allocation of funding. Give each qualified PI a pot of money that they *must* allocate to other projects. So instead of a small body of reviewers, everyone in the field would be a reviewer. If the top PI got more than a certain amount. They would have to re-allocate to other projects.

I'm not sure I got this quote exactly but it was something like:

Upcoming conference at National Academy of Science on Modeling Sci Tech Innovations May 16-18.

They have a data enclave at Indiana with research data they and their affiliates can use. (I guess LaRiviere also has and has inherited a big pile o'data? This has been a thought of mine... getting data in format so I could have it lying around if I wanted to play with it).
Filippo Radicchi Indiana University

He spoke about sleeping beauties in science. These are the articles that receive few citations for many years and then are re-discovered and start anew. This is based on this article. Turns out the phenomenon occurs fairly regularly and across disciplines. In some cases it's a model that then is more useful when computing catches up. In other cases it's when something gets picked up by a different discipline. One case is something used to make graphene. He's skeptical one of the top articles in this category is actually being read by people who cite it because it's only available in print in German from just a few libraries! (However, a librarian in the session *had* gotten a copy for a staff member who could read German).

I would love to take his 22M article data set and try the k-means longitudinal. If sleeping beauty is found often, what are the other typical shapes beyond the standard one?

He also touched on his work with movies - apparently using an oft-overlooked section of IMDB that provides information on references (uses same framing as x, adopt cinematography style of y, remakes z... I don't know, but relationships).

Carl Bergstrom University of Washington

The first part of his talk reviewed Eigenfactor work which should be very familiar to this audience (well except a speaker on the second day had no idea it was a new-ish measure that had since been adopted by JCR - he should update his screenshot - anyhoo)

Then he went on to discuss a number of new projects they're working on. Slides are here.

Where ranking journals has a certain level of controversy, they did continue on to rank authors (ew?), and most recently articles which required some special steps.

Cooler, I think was the next work discussed.  A mapping technique for reducing a busy graph to find patterns. "Good maps simplify and highlight relevant structures." Their method did well when compared to other method and made it possible to compare over years. Nice graphic showing the emergence of neuroscience. They then did a hierarchical version. Also pretty cool. I'd have to see this in more detail, but looks like a better option than the pruning and path methods I've seen to do similar things. So this hierarchical map thing is now being used as a recommendation engine.  See babe'.eigenfactor.org . I'll have to test it out to see.

Then (it was a very full talk) women vs. men. Men self-cite more. Means they have higher h-index.
Jacob Foster UCLA (Sociology)

If the last talk seemed packed. This was like whoa. He talked really, really fast and did not slow down. The content was pretty heavy duty, too. It could be that the remainder of the room basically knew it all so it was all review. I have read all the standard STS stuff, but it was fast.

He defines science as "the social production of collective intelligence."

Rumsfeld unknown unknowns... he's more interested in unknown knowns. (what do you know but do not know you know... you know? 🙂 )

Ecological rationality - rationality of choices depends on context vs rational choice theory which is just based on rules, not context.

Think of scientists as ants. Complex sociotechnical system. Information processing problem, using Marr's Levels.

  • computational level: what does the system do (e.g.: what problems does it solve or overcome) and similarly, why does it do these things
  • algorithmic/representational level: how does the system do what it does, specifically, what representations does it use and what processes does it employ to build and manipulate the representations
  • implementational/physical level: how is the system physically realised (in the case of biological vision, what neural structures and neuronal activities implement the visual system)

https://en.wikipedia.org/wiki/David_Marr_(neuroscientist)#Levels_of_analysis

Apparently less studied in humans is the representational to hardware. ... ? (I have really, really bad handwriting.)

science leverages and tunes basic information processing (?).. cluster attention.

(incidentally totally weird Google Scholar doesn't know about "american sociological review" ? or ASR? ended up browsing)
Foster,J.G., Rzhetsky,A., Evans, J.A. (2015) Tradition and Innovation in Scientists’ Research Strategies. ASR 80, 875-908. doi: 10.1177/0003122415601618

Scientists try various strategies to optimize between tradition (more likely to be accepted) and innovation (bigger pay offs). More innovative papers get more citations but conservative efforts are rewarded with more cumulative citations.

Rzhetsky,A.,Foster,I.T., Foster,J.G.,  Evans, J.A (2015) Choosing experiments to accelerate collective discovery. PNAS 112, 14569–14574. doi: 10.1073/pnas.1509757112

This article looked at chemicals in pubmed. Innovative was new ones. Traditional was in the neighborhood of old ones. They found that scientists spend a lot of time in the neighborhood of established important ones where they could advance science better by looking elsewhere. (hmmm, but... hm.)

The next bit of work I didn't get a citation for - not even enough to search - but they looked at JSTOR and word overlap. Probabilistic distribution of terms. Joint probability. (maybe this article? pdf). It looked at linguistic similarity (maybe?) and then export/import of citations. So ecology kept to itself while social sciences were integrated. I asked about how different social sciences fields use the same word with vastly different meanings - mentioned Fleck. He responded that it was true but often there is productive ambiguity of new field misusing or misinterpreting another field's concept (e.g., capital). I'm probably less convinced about this one, but would need to read further.

Panel 1: Scientific Establishment

  • George Santangelo - NIH portfolio management. Meh.
  • Maryann Feldman - geography and Research Triangle Park
  • Iulia Georgescu, Veronique Kiermer,Valda Vinson - publishers who, apparently, want what might already be available? Who are unwilling (except PLOS) or unable to quid pro quo share data/information in return for things. Who are skeptical (except for PLOS) that anything could be done differently? that's my take. Maybe others in the room found it more useful.

Nitesh Chawla University of Notre Dame

(scanty notes here - not feedback on the talk)

Worked with Arnet Miner data to predict h-indices.

Paper: http://arxiv.org/abs/1412.4754 

It turns out, that according to them, venue is key. So all of the articles that found poor correlation between JIF and an individual paper's likelihood of being cited.. they say actually a pretty good predictor when combined with researcher's authority. Yuck!

Janet Vertesi Princeton University

Perked up when I realized who she is - she's the one who studied the Rover teams! Her book is Seeing Like a Rover.  Her dissertation is also available online, but everyone should probably go buy the book.  She looked at more a meso level of knowledge, really interested in teams. She found that different teams - even teams with overlapping membership - managed knowledge differently. The way instrument time (or really spacecraft maneuvering so you can use your instrument time) was handled was very different. A lot had to do with the move in the '90s for faster...better... cheaper (example MESSENGER). She used co-authoring networks in ADS and did community detection. Co-authorship shows team membership as same casts of characters writing. This field is very different from others as publications are in mind while the instruments are being designed.

She compared Discovery class missions - Mars Exploration Rover - collectivist, integrated; everyone must give a go ahead for decisions; Messenger - design system working groups (oh my handwriting!)

vs. Flagship - Cassini - hierarchical, separated. Divided up sections of spacecraft. Conflict and competition. Used WWII as a metaphor (?!!). No sharing even among subteams before release.  Clusters are related to team/instrument.

New PI working to merge across - this did show in evolution of network to a certain extent.

Galileo is another flagship example. breaks down into separate clusters. not coordinated.

Organization of teams matters.

I admitted my fan girl situation and asked about the engineers. She only worked with scientists because she's a foreign national (may not mean anything to my readers who aren't in this world but others will be nodding their heads).  She is on a team for an upcoming mission so will see more then. She also has a doctoral student who is a citizen who may branch off and study some of these things.
Ying Ding Indiana University

She really ran out of time in the end. I was interested in her presentation but she flew past the meaty parts.

Ignite Talks (15s per slide 2min overall or similar)

  • Filippo Menczer - http://scholarometer.indiana.edu/ - tool to view more information about authors and their networks. Browser extension.
  • Caleb Smith,
  • Orion Penner - many of us were absolutely transfixed that he dropped his note pages on the floor as he finished. It was late in the day!  He has a few articles on predicting future impact (example). On the floor.
  • Charles Ayoubi,
  • Michael Rose,
  • Jevin West,
  • Jeff Alstott - awesome timing, left 15 for a question and 15 for its answer. Audience didn't play along.

Lee Giles Penn State University

It was good to save his talk for last. A lot going on besides keeping CiteSeer up and running. They do make their data and their algorithms freely available (see: http://csxstatic.ist.psu.edu/about ) . This includes extracting references. They also are happy to add in new algorithms that make improvements and work in their system. They accept any kind of document that works in their parsers so typically journal articles and conference papers.

RefSeer - recommends cites you should add

TableSeer - extracts tables (didn't mention and there wasn't time to ask... he talked a lot about this for chemistry... I hope he's working with the British team doing the same?)

Also has things to extract formulas, plots, and equations. Acknowledgements. Recommend collaborators (0 for me, sniff.) See his site for links.

 

 

Tags:

2 responses so far

  • Stian Haklev says:

    Just a quick appreciation for this post! I’m very interested in open access/science and ways of improving scientific communication. I’ve spent some time in more activist circles, but am not very familiar with scientific research on how science gets done (teams etc). Very fascinating, I really appreciate you sharing these brief summaries.

  • Robin Sinn says:

    Thanks! I had considered attending, but didn't have the time.

Leave a Reply