Notes from Dan Russell's Advanced Skills for Investigative Searching

This class was held 4/15/2016 at University of Maryland at the Journalism School, hosted by the Future of Information Alliance. Some information is here. Slides are here. Updated Tip Sheet is here.

I've previously taken his MOOC and enjoyed tips on his blog, but things change so quickly that it was good to get an update.

Of course I didn't bring my laptop so... these are from handwritten notes.

  • Capitalization doesn't matter except for OR, where it's crucial. Don't use AND; it doesn't do anything.
  • Diacriticals do matter. e and é are basically interchangeable, but a and å are not. (It does offend native speakers in countries that use these.)
  • If you need to search for emoji you'll have to use Baidu. This is relevant when searching for businesses in Japan, for example.
  • filetype:   works for any extension. If you're looking for datasets you may use filetype:csv . Regular Google searches don't search docs; you'll need to search them separately.
  • site:  results differ depending on whether you use nyc.gov, www.nyc.gov, or .nyc.gov . To be most general, use site:.nyc.gov ; the . after the : acts like a * if there are subdomains.
  • There is no NOT. Instead use -<term>.  No space between the minus and the term.
  • Synonyms are automatic. Use quotes around a single term to search it verbatim (also turns off spell check for that term). If quotes are around a phrase, it does not do a verbatim search.
  • There are no stop words
  • inurl:   ... this is useful if pages have a certain format like profile pages on Google Plus
  • If you want to get an advanced search screen, click on the gear in the upper right hand corner to select it. That's the only way to get limiting by region (region limiting isn't always by domain), number search, and language search. Some advanced search features are also available through the dropdown boxes that appear after searching, or through operators like inurl: and filetype:
  • related:<url> gets you sites with term overlap (not linking/linked similarity).
  • Google custom search engine  - lets you basically OR a bunch of site: searches to always search across them.
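
The operators above can be combined mechanically. Here is a minimal sketch in Python of how I might assemble a query string from them. The function names (build_query, multi_site, search_url) are my own, not anything from the class, and the sketch assumes the usual https://www.google.com/search?q= URL form. The multi_site helper only imitates what a custom search engine does by OR-ing site: searches; it is not the custom search product itself.

```python
from urllib.parse import quote_plus


def build_query(terms=(), any_of=(), exclude=(), verbatim=(), site=None, filetype=None):
    """Assemble a query string from the operators in the notes (hypothetical helper)."""
    parts = list(terms)
    # Quotes around a single term ask for a verbatim match.
    parts += [f'"{t}"' for t in verbatim]
    # OR must be capitalized; AND is omitted because it does nothing.
    if any_of:
        parts.append(" OR ".join(any_of))
    # There is no NOT: use a minus sign with no space before the term.
    parts += [f"-{t}" for t in exclude]
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


def multi_site(sites):
    """OR several site: restrictions together, in the spirit of a custom search engine."""
    return " OR ".join(f"site:{s}" for s in sites)


def search_url(query):
    """URL-encode a query for the q= parameter (assumed URL form)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    q = build_query(terms=["restaurant", "inspections"], exclude=["jobs"],
                    site=".nyc.gov", filetype="csv")
    print(q)
    print(search_url(q))
    print(multi_site(["nyc.gov", "ny.gov"]))
```

The leading . in site:.nyc.gov is kept exactly as typed, since that is what makes the search cover subdomains.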

Image Search

  • Tabs across the top of results for topic clusters found
  • Search by image - click on camera and then point to or upload image. Can drag an image in or control click on an image. After search can then add in terms to narrow to domain.
  • Example - find a tool in the basement, take a picture on a white background with it in a normal orientation, then search to find it in catalogs, etc.
  • Crop images to the salient bit.
  • On mobile devices the standard search is actually a Google appliance search, which is not as powerful. Open Chrome and search from there if you need more.

Other notes

  • Things are changing all the time because of the adversarial relationship with search engine optimization (SEO) people.
  • link:   was removed this week.
  • Result counts are an estimate. When you narrow a search you sometimes get more results, because it starts by searching only the first tier of resources. The first tier has millions of results in it, the ones that have been assessed as highest quality. If it doesn't find enough in the first tier, as when you narrow a lot, it will drop down to the second tier, which has billions more results.
  • Consider using alerts.
  • To find any of these services, just Google for them.
  • Google Trends is interesting. You can narrow by time or region. Also look for suggestions when searching. You can search for an entity or for a search term. Remember that trends are worldwide.
  • Google Correlate - example: Spanish tourism authorities want to know what UK tourists are looking for. Find the search for Spain and tourism, and see which keywords used by UK searchers correlate.
  • Country versions are more than just languages. Consider using a different country version to get a different point of view.
  • Wikipedia country versions are useful for national heroes and also controversial subjects (example: Armenian genocide)
  • define   (apparently no : needed)
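
The point about result counts going up when you narrow a search is easier to see with a toy model. The sketch below is only my illustration of the tier idea as described in class, not how Google actually works. The documents, the min_hits threshold, and the function name tiered_search are all made up for the example.

```python
def tiered_search(terms, tiers, min_hits=3):
    """Search the tiers in order and stop once enough hits have been found (toy model)."""
    hits = []
    for tier in tiers:
        # A document matches only if it contains every search term.
        hits += [doc for doc in tier if all(t in doc.split() for t in terms)]
        if len(hits) >= min_hits:
            break  # enough results, so lower tiers are never consulted
    return hits


# Made-up data: a small "high quality" first tier and a larger second tier.
tier1 = ["bridge history", "bridge design", "bridge repair", "tunnel design"]
tier2 = ["bridge repair costs", "bridge repair schedule", "bridge repair contractors",
         "bridge repair budget", "bridge photos", "bridge tolls"]

broad = tiered_search(["bridge"], [tier1, tier2])
narrow = tiered_search(["bridge", "repair"], [tier1, tier2])

print(len(broad))   # 3: the first tier alone had enough matches
print(len(narrow))  # 5: too few in the first tier, so the second tier was searched too
```

In this toy setup the narrower query reports more results only because it ran out of matches in the first tier and spilled over into the second.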

I think all librarians should probably take his class. Good stuff.
