By Harry McCracken | Monday, September 8, 2008 at 3:30 pm
I’m having a good time here at the DEMOfall conference in San Diego, but there’s stuff being announced at the TechCrunch50 back in San Francisco, too–and TechCrunch scored a coup this morning when Google’s Marissa Mayer used the conference to announce that the company is working with newspapers to make millions of pages of old newspapers searchable in their original form.
When I was in college, a few years before the Web came along, I spent lots of time in the library reading old newspapers on microform, and what Google is doing here instantly reminded me of those days. In fact, Google’s newspaper archive looks like the somewhat grainy black-and-white photographs of papers I remember cranking through. Except now, you can do full-text searching of a vast repository of ’em.
Since I wasn’t at TechCrunch, I read Google’s blog post about the news, and found it a tad confusing. It suggested searching for “Americans walk on moon” but didn’t say where to do it, so I just went to Google Web search. The #1 result was for a site run by a Rastafarian conspiracy theorist–which is not only not the result that Google talks about in the blog post, but a rare example of Google’s first result for a significant search being shoddy.
Turns out that you search for the newspapers in Google News Archive. Even knowing that, I’m still a little confused: I searched for “Americans walk on moon” there and got a couple of scanned newspapers and a bunch of for-pay newspaper stories on the first page of results, but not the result from the Pittsburgh Post-Gazette that Google spotlights in its blog post. I’m sure I’ll figure it out.
When I just clicked on the example link in the Google blog post to go directly to the Post-Gazette’s “Americans walk on moon” results, I was impressed: I got the entire paper for July 21st, 1969, in a viewer that let me pan around and zoom in and out:
The Google blog post doesn’t say how many newspapers are in the archive so far, but it looks like it might be relatively few. I tried searches for bill clinton impeached, fdr dies, red sox win world series, enron scandal, space shuttle explodes, bill gates retires, and america invades iraq, and didn’t get any of the scanned newspapers in the results on the first page for any of them. (Google says that it identifies the scanned papers with a “Google News Archive” label–kinda confusing given that you do the searching within something called Google News Archive, which includes results of multiple types.)
(Actually, the Google News Archive results in general aren’t uncanny in the way you expect Google results to be uncanny–for instance, the first result for “america invades iraq” is from an issue of The Washington Monthly published months before America did invade.)
Google says that as it adds more papers to the archive, you’ll see links to them show up in ordinary Google Web search results; that’s pretty exciting, since in many cases original newspaper results are as relevant or more so than anything on the Web. So the prospect remains exciting even though today’s news doesn’t seem to involve an immediate transformation of Google News Archive, let alone the rest of Google.
Oh, and when you do find scanned newspapers in your results, you can do what I used to spend a lot of time doing back in college–read old comics:
September 9th, 2008 at 8:23 pm
How far are they going back? http://www.nambour-chronicle.com archives the Nambour Chronicle & North Coast Advertiser from 1903.
September 24th, 2008 at 4:28 pm
The Nambour Chronicle & North Coast Advertiser was first published 31st July 1903 and continued as the local newspaper for the Sunshine Coast (the region north of Brisbane, Australia) until 1983. The entire full-text run of the newspaper from 1903 to 1955 has been scanned from microfilm and made available in digital format. Researchers are now able to search the entire paper by keyword or issue date and can download and print directly from the paper. This provides easy and convenient access to this valuable historical newspaper. Not on the scale of Google’s efforts but not bad for a local Council.
January 10th, 2011 at 1:12 am
Took me awhile to see all the comments, but i truly enjoyed the article. It proved to be very useful to me and i know that to all the commenters right! It’s always nice unsuitable for your needs not only be prepared, but also engaged! I’m sure you needed fun writing this write-up.
March 16th, 2011 at 3:47 am
Hello Way cool info, will have to try it. Just wanted to let you know, the link is broke. Can you fix it please? Thanks again for taking the time to put this up. I certainly enjoyed every part of it.