Friday, February 09, 2001

The Scout Report - February 9, 2001
Google Now Indexes PDF Files
http://www.google.com/
The indomitable Google has recently begun indexing content in .pdf files, allowing searchers a significant peek into
the "invisible Web," the large area of online content not covered by most search engines. PDF files are differentiated
by a [PDF] label, and instead of a cached version, Google provides a link to a plain text version of the document.
Keeping a plain text version allows Google to apply its PageRank technology and integrate .pdf content with normal
search returns. Test searches did not turn up many .pdf files, but adding "pdf" to the query produced a
larger proportion of them in the returns, although they were not always the majority. [MD]
http://scout.cs.wisc.edu/report/sr/2001/scout-010209.html

