lunedì, novembre 03, 2008

Google Now Performs OCR on Any Scanned PDF Documents

I was just reading on the Google blog that Google now performs OCR on any scanned PDF documents.

“In the past, scanned documents were rarely included in search results as we couldn't be sure of their content. We had occasional clues from references to the document-- so you might get a search result with a title but no snippet highlighting your query. Today, that changes. We are now able to perform OCR on any scanned documents that we find stored in Adobe's PDF format. This Optical Character Recognition (OCR) technology lets us convert a picture (of a thousand words) into a thousand words -- words that can be searched and indexed, so that these valuable documents are more easily found. This is a small but important step forward in our mission of making all the world's information accessible and useful.”