
10.4: Use Preview to set keywords in PDFs System 10.4
The updated Preview that comes with Tiger now offers a few extra features when used with PDF files. Besides the ability to do simple image editing and annotations, you are now able to add your own keywords to the internal properties header of a PDF file. Why is this useful?

Because not all scanners or scanning solutions do OCR. In my case, I have an extremely fast page scanner that will take a dozen or so pages and spit out a PDF in about a minute. However, the scans only contain a graphic image of the pages, not the actual text content, so what's on the pages is invisible to Spotlight. This isn't as bad as it sounds, particularly if you're scanning your documents primarily for archival reasons (i.e. getting rid of mountains of paper clutter) ... or at least that's what I told myself until Spotlight was released. Now that Tiger has the ability to search for material based on content, I've begun looking for third-party apps to take my previously scanned PDFs and add text content by OCR.

While I haven't decided upon a final solution yet, there is another option I'm considering that others may find useful: Instead of using OCR, manually add your own keywords to the internal properties stored along with every PDF file.

Prior to Tiger, one of the easier ways to do this was the shareware app PDFPen, which allows you to fill in the author, subject, and keyword fields in a PDF document. PDFPen is a great app that I continue to use, particularly for its ability to rearrange, insert, and delete individual pages in a single PDF file. But as it turns out, one of the stated improvements in Tiger's Preview is this same capability: open the PDF, choose Tools -> Get Info, and switch to the Keywords tab. Whatever keywords you add to a PDF there will then be visible to Spotlight.

Granted, it's not as easy or as automatic as a full OCR of a document, but at least it does allow you to specify and target your PDF's internal metadata a bit more accurately than just relying on the filename, etc.
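
Once a few PDFs have keywords, you can also search for them from a script instead of the Spotlight menu. The following is only a rough sketch, not something from the hint itself: it assumes Spotlight exposes a PDF's keywords under the kMDItemKeywords metadata attribute and that the mdfind command-line tool is available (it ships with Tiger). The function names keyword_query and find_pdfs are made up for illustration.

```python
import subprocess


def keyword_query(keyword: str) -> str:
    """Build a Spotlight query for PDFs whose keywords contain `keyword`.

    Assumption: PDF keywords are indexed as kMDItemKeywords.
    """
    # Escape backslashes and double quotes so the value stays inside its quotes.
    escaped = keyword.replace("\\", "\\\\").replace('"', '\\"')
    # The asterisks match any surrounding text; the trailing "c" asks for a
    # case-insensitive comparison.
    return (
        'kMDItemContentType == "com.adobe.pdf" && '
        f'kMDItemKeywords == "*{escaped}*"c'
    )


def find_pdfs(keyword: str) -> list[str]:
    """Run mdfind (Mac OS X only) and return the paths of matching PDFs."""
    result = subprocess.run(
        ["mdfind", keyword_query(keyword)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # Print the query only; call find_pdfs("invoice") on a Mac to run the search.
    print(keyword_query("invoice"))
```

Building the query is kept separate from running it so you can paste the printed string straight into mdfind in Terminal to check what it matches before scripting anything around it.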

10.4: Use Preview to set keywords in PDFs
Authored by: wangman25 on Jun 06, '05 10:40:03PM

Check out Readiris 9.0, it works well and accepts most file formats.

Authored by: victory on Jun 08, '05 09:07:42AM

Thanks for the suggestion.

By coincidence, I actually do use Readiris Pro 9.0 to OCR documents, but I have a number of gripes with the app. In particular, like pre-6.0 versions of Adobe Acrobat, it seems to have a built-in limit of 50 pages per document that it will OCR (granted, the 'corporate' version of Readiris may not have this limitation). There also seems to be a weird glitch that at times causes it to recognize only the first few pages of a multi-page PDF.

On the other hand, Readiris does OCR documents a lot faster than Acrobat's built-in Paper Capture feature (and it's a lot cheaper).

About the only other major OCR package available for OSX that I haven't looked at yet is OmniPage (I've even tried a few open-source offerings without much success). The reviews of the OSX version that I found via Google aren't encouraging, which is sad since this app was an early pioneer in PC-based OCR.

Authored by: victory on Jun 08, '05 05:19:01PM

Since a few others have asked what scanner I'm using (mentioned in the hint), it's a Fujitsu ScanSnap FI-5110EOX. For more info, see:

NOTE: While this is a *great* scanner and I have nothing but praise for it, note that it is not 'officially' supported for OSX by Fujitsu at this time (if you're in a situation where that matters). [I am not connected in any way with any of the aforementioned products, blahblahblah.]
