
Use FileMerge to diff PDF files
As a university teacher, I often get to read new versions of student essays, and I want to be able to quickly see what has changed from the old version. Here's how to use the FileMerge application (included in the Xcode developer tools) to compare PDF files. You'll also need the command line utility pdftotext, which you can get from MacPorts.

First you need to install the pdftotext command line utility. It's included in both the Poppler and the Xpdf rendering libraries -- either one will do. With MacPorts installed, just open a Terminal window and execute one of these commands:
$ sudo port install poppler
$ sudo port install xpdf
Launch FileMerge (in Developer » Applications » Utilities), and open its Preferences. On the Filters tab, create a new filter by clicking in an empty row. Set the Extension to pdf, and set the Filter to:
pdftotext -layout -nopgbrk -enc Latin1 $(FILE) -
Now you can diff PDF files. You can also start FileMerge from the command line with opendiff:
$ opendiff oldfile.pdf newfile.pdf
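If you only need a quick look in the Terminal, the same extraction step can feed plain diff instead of FileMerge. This is a minimal sketch, assuming pdftotext is on your PATH; the function name pdf_text_diff and the PDFTOTEXT variable are made up for this example and are not part of any tool:

```shell
# Extractor command. PDFTOTEXT is a hypothetical override for this sketch,
# e.g. for a differently named binary.
: "${PDFTOTEXT:=pdftotext}"

# Convert two PDFs to text and print a unified diff of the results.
# Exit status follows diff: 0 = same text, 1 = text differs, 2 = trouble.
pdf_text_diff() {
    old_txt=$(mktemp "${TMPDIR:-/tmp}/pdfdiff.XXXXXX") || return 2
    new_txt=$(mktemp "${TMPDIR:-/tmp}/pdfdiff.XXXXXX") || return 2
    status=0
    "$PDFTOTEXT" -layout -nopgbrk -enc UTF-8 "$1" "$old_txt" &&
        "$PDFTOTEXT" -layout -nopgbrk -enc UTF-8 "$2" "$new_txt" &&
        diff -u "$old_txt" "$new_txt" || status=$?
    # Clean up the temporary text files before returning.
    rm -f "$old_txt" "$new_txt"
    return "$status"
}
```

After pasting the function into your shell, you would call it like this:
$ pdf_text_diff oldfile.pdf newfile.pdf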
There is a problem with the simple FileMerge filter above, however: FileMerge wants the text files to be in Mac-Roman encoding (why, Apple, why?), but pdftotext cannot export to Mac-Roman.

The solution to this problem is to use the built-in command iconv for the conversion. However, this requires a shell pipe, which FileMerge cannot handle in the Filter field, so the pipeline has to go into a script. Create a file (e.g., convert_pdf_to_macroman_text.sh) with the following contents:
#!/bin/sh
pdftotext -layout -nopgbrk -enc UTF-8 "$1" - | iconv -c -f UTF-8 -t MacRoman
Put it somewhere (e.g., in a directory ~/bin) and make it executable. In the Terminal, you do that like this:
$ chmod a+x ~/bin/convert_pdf_to_macroman_text.sh
Now you can set the Filter for PDF files in FileMerge to:
~/bin/convert_pdf_to_macroman_text.sh $(FILE)
Now you can view non-ASCII symbols too (such as Swedish ÅÄÖ).
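If the script works in the Terminal but FileMerge shows nothing and logs "pdftotext: command not found", one possible cause is that applications started from the Finder do not read your shell profile, so the MacPorts directory may be missing from PATH. The following is a sketch of a more defensive version of the script, not a tested fix. It assumes the default MacPorts prefix /opt/local, and it also tries the name xpdf-pdftotext, which a commenter reports for the MacPorts xpdf port. The function name find_pdftotext is made up for this example:

```shell
#!/bin/sh
# Add the default MacPorts bin directory (an assumption: adjust it if your
# MacPorts prefix differs) so the tools are found when launched from a GUI app.
PATH="$PATH:/opt/local/bin"
export PATH

# Print the name of the first text extractor found on PATH.
# Returns non-zero and prints a message to stderr if none is found.
find_pdftotext() {
    for candidate in pdftotext xpdf-pdftotext; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no pdftotext binary found on PATH" >&2
    return 1
}

# Run the conversion only when a file argument is given.
if [ "$#" -gt 0 ]; then
    tool=$(find_pdftotext) || exit 1
    "$tool" -layout -nopgbrk -enc UTF-8 "$1" - | iconv -c -f UTF-8 -t MacRoman
fi
```

Save it under the same file name as before and the Filter setting in FileMerge can stay unchanged.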

Use FileMerge to diff PDF files | 12 comments
Use FileMerge to diff PDF files
Authored by: Geobunny on Dec 01, '09 08:05:22AM
Sledgehammers and walnuts spring to mind....but hey, that's what we techy types love!

Two things I've learned from this hint:
1) My beloved FileMerge can diff PDFs.
2) There's a command line tool to initiate it, which is superb!


Binaries of xpdf
Authored by: mael on Dec 01, '09 09:07:15AM

The Binaries for xpdf can be found here: http://users.phg-online.de/tk/MOSXS/
(At the bottom).



MacRoman to different encoding, possible?
Authored by: vniks on Dec 01, '09 09:21:39AM

I am trying to run PDF Resizer and cannot resize a file because it has MacRoman encoding. Is there any way to bypass this or convert it to another encoding from the command line?

Thanks

VNiks

Edited on Dec 01, '09 09:22:16AM by vniks



Use FileMerge to diff PDF files
Authored by: unforeseen:X11 on Dec 01, '09 11:01:45AM
As mentioned in the other hint, FileMerge does not require MacRoman; it can handle a broad variety of file encodings. That fails, however, if the file's encoding attribute is not set correctly, which appears to be the problem here.
Here's how you can change the file encoding attribute to be correct:
xattr -w com.apple.TextEncoding 'UTF-8;134217984' filename
(Boldly taken from here.)
---
this is not the sig you`re looking for.


Use FileMerge to diff PDF files
Authored by: jayhawksean on Dec 02, '09 01:08:53PM

This is a nice hint for comparing PDFs, but why would a university instructor require students to turn in PDFs? Why not just use RTF? Open with Microsoft Word and compare documents (my university and most others will give you Office, no?). I like to use TextMate and a misc bundle: http://manual.macromates.com/en/bundles (with the diff command in the Terminal).



Use FileMerge to diff PDF files
Authored by: hut66au on Jun 01, '10 04:24:58AM

I'm not the original submitter, but as a University lecturer I try to encourage my students not to rely on Office, especially for technical documents. Writing a long, consistent, well formatted technical report or thesis, with equations, figures, table of contents and bibliography is a very difficult exercise with MS-Word in my experience (although I have done it). In contrast it is much easier with tools like LaTeX. Nowadays LaTeX's output format is PDF.

As a sidenote, MS-Word is a reasonable tool for collaborative documents.

RTF is of course not at all suited to such documents.



Use FileMerge to diff PDF files
Authored by: TonyT on Dec 03, '09 05:54:08PM

PDF to TXT - There's an app for that -- well, actually an automator action.

Open Automator and select Application. Look for PDF in the left-hand pane, then drag Extract PDF Text to the application window and save it as an application. Now drop a PDF file on top of the app icon.



Use FileMerge to diff PDF files
Authored by: hut66au on Jun 01, '10 04:14:19AM
Not wanting to sound like an old fogey or anything, but FileMerge cannot hold a candle to Emacs' Ediff compare tool. Start any version of Emacs (Aquamacs is great on Mac OS X), and select Tools->Ediff->Compare with up to 3 buffers or files. Ediff highlights word-for-word differences rather than whole paragraphs, and is much, much better than FileMerge at ignoring whitespace differences and finding the places in the 2 or 3 buffers that do match up. Type 'n' or 'p' to move forward and back between differences, and 'a' or 'b' (or 'c') to select the correct version in the output. It is less pretty than FileMerge but much more effective.

Use FileMerge to diff PDF files - also for PDF?
Authored by: elwood151 on Jan 30, '11 05:56:01AM

Hi,

does ediff in Aquamacs also work with PDF files?

I tried to load two PDF files with the compare (ediff) command, but I don't see how to display the differences, and I get an error message saying:


Binary files /var/folders/FX/FXr0KhAz2RWkK++BYv0W1++++TU/-Tmp-/Filename1.pdf and /var/folders/FX/FXr0KhAz2RWkK++BYv0W1++++TU/-Tmp-/Filename2.pdf differ

I never used Emacs or Aquamacs before, so I don't know if I'm doing something wrong?!

used version:
This is GNU Emacs 22.1.50.1 (i386-apple-darwin8.9.1, Carbon Version 1.6.0)
of 2007-06-06 on plume.sr.unh.edu - Aquamacs Distribution 1.0a
On MacOS 10.6.5



Use FileMerge to diff PDF files
Authored by: elwood151 on Jan 30, '11 07:15:18AM

Thanks for this great hint!

However, unfortunately I can't make it work on my computer:

I did as you suggested, and the shell script works when executed in the Terminal, so it is able to convert a PDF to a text file.

However, if I try the opendiff command or try to open 2 pdf files in FileMerge, nothing happens.

Console output:
30.01.11 16:09:01 login[84242] DEAD_PROCESS: 84242 ttys000
30.01.11 16:09:47 [0x0-0x2cc2cc].com.apple.FileMerge[84367] ~/bin/convert_pdf_to_macroman_text.sh: line 2: pdftotext: command not found
30.01.11 16:09:47 [0x0-0x2cc2cc].com.apple.FileMerge[84367] ~/bin/convert_pdf_to_macroman_text.sh: line 2: pdftotext: command not found
30.01.11 16:09:47 FileMerge[84367] *** Assertion failure in -[DiffItem initWithDiffDescriptor:], /SourceCache/FileMerge/FileMerge-1633/DiffItem.m:65
30.01.11 16:09:47 FileMerge[84367] unexpected file diff result at line 0

xpdf and poppler are up to date (acc. to macports, at least)
I'm using Mac OS 10.6.5.

Can anybody help?

Kind regards

Martin



Use FileMerge to diff PDF files
Authored by: svec on Apr 08, '11 01:25:01PM

The "pdftotext" binary has been renamed to "xpdf-pdftotext" in the MacPorts xpdf install.

So try using "xpdf-pdftotext" instead of "pdftotext" in that shell script; that should work.



Use FileMerge to diff PDF files
Authored by: davidswelt on Apr 19, '11 09:04:25AM

Installing xpdf (or poppler) via MacPorts failed for me (dependencies from hell).

Instead, I simply installed a (universal) binary build of pdftotext that I found online - it was a few years old but did the job just fine, without having to install a crazy amount of additional packages.

---
http://www.reitter-it-media.de // personal: http://www.davids-welt.de


