
Bulk convert text files to PDF System
Here is a quick AppleScript droplet to bulk convert text files (plain text, rich text, HTML, and some code files such as C and JavaScript sources) to PDF format.

Copy the following script into the AppleScript Editor (/Applications/Utilities/AppleScript Editor) and save it as an application.

Then drag-and-drop files onto it to convert them to PDFs.
on open theFiles
  set oldTID to AppleScript's text item delimiters
  repeat with thisFile in theFiles
    -- get file path as posix path
    set inputFilePath to POSIX path of thisFile
    
    -- create output path - same name with .pdf extension
    set AppleScript's text item delimiters to "."
    set outputFilePathBits to text items of inputFilePath
    set last text item of outputFilePathBits to "pdf"
    set outputFilePath to outputFilePathBits as text
    
    -- create convert command and send to shell
    set AppleScript's text item delimiters to " "
    set cmdList to {"/System/Library/Printers/Libraries/convert", "-f", quoted form of inputFilePath, "-o", quoted form of outputFilePath}
    do shell script (cmdList as text)
  end repeat
  set AppleScript's text item delimiters to oldTID
end open
The new PDF files will be put in the same folder as the originals.

The convert utility is not well documented (enter /System/Library/Printers/Libraries/convert without options in Terminal to see the brief help menu), but it's really just a front-end for the cupsfilter utility, which has better documentation. If you're a CUPS expert you might be able to do more with it than this, but for simple conversions this does nicely.
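To try the conversion from Terminal before building the droplet, here is a minimal sketch that calls cupsfilter directly, on the assumption (stated above) that convert is only a front-end for it. The function names pdf_name and to_pdf_with_cupsfilter are made up for this example. cupsfilter writes the converted document to standard output, and -m selects the destination MIME type. This has not been checked against every file type.

```shell
#!/bin/bash
# Sketch only: convert each file given on the command line to PDF with cupsfilter.

# Hypothetical helper: build the output name by swapping the last extension for .pdf.
# A simple rule: it misbehaves when only a folder name in the path contains a dot.
pdf_name() {
  printf '%s.pdf\n' "${1%.*}"
}

# Hypothetical helper: run cupsfilter on one file, writing the PDF next to the original.
to_pdf_with_cupsfilter() {
  local in="$1"
  local out
  out="$(pdf_name "$in")"
  # cupsfilter sends the converted document to stdout; -m picks the destination type.
  cupsfilter -m application/pdf "$in" > "$out"
}

for f in "$@"; do
  to_pdf_with_cupsfilter "$f" || echo "could not convert: $f" >&2
done
```

Saved as, say, txt2pdf.sh (a hypothetical name), it would be run as: bash txt2pdf.sh notes.txt page.html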

[crarko adds: I tested this, and it works as described.]

Bulk convert text files to PDF
Authored by: TXNole on Jun 06, '11 09:11:28AM

Will this work on Word documents? I have been looking for a way that I could automate the conversion of Word documents into PDFs with the "save as PDF" print screen.



Bulk convert text files to PDF
Authored by: tedw on Jun 06, '11 10:31:18AM
No. Well, possibly if the Word documents are in rtf or text format, but not .doc files. There have been several hints about converting Word documents to PDF (search the site), but the general problem seems to be that the Word format is proprietary, so Apple didn't include a conversion algorithm in CUPS (which is free under GNU license). The file types natively supported are listed here.

It doesn't seem impossible that someone could develop such support as a third party - the file at /usr/share/cups/mime/mime.convs seems to imply that local conversion configurations can be created, at any rate - but they'd still have to work out the proprietary software issue.


Bulk convert text files to PDF
Authored by: musselrock on Jun 06, '11 12:50:53PM

I forgot about textutil which is application independent. Search this forum for "textutil" looking under OS X 10.4. Lots of good hints.



Bulk convert text files to PDF
Authored by: tedw on Jun 06, '11 01:56:52PM
well, I'd forgotten about textutil as well. You could convert doc and docx files to pdf indirectly by first converting them to rtf format, with the following modifications:
on open theFiles
	set oldTID to AppleScript's text item delimiters
	repeat with thisFile in theFiles
		-- get file path as posix path
		set inputFilePath to POSIX path of thisFile
		
		set AppleScript's text item delimiters to "."
		-- check for doc & docx, convert them to rtf, reset inputFilePath to point to converted file
		if last text item of inputFilePath is in {"doc", "docx"} then
			do shell script "textutil -convert rtf " & quoted form of inputFilePath
			set inputFilePath to ((text items 1 thru -2 of inputFilePath) & "rtf") as text
		end if
		
		-- create output path - same name with .pdf extension
		set outputFilePathBits to text items of inputFilePath
		set last text item of outputFilePathBits to "pdf"
		set outputFilePath to outputFilePathBits as text
		
		-- create convert command and send to shell
		set AppleScript's text item delimiters to " "
		set cmdList to {"/System/Library/Printers/Libraries/convert", "-f", quoted form of inputFilePath, "-o", quoted form of outputFilePath}
		do shell script (cmdList as text)
	end repeat
	set AppleScript's text item delimiters to oldTID
end open
That might lose or mess up some formatting - not everything in a doc file can be represented properly in rtf - but it might do for quick and dirty conversions. Probably it should clean up the intermediary rtf files, too, but...
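For anyone who wants the clean-up step, here is a rough shell sketch of the same doc-to-rtf-to-pdf route that puts the intermediary RTF in a temporary folder and deletes it afterwards. The helper name is_word_file and the temporary file name are invented for this example; textutil's -convert and -output options and the convert path are the ones used in this thread. Treat it as a starting point, not a tested tool.

```shell
#!/bin/bash
# Sketch only: convert .doc/.docx via a temporary RTF file, then delete that file.

CONVERT=/System/Library/Printers/Libraries/convert

# Hypothetical helper: succeed when the file name ends in .doc or .docx (any letter case).
is_word_file() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    *.doc|*.docx) return 0 ;;
    *) return 1 ;;
  esac
}

for f in "$@"; do
  out="${f%.*}.pdf"
  if is_word_file "$f"; then
    tmpdir="$(mktemp -d "${TMPDIR:-/tmp}/txt2pdf.XXXXXX")" || exit 1
    rtf="$tmpdir/intermediate.rtf"
    # textutil writes the RTF to the path given with -output
    if textutil -convert rtf -output "$rtf" "$f"; then
      "$CONVERT" -f "$rtf" -o "$out"
    fi
    rm -rf "$tmpdir"   # remove the intermediary RTF whether or not the conversion worked
  else
    "$CONVERT" -f "$f" -o "$out"
  fi
done
```

Keeping the RTF in its own temporary folder means an existing .rtf file with the same name next to the Word document is never overwritten.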

Bulk convert text files to PDF
Authored by: musselrock on Jun 06, '11 12:07:42PM

For Word documents, either of these scripts, saved as an application, will batch convert Word files to PDF. The first uses Pages, which I prefer because it preserves the hyperlinks in Word docs. The second script does it with Word (2008), but hyperlinks are lost.

First script:
(* Drag and drop batch conversion of Word docs or any document that can be opened by the Pages application to PDF files. *)

on open filist
	tell application "Finder"
		set defLoc to container of (item 1 of filist) as alias
	end tell
	set destn to choose folder with prompt "You are about to copy files to PDFs using Pages." & return & "Select a location to save the converted files." default location defLoc
	repeat with lvar in filist
		tell application "Finder"
			set thename to name of lvar
			set filex to name extension of lvar
			set l to length of filex
		end tell
		set nuname to text 1 thru text item -(l + 1) of thename
		tell application "Pages"
			launch
			set filname to (destn as string) & nuname & "pdf"
			open lvar
			save document 1 in filname
			close every window saving no
		end tell
	end repeat
end open

Second script:
(*Batch convert Word files to PDF using Word 2008*)
on open fillist
	set bs to "/"
	set d to "-"
	tell application "Finder"
		set defLoc to container of (item 1 of fillist) as alias
	end tell
	set destn to choose folder with prompt "You are converting Word files by copying them to PDF files." & return & "Select a location to save converted files." default location defLoc
	repeat with lvar in fillist
		tell application "Finder"
			set thename to name of lvar
			set filex to name extension of lvar
			set l to length of filex
		end tell
		set nuname to text 1 thru text item -(l + 1) of thename
		if "/" is in nuname then
			set oldDelims to AppleScript's text item delimiters
			set AppleScript's text item delimiters to bs
			set txtLst to every text item of nuname as list
			set AppleScript's text item delimiters to d
			set nuname to txtLst as string
			set AppleScript's text item delimiters to oldDelims
		end if
		set filname to (destn as string) & nuname & "pdf"
		tell application "Microsoft Word"
			launch
			open lvar
			save as active document file name filname file format format PDF
			close window 1 saving no
		end tell
	end repeat
end open



Bulk convert text files to PDF
Authored by: musselrock on Jun 06, '11 11:55:23AM

Very good hint, and it is application independent. It even preserves hyperlinks in files (something that Office 2008 won't do when converting to pdf).



Bulk convert text files to PDF
Authored by: wallybear on Jun 06, '11 02:22:22PM
why bother with AppleScript text delimiters for the "do shell script" command? You could easily replace the lines:
    set AppleScript's text item delimiters to " "
    set cmdList to {"/System/Library/Printers/Libraries/convert", "-f", quoted form of inputFilePath, "-o", quoted form of outputFilePath}
    do shell script (cmdList as text)
with this single line:
    do shell script "/System/Library/Printers/Libraries/convert -f " & quoted form of inputFilePath & " -o " & quoted form of outputFilePath


Bulk convert text files to PDF
Authored by: tedw on Jun 06, '11 05:16:48PM

no real reason - mostly because I was using text items already. :-)



Bulk convert text files to PDF
Authored by: tingo on Jun 07, '11 06:57:53AM

Doesn't seem to work at all for old text files from the days when Macs didn't use extensions (SimpleText, etc.). I get this message: "convert: Unable to determine MIME type of '[path/file)]'"



Bulk convert text files to PDF
Authored by: tedw on Jun 07, '11 07:10:22PM
If you have files that the system doesn't recognize (no extension, and no Finder data), you can modify the script to assign the correct MIME type. For SimpleText files, for instance, you'd want to add the option -i text/plain to the convert command; that will force the utility to assume that the input files are plain-text files. You could also solve the problem by writing a separate script to run through the files and add .txt extensions, but if you just want to convert them to pdf the -i option would be quicker.
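To show where the option goes, here is a small shell sketch (usable, for example, in an Automator "Run Shell Script" action that passes input as arguments). It adds -i text/plain only for files whose names have no extension. The helper name has_extension is invented for this example, and the sketch assumes every extensionless file really is plain text, which only you can know.

```shell
#!/bin/bash
# Sketch only: force the plain-text MIME type for files whose names have no extension.

CONVERT=/System/Library/Printers/Libraries/convert

# Hypothetical helper: succeed when the file name (not the folder path) contains
# a dot that is not its first character.
has_extension() {
  local name="${1##*/}"     # strip the folder part
  case "$name" in
    ?*.*) return 0 ;;
    *) return 1 ;;
  esac
}

for f in "$@"; do
  if has_extension "$f"; then
    "$CONVERT" -f "$f" -o "${f%.*}.pdf"
  else
    # -i text/plain tells convert to treat the input as plain text
    "$CONVERT" -f "$f" -o "$f.pdf" -i text/plain
  fi
done
```

Files that already have an extension go through the normal type detection, so the forced type cannot mislabel an rtf or html file.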

Bulk convert text files to PDF
Authored by: tingo on Jun 08, '11 12:02:11AM

Thank you, very much appreciated. Unfortunately, I have no understanding of programming scripts, so I'm a bit stuck there.

Later, CET 09:25 · I only saw the later posts after reading your reply and responding to it. The last script, including your amendment, takes care of some of my text files, but not all. But I wouldn't know where to add your string.

Word files: those with embedded tables or graphics give an error message, and a pdf file Adobe Reader won't open. But OK, I guess it's maybe asking too much, and as converting Word files is not really something I need...

Edited on Jun 08, '11 12:26:27AM by tingo



Bulk convert text files to PDF
Authored by: lsequeir on Jun 07, '11 07:46:08AM

Thank you! This is a great hint for anyone who needs to create pdf versions of many files.

Perhaps even handier in this situation (at least in Snow Leopard) may be to have this set as an Automator service.
This has two main advantages that I can think of: a) it is always readily available in the contextual menu, no need to locate an application to drag your file to; and b), the code for the service is actually shorter and simpler - for instance, there is no need to deal with text item delimiters at all.

Just create an Automator service, set to accept files and folders from the Finder; add just one action: "Run Shell Script" (it is under "Utilities"), and set "pass input" to "as arguments". The code will be almost all written for you already. Just replace the placeholder "echo ..." line with the appropriate line.

If you prefer bash as your shell, you can use the following as the shell script (everything except the convert line will have been put there automatically):

for f in "$@"
do
	/System/Library/Printers/Libraries/convert -f "$f" -o "${f%.*}.pdf"
done


Or, if you prefer tcsh, you can use the following as the shell script (again, all but the second line will be there already):

while ( $# )
/System/Library/Printers/Libraries/convert -f "$1" -o "${1:r}.pdf"
shift
end

---
Luís



Bulk convert text files to PDF
Authored by: wallybear on Jun 07, '11 09:19:58AM
Here is a more robust version of the script. It adds:
  • minimal error checking
  • conversion of doc/docx documents to rtf and then to pdf, removing the temporary file
  • protection against incorrect destination paths: in the original version, a file
    like "/Volumes/foobar/Links 1.0/Readme" (no extension, and a folder name with a dot)
    would have produced the pdf "/Volumes/foobar/Links 1.pdf"
  • protection against stray dot files: in the original version, a file like "/Volumes/foobar/Links/.Readme"
    (name beginning with a dot) would have produced
    "/Volumes/foobar/Links/.pdf" instead of "/Volumes/foobar/Links/.Readme.pdf"
  • no use of text delimiters.

The code:
on open theFiles
	repeat with thisFile in theFiles
		-- get file path as posix path
		set inputFilePath to POSIX path of thisFile
		set tempfile to ""
		-- convert doc & docx files to RTF
		if inputFilePath ends with ".doc" or inputFilePath ends with ".docx" then
			set tempfile to changeExtension(inputFilePath, ".rtf")
			try
				do shell script "textutil -convert rtf " & quoted form of inputFilePath & " -output " & quoted form of tempfile
				set inputFilePath to tempfile
			on error theError
				display dialog theError buttons "OK" with title "Error converting doc/docx" with icon stop
				exit repeat
			end try
		end if
		-- create convert command and send to shell, output path is the same name with .pdf extension
		try
			do shell script "/System/Library/Printers/Libraries/convert -f " & quoted form of inputFilePath & " -o " & quoted form of changeExtension(inputFilePath, ".pdf")
			--get rid of temp file if present
			if tempfile is not "" then tell application "Finder" to move (tempfile as POSIX file) to trash
		on error theError
			display dialog theError buttons "OK" with title "Error" with icon stop
		end try
	end repeat
end open

on changeExtension(myText, myExt)
	set dotPosition to lastPos(myText, ".")
	-- prevent renaming of folders and dot files (.name) by mistake
	if dotPosition > lastPos(myText, "/") + 1 then
		return ((text items 1 thru lastPos(myText, ".") of myText) as text) & myExt
	else
		return myText & myExt
	end if
end changeExtension

on lastPos(myText, thechar)
	return (length of myText) - (offset of thechar in ((reverse of text items of myText) as string))
end lastPos
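
For comparison, the same rule as the changeExtension handler above can be sketched in shell with parameter expansion instead of position arithmetic. The function name change_extension is invented for this example; the three sample paths are the ones discussed in this thread. It is an illustration of the rule, not a replacement for the droplet.

```shell
#!/bin/bash
# Sketch only: a shell counterpart of the changeExtension handler.

# Hypothetical helper: replace the extension of the file name with $2 (given with
# its leading dot); append $2 instead when the name has no extension or is a dot file.
change_extension() {
  local path="$1" ext="$2"
  local name="${path##*/}"          # file name without the folder part
  local dir="${path%"$name"}"       # folder part, including the trailing slash (may be empty)
  case "$name" in
    ?*.*) printf '%s%s%s\n' "$dir" "${name%.*}" "$ext" ;;  # real extension: swap it
    *)    printf '%s%s\n' "$path" "$ext" ;;                # none, or dot file: append
  esac
}

change_extension "/Volumes/foobar/Links 1.0/Readme" ".pdf"   # → /Volumes/foobar/Links 1.0/Readme.pdf
change_extension "/Volumes/foobar/Links/.Readme" ".pdf"      # → /Volumes/foobar/Links/.Readme.pdf
change_extension "/Volumes/foobar/Links/notes.txt" ".pdf"    # → /Volumes/foobar/Links/notes.pdf
```

Because only the file name is inspected, a dot in a folder name can never be mistaken for the start of an extension.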


Bulk convert text files to PDF
Authored by: tedw on Jun 07, '11 07:38:19PM
I'm not sure what your objection to text items is. For instance, you could replace your last two subroutines with the following code:
on changeExtension(myText, myExt)
	-- myExt includes its leading dot (".pdf", ".rtf"), as in the calls in your script
	set {oldTID, my text item delimiters} to {my text item delimiters, "."}
	if last text item of myText is in {"txt", "rtf", "doc", "docx", "htm", "html", "js", "h", "c"} then
		-- rejoin everything before the last dot, then add the new extension
		set output to ((text items 1 thru -2 of myText) as text) & myExt
	else
		set output to myText & myExt
	end if
	set my text item delimiters to oldTID
	return output
end changeExtension
which seems clearer and simpler to me than the whole 'length-offset' system you've used, and is likely faster (text items are very quick). You might need to tweak the list of valid extensions if you want to use this on other file types, of course.
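
The list-based rule can also be sketched in shell, which makes its behaviour on unlisted types easy to try out. The function name change_known_extension is invented for this example, and the extension list simply mirrors the one in the handler above; the new extension is passed with its leading dot.

```shell
#!/bin/bash
# Sketch only: swap the extension only when it is on a list of known types.

# Hypothetical helper; $2 is the new extension, including its leading dot.
change_known_extension() {
  local path="$1" ext="$2"
  case "${path##*.}" in                          # text after the last dot in the path
    txt|rtf|doc|docx|htm|html|js|h|c)
      printf '%s%s\n' "${path%.*}" "$ext" ;;     # known type: replace the extension
    *)
      printf '%s%s\n' "$path" "$ext" ;;          # anything else: append the new one
  esac
}

change_known_extension "/x/notes.txt" ".pdf"     # → /x/notes.pdf
change_known_extension "/x/photo.tiff" ".pdf"    # → /x/photo.tiff.pdf
```

An unlisted type keeps its old extension and gains the new one, so nothing is renamed by mistake, at the cost of maintaining the list.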

Bulk convert text files to PDF
Authored by: wallybear on Jun 08, '11 06:46:05AM
Nice solution.
Oh, I have nothing against your code, only I don't like tampering with the text item delimiters (a strange form of idiosyncrasy, I hope that a cure exists). For text manipulation (and other stuff) I often make direct calls to Cocoa NSString functions.

My first rewrite of your script was much shorter (7 lines for the main procedure and the 3 lines of the lastPos function), but then I noticed the possibility of errors, so I added a little more control to the script; that's why I had to add the changeExtension function and more lines in the main script.

Note that your changeExtension suggestion does not change the extension in every situation: if the extension is in the list you supply it will be changed, otherwise the new extension will be appended without removing the previous one.
That solves cases with a file with no extension (and the consequential risk of a folder in the path containing a dot), but you have to list all possible accepted extensions for the script, and the list is really long: the convert tool can convert to PDF a lot of file types (e.g. TIFF, GIF, PSD and more graphic formats).

Edited on Jun 08, '11 06:48:19AM by wallybear


Bulk convert text files to PDF
Authored by: tedw on Jun 08, '11 07:43:01AM
Sorry, there's no cure for that, and I'm pretty sure it's fatal. :-)

'Shortness' is only one factor here; clarity is also important. You identify the extension using (essentially):

(length of myText) - (offset of "." in ((reverse of text items of myText) as string))
I use:
set my text item delimiters to "."
last text item of myText
If you like using Cocoa commands for text manipulation, then you're stuck in Xcode, though personally for any complex text manipulation I use the Satimage osax (which has all sorts of text goodies wrapped up in it). Text items take some getting used to, but they really simplify some kinds of text operations.

With respect to your other point, you're right, but that's somewhat out of the scope of this script. All this script (unmodified) can do with images is create a pdf for each one fed to it, which isn't all that useful. If you want to create multi-image pdfs then you'd want a different script (and even then, you'd probably have more luck making pdfs straight from Preview). Can't do everything...


Bulk convert text files to PDF
Authored by: wallybear on Jun 08, '11 10:16:36AM

Too bad for the cure... :(

You're right, the text delimiters method is clearer. The Satimage osaxes are really interesting, but long ago I promised myself to keep clear of osaxes (I got burned when some osaxes I used in my software stopped working with the next OS X release and turned out to be no longer maintained); what's more, I would have to force my users to install them.

Ok for the (un)usefulness of converting images, but it was only an example; convert can process lots of different file kinds, and I don't know "a priori" which one I have to put in that list. Why should I limit a functionality given for free?

Regarding the deletion of the temp file when converting from doc/docx, I just found that the "-D" option of convert deletes the input file when finished, so it could be used as an alternative method.


