Ghostscript can convert PDF to text
Authored by: TrumpetPower! on Mar 14, '06 09:34:29AM

Ghostscript can convert PDF files to plain text, though you might not be terribly happy with the results. That's not Ghostscript's fault, though--it depends entirely on the nature of the particular PDF in question. For example, if the text was converted to paths before being output as PDF, you won't get anything, because the file then contains only outlines and no characters. Often, kerning is done by starting a new block of text at each adjusted position, which can r esul t in w eir d gap s in t he t e x t. And so on.
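
For anyone who wants to try it, here is a rough Python sketch of the idea, not a definitive recipe. It assumes a reasonably recent Ghostscript that ships the txtwrite output device and is installed as gs on your PATH. The function names, file names, and the 20-character threshold are all made up for illustration. The second function is only a crude guess at whether the PDF had any real text in it (as opposed to text converted to paths).

```python
import shutil
import subprocess


def build_gs_command(pdf_path, txt_path, gs_binary="gs"):
    """Build the Ghostscript command line for text extraction.

    Assumes the installed Ghostscript includes the txtwrite device.
    """
    return [
        gs_binary,
        "-q",                        # suppress the startup banner
        "-dNOPAUSE",                 # don't wait for a keypress between pages
        "-dBATCH",                   # exit once the input has been processed
        "-sDEVICE=txtwrite",         # write text instead of rendering pages
        f"-sOutputFile={txt_path}",
        pdf_path,
    ]


def has_extractable_text(text, min_chars=20):
    """Crude heuristic: True if the output holds at least min_chars
    non-whitespace characters. The threshold is an arbitrary assumption."""
    return sum(1 for ch in text if not ch.isspace()) >= min_chars


def pdf_to_text(pdf_path, txt_path):
    """Run Ghostscript and return whatever text it managed to extract."""
    if shutil.which("gs") is None:
        raise RuntimeError("Ghostscript (gs) was not found on PATH")
    subprocess.run(build_gs_command(pdf_path, txt_path), check=True)
    with open(txt_path, encoding="utf-8", errors="replace") as handle:
        return handle.read()
```

If has_extractable_text() comes back False on the result of pdf_to_text("input.pdf", "output.txt"), the text was probably turned into paths (or the pages are scans), and you're in OCR territory. Nothing here fixes the kerning gaps; those still need cleaning up by hand.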

Your best bet may be the full version of Acrobat (not the free Reader), since it includes OCR and other niceties. But, unless the PDFs were specifically created in a manner to keep the text machine- as well as human-readable (for speakable text, for example), don't plan on it being a fully automated process.

Cheers,

b&


