There's a new command-line (Unix) utility in OS X 10.4 called textutil that converts between the document formats known to the Cocoa layer. According to the man page (man textutil), the supported formats are txt, html, rtf, rtfd, doc, wordml, and webarchive. Most usefully, you can convert rtf, doc, and html documents to plain text for further processing with other command-line tools.
The basic syntax is:
textutil -convert fmt filename
Here fmt is one of the formats listed above, and filename is the name of the file you wish to convert. The man page lists many other useful options, including -info, which displays information about a file: its type, size, length (in characters), and an abstract drawn from its contents.
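As a rough sketch of the "further processing" idea, the shell function below converts every rtf, doc, and html file in a directory to plain text. The name convert_dir is made up for this example and is not part of textutil; the function assumes textutil is on your PATH and uses only the -convert option described above.

```shell
# convert_dir is a hypothetical helper name, not part of textutil.
# Assumes textutil (OS X 10.4 or later) is available on the PATH.
convert_dir() {
    dir=${1:-.}    # directory to scan; defaults to the current one
    for f in "$dir"/*.rtf "$dir"/*.doc "$dir"/*.html; do
        # An unmatched glob stays as a literal pattern, so skip it.
        [ -e "$f" ] || continue
        textutil -convert txt "$f"
    done
}
```

After defining the function in your shell, you could run it as convert_dir ~/Documents/reports (a made-up path). As far as I can tell from the man page, textutil writes each result next to the original with the new extension, so report.rtf would become report.txt, ready for grep, wc, and friends.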
Mac OS X Hints
http://hints.macworld.com/article.php?story=20050621010532552