Have you ever had to work with HTML generated by the Windows version of Word, with non-ASCII characters (fancy quotes and the like) sprinkled throughout? Here's an AppleScript that reads text from the clipboard, encodes non-ASCII characters as character entities, then puts the result back on the clipboard. Here's a short description of how it works (also included as comments in the script):
This script converts high-bit WinLatin1 (code page 1252) characters (128-255) to their Unicode character entity equivalents, based on this document. Dropping files on the script edits them in place. Any file that does not seem to be a text file is skipped. If none of the files dropped on the script look like text files, the user is alerted and the embedded Perl script is not run. Running the script on its own edits what's on the clipboard.
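To make the conversion step concrete, here is a rough Python sketch of the same idea. It is not the hint's AppleScript or its embedded Perl, and the function name `encode_high_bit` is made up for illustration. It assumes the input is raw code page 1252 bytes and that decimal numeric entities (such as `&#8220;`) are acceptable output; the original script may format its entities differently.

```python
def encode_high_bit(data: bytes) -> str:
    """Replace cp1252 bytes 128-255 with decimal numeric character entities.

    Hypothetical helper for illustration; plain ASCII bytes pass through.
    """
    out = []
    for byte in data:
        if byte < 0x80:
            # ASCII is identical in cp1252 and Unicode, so keep it as-is.
            out.append(chr(byte))
            continue
        try:
            # Map the cp1252 byte to its Unicode character.
            char = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            # A few bytes (such as 0x81) are unassigned in cp1252.
            # Assumption: fall back to the code point with the same number.
            char = chr(byte)
        out.append(f"&#{ord(char)};")
    return "".join(out)


if __name__ == "__main__":
    # 0x93 and 0x94 are the curly double quotes Windows Word produces.
    print(encode_high_bit(b"\x93Hello\x94"))  # prints &#8220;Hello&#8221;
```

Note that bytes 128-159 are the ones that differ from plain Latin-1: in code page 1252 they hold the curly quotes, dashes and similar punctuation, so they map to code points well above 255.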
[robg adds: I tried this, and it works as described -- just make sure your source document is from the Windows version of Word. In my testing, it seems the Mac versions of the special characters aren't the same as those of the Windows version.]
Mac OS X Hints
http://hints.macworld.com/article.php?story=20070504084317913