
A script to encode Windows Word's non-ASCII characters
Have you ever had to work with HTML generated by the Windows version of Word, with non-ASCII characters sprinkled throughout -- fancy quotes, etc.? Here's an AppleScript that reads text from the clipboard, encodes non-ASCII characters as character entities, then puts the result back on the clipboard. Here's a short description of how it works (also included as comments in the script):

This script converts high-bit WinLatin1 (code page 1252) characters (128-255) to their Unicode character entity equivalents, based on this document. Dropping files on the script edits them in place. Any file that does not seem to be a text file is skipped. If none of the dropped files looks like a text file, the user is alerted and the embedded Perl script is not run. Running the script on its own edits what's on the clipboard.
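For anyone curious about the core conversion itself, here is a minimal Python sketch of the same idea. It is not the hint's AppleScript or its embedded Perl script; the function name encode_cp1252_entities is made up for illustration, and it assumes the input bytes really are code page 1252. Bytes that are undefined in that code page (such as 0x81) are turned into U+FFFD instead of raising an error, which is a choice of this sketch and not something the hint specifies.

```python
def encode_cp1252_entities(raw: bytes) -> str:
    """Return `raw` as ASCII text with non-ASCII characters as numeric entities.

    Hypothetical helper: decodes the bytes as Windows code page 1252, then
    replaces every character above 127 with a decimal character reference.
    """
    # Map each CP1252 byte to its Unicode character (e.g. 0x93 -> U+201C).
    # Undefined bytes become U+FFFD so the function never raises.
    text = raw.decode("cp1252", errors="replace")

    # "xmlcharrefreplace" rewrites anything ASCII cannot represent as &#NNNN;
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # Curly quotes (0x93, 0x94) as Windows Word would emit them.
    print(encode_cp1252_entities(b"\x93Hi\x94"))  # prints &#8220;Hi&#8221;
```

Plain ASCII passes through untouched, so running it twice on the same text does no harm.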

[robg adds: I tried this, and it works as described -- just make sure your source document is from the Windows version of Word. In my testing, it seems the Mac versions of the special characters aren't the same as those of the Windows version.]

A script to encode Windows Word's non-ASCII characters
Authored by: boxcarl on May 14, '07 12:04:50PM
Or you could just use UnicodeChecker, which is free and can do HTML encoding of non-ASCII characters through copy and paste or the Services menu.

A script to encode Windows Word's non-ASCII characters
Authored by: Fairly on May 21, '07 02:49:08PM
Or this.
http://rixstep.com/acp/service
Or this.
http://fourmilab.ch/webtools/demoroniser/

But this UnicodeChecker looks great!
