

Non-ASCII Characters
Authored by: powerbookg3user0 on Oct 06, '04 07:49:47PM
This won't work with non-ASCII characters (in this case Japanese):

file://localhost/Users/tmurayama/Music/iTunes/
iTunes Music/Unknown Artist/Unknown Album/
%E9%95%B7%E9%87%8E%E7%9C%8C%E3%81%AE%E6%AD%8C.wav

This was all one line; I just broke it up to save trouble for people with small screens.

---
Takumi Murayama


Non-ASCII Characters
Authored by: taxi on Oct 07, '04 05:19:48AM

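The problem you are describing is due to the way extended characters are handled in URLs (also called URL encoding): each byte outside the plain ASCII set is written as a percent sign followed by two hexadecimal digits.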
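To see what the encoded name stands for, here is a minimal Python sketch that decodes the file name from the parent comment. It assumes Python 3 and that the bytes are UTF-8, which is what urllib.parse.unquote assumes by default; the variable names are just for illustration.

```python
from urllib.parse import unquote

# File name component copied from the parent comment.
encoded = "%E9%95%B7%E9%87%8E%E7%9C%8C%E3%81%AE%E6%AD%8C.wav"

# unquote() turns each %XX into a byte, then decodes the byte string
# as UTF-8 (its default encoding).
decoded = unquote(encoded)
print(decoded)
```

Each Japanese character takes three bytes in UTF-8, which is why the fifteen %XX escapes collapse into five characters.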

If you were going to use the list for other purposes, such as acting on each file with a shell script, then you might need to convert the extended characters from hexadecimal (%20 is a space, for instance) to octal.

To see what I mean, examine Beyoncé. This appears as 'Beyonce%CC%81' in URL encoding, but bash requires it be something more like 'Beyonce\314\201'.

Using Python, it is easy to see that hex CC (0xCC) is equal to 204 decimal, as is octal 314.
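A rough sketch of that conversion, assuming Python 3 (where octal literals are written 0o314). The helper name percent_to_octal is made up for this example and simply rewrites every %XX escape it finds, including %20.

```python
import re

def percent_to_octal(text):
    """Rewrite each %XX hex escape as a backslash followed by three octal digits."""
    def repl(match):
        value = int(match.group(1), 16)   # two hex digits -> integer
        return "\\%03o" % value           # integer -> backslash + 3 octal digits
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, text)

# The same number written in hex, octal, and decimal.
print(0xCC == 0o314 == 204)

# The example from the comment above.
print(percent_to_octal("Beyonce%CC%81"))
```

In bash, octal escapes like these are interpreted inside $'...' quoting or by printf, so the converted string would normally be wrapped that way before use.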

What I do find interesting is that typing in Beyonc (alt-e e) gives the result 'Beyonc\303\251' (note the missing base e). Apparently there are two ways to describe an accented character: as the ASCII base character followed by a combining accent, or as a single precomposed character.
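Unicode calls these two spellings the decomposed (NFD) and precomposed (NFC) forms. A small Python 3 sketch using the standard unicodedata module shows that they are different byte sequences but normalize to one another; the variable names are only illustrative.

```python
import unicodedata

decomposed = "Beyonce\u0301"    # 'e' followed by U+0301 COMBINING ACUTE ACCENT
precomposed = "Beyonc\u00e9"    # single code point U+00E9

# The UTF-8 bytes differ: CC 81 after the 'e' versus C3 A9 in place of it.
print(decomposed.encode("utf-8"))
print(precomposed.encode("utf-8"))

# A plain comparison sees two different strings...
print(decomposed == precomposed)

# ...but normalizing either one to the other's form makes them match.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
print(unicodedata.normalize("NFD", precomposed) == decomposed)
```

So when comparing names from the exported list against names typed by hand, it is probably safest to normalize both sides to the same form first.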

Reminds me of the OT quote:

"There are two types of people in the world. Those who divide other people into two groups, and those who do not."

GRIN.


