
Transliterate Arabic, Greek, and others using UTF-8
I wrote this Perl script, which transliterates ASCII into UTF-8 for colloquial and classical Arabic and for Greek; Cyrillic, Hebrew, and other scripts are planned for some point in the future. Input: ASCII. Output: UTF-8 and the octal representation of the UTF-8 bytes.

I've used this to enter foreign-language titles into my iTunes world music collection. Once you've generated the octal UTF-8, you can set the tag by hand:

mp4tags -s "$(printf "9rabiyuN 'anaa (\330\271\330\261\331\216\330\250\331\220\331\212\331\214\40\330\243\331\216\331\206\331\216\330\247)")" -a "Yuri Mrakady" "'ajmal mnw9aat al-jaaz 01.m4a"

The iTunes song title of the AAC file 'ajmal mnw9aat al-jaaz 01.m4a will then appear as 9rabiyuN 'anaa (عرَبِيٌ أَنَا). Alternatively, enter these codes into a cdrdao TOC file and run my cd2codec script as cd2codec --utf8 to do this automatically. I wrote this script for my own needs, but it's easy to modify to support other formats. To use it, save the text to a file named transliterate, run chmod a+x transliterate, get or build the required tools, and type transliterate --help for usage instructions.
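
For readers who only want to see where the octal escapes come from, the conversion can be sketched with standard tools. This is a minimal sketch and not the transliterate script's own method: utf8_to_octal is a hypothetical helper name, and it assumes a POSIX shell with od, tr, and sed available.

```shell
#!/bin/sh
# utf8_to_octal (hypothetical helper, not part of the transliterate script):
# print the bytes of a string as printf-style octal escapes.
utf8_to_octal() {
    # od -An -v -to1 dumps every byte as a three-digit octal number,
    # tr strips the separators, and sed puts a backslash before each triple.
    printf '%s' "$1" | od -An -v -to1 | tr -d ' \t\n' | sed 's/[0-7]\{3\}/\\&/g'
}

# Round trip: the two UTF-8 bytes of the Arabic letter ain go in,
# and the same two escapes come back out.
printf '%s\n' "$(utf8_to_octal "$(printf '\330\271')")"
```

The escapes it prints can be pasted straight into a printf format string, as in the mp4tags command above. Note that od always pads to three digits, so a space comes out as \040 rather than \40; printf treats both the same way.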

The Arabic transliteration is simply a colloquial Arabic front-end for Otakar Smrz's excellent ArabTeX Perl script, so you'll need to download and install Encode and Encode::Arabic from CPAN. I've also implemented a simple Greek transliteration engine (no accents or breathings) that runs without any additional modules. I've left placeholders for Cyrillic and Hebrew extensions, but these are not implemented yet.
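
A table-driven engine of the kind described for Greek can be sketched in a few lines of sed. The letter mapping below is an assumption made for illustration (the hint does not show the script's actual scheme), greek_translit is a hypothetical name, and accents and breathings are ignored, as in the script.

```shell
#!/bin/sh
# greek_translit (hypothetical; the mapping is an illustrative assumption,
# not the table used by the transliterate script). Lowercase ASCII input only.
# Order matters: digraphs (th, ph, ch, ps) are replaced before single letters,
# and a sigma at the end of a word is turned into final sigma last.
greek_translit() {
    printf '%s\n' "$1" | sed \
        -e 's/th/θ/g' -e 's/ph/φ/g' -e 's/ch/χ/g' -e 's/ps/ψ/g' \
        -e 's/a/α/g' -e 's/b/β/g' -e 's/g/γ/g' -e 's/d/δ/g' \
        -e 's/e/ε/g' -e 's/z/ζ/g' -e 's/h/η/g' -e 's/i/ι/g' \
        -e 's/k/κ/g' -e 's/l/λ/g' -e 's/m/μ/g' -e 's/n/ν/g' \
        -e 's/x/ξ/g' -e 's/o/ο/g' -e 's/p/π/g' -e 's/r/ρ/g' \
        -e 's/s/σ/g' -e 's/t/τ/g' -e 's/u/υ/g' -e 's/w/ω/g' \
        -e 's/σ\([ ,.;:!?)]\)/ς\1/g' -e 's/σ$/ς/'
}

greek_translit 'logos'
```

Because every replacement produces non-ASCII bytes, a letter that has already been converted is never matched again by a later rule, which is what keeps a flat list of substitutions like this safe.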

[robg adds: I haven't tested this one.]

Transliterate Arabic, Greek, and others using UTF-8
Authored by: JWiegley on Feb 21, '07 04:33:37PM

I used to use something very similar to this scheme for inputting Persian on my Mac. But then I discovered the utility Ukelele, and I just created my own Arabic keyboard layout. After all, why transliterate from ASCII to UTF-8 when you can just type in Arabic directly using UTF-8?

Transliterate Arabic, Greek, and others using UTF-8
Authored by: stsmith on Feb 25, '07 12:43:29PM

Yes, for serious editing, you want a keyboard. This is intended for quick access to UTF-8 codes without much overhead.

Transliterate Arabic, Greek, and others using UTF-8
Authored by: osxpounder on Feb 22, '07 12:09:28PM

Thanks for this hint. I have a few Greek CDs I bought while visiting Greece, and this might make iTunes show some readable names.
