On the Acquisition of Chinese Text

My last few days at work have been filled with the thrills of non-Latin character sets; specifically, getting Chinese and Japanese text into an XML-driven Flash movie. It was an interesting trip, so I figured I might as well publish my findings as an exploration.

One caveat: doing this required downloading a lot of language packs, both from Microsoft (for Windows) and from Mozilla (for the Mozilla browser). Visiting any of the following sites will probably bring up “install language pack” dialogues, and since I cannot be certain that your browser(s) will be able to display the text, I have placed all non-Latin characters in images.

First I went to Babelfish and typed in “Ecce Signum”.

Hmmm. Apparently Babelfish doesn’t automatically do English-Latin-Chinese translations. Let’s try “behold the evidence”.

Closer… let’s try one more: “witness the evidence”.

There we go! Now for the verification: I copied the Chinese text from Babelfish and went to Mandarin Tools, specifically the Unicode Character Dictionary, and pasted the characters into the text field titled “Search By Character”, making sure the select box to the left of the text field was set to “UTF-8”.
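For the curious, here is a rough sketch of the kind of information a character lookup like this works from: each character's Unicode code point and the bytes it becomes in UTF-8. It assumes Python 3, the function name `describe_characters` is made up for illustration, and the sample string is a stand-in (the two characters for "Chinese language"), not the phrase Babelfish gave me.

```python
def describe_characters(text):
    """Return (character, code point, UTF-8 bytes as hex) for each character."""
    rows = []
    for ch in text:
        code_point = "U+{:04X}".format(ord(ch))      # e.g. U+4E2D
        utf8_hex = ch.encode("utf-8").hex().upper()  # e.g. E4B8AD
        rows.append((ch, code_point, utf8_hex))
    return rows


# Placeholder text (an assumption, not the phrase from this post).
sample = "\u4e2d\u6587"

for _, code_point, utf8_hex in describe_characters(sample):
    # Only the ASCII parts are printed so this runs on any console.
    print(code_point, utf8_hex)
```

Each of these characters takes three bytes in UTF-8, which is part of why the encoding setting matters: read those same bytes as a single-byte encoding and you get three junk characters per glyph instead of one Chinese character.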

Apparently it doesn’t translate very well… but it appeared to be close enough. But now the characters were not a good size for making images. Printing the screen and resizing the image gave, to say the least, poor results. So I whipped up a little web page to display the characters, making VERY SURE that I saved the page as Unicode and not ISO-8859-whatever. This is VERY IMPORTANT for the display of Unicode characters. So now I had this:
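The page itself was nothing fancy. As a minimal sketch of the idea (not the exact page I used), the Python 3 snippet below builds a one-paragraph HTML page that shows the text at a large size, declares UTF-8 in a meta tag, and writes the file as UTF-8 so the bytes on disk match the declaration. The function names, the 96px font size, and the file name are all assumptions for illustration.

```python
import html


def build_page(text, font_size_px=96):
    """Build a minimal HTML page that displays `text` at a large size."""
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        # Declare the encoding so the browser does not have to guess.
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        "<title>Character test</title>\n"
        "</head>\n<body>\n"
        '<p style="font-size: {size}px;">{text}</p>\n'
        "</body>\n</html>\n"
    ).format(size=font_size_px, text=html.escape(text))


def save_page(path, text, font_size_px=96):
    """Write the page to `path`, explicitly encoded as UTF-8."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(build_page(text, font_size_px))
```

Calling `save_page("characters.html", "\u4e2d\u6587")` would write a page containing the two placeholder characters. The important part is that the declared charset and the actual file encoding agree; a correct encoding still does not guarantee the browser has a font with the right glyphs.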

Hmmm… little squares where I once had Chinese characters… So I looked at the page in Mozilla anyway. This is what I saw:

Perfect! Now the hard part was over. The rest was simply a matter of taking another screen capture, isolating the text in it, turning that into a .gif with black text on a transparent background, and importing it into Flash to do a Trace Bitmap.

Before-and-after of doing a Trace Bitmap in Flash.

So there it is: my first experience in making Flash play nice with non-Latin characters. I am sure there are many, many ways to do this that are much more elegant, and I am sure I will discover them the day after this project launches. But right now, I am kind of happy with what I have.