This question about Configuration: Answered

Working with international characters in TinyMCE

The WYSIWYG editor works fine with Chinese characters, but when I switch to WIKI TEXT mode (or start directly with “Edit wiki text”), all Chinese characters (I guess this happens to any non-ISO-8859-1 character) are translated to Unicode character references (such as &#x5206;). This makes editing very difficult (if possible at all). Also, if you save in WIKI TEXT mode, the text does not change back to the original Chinese characters; it stays as raw Unicode references that the browser cannot interpret back as Chinese characters. Is there a way to change this in TinyMCE?

-- HackatonWong
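
For readers unfamiliar with the notation in the question: a reference such as &#x5206; is a numeric character reference for a single Unicode code point. The following is a minimal Python sketch (not Foswiki or TinyMCE code, which is Perl and JavaScript respectively) showing how one Chinese character ends up as such a reference when the target charset is ISO-8859-1, and how the reference maps back to the character. The sample character is chosen only because it matches the code point quoted above.

```python
import html

# U+5206, the code point quoted in the question
text = "\u5206"

# ISO-8859-1 has no slot for this character, so an encoder that is told to
# fall back to character references emits the decimal form of the code point.
as_latin1 = text.encode("iso-8859-1", errors="xmlcharrefreplace")
print(as_latin1)  # b'&#20998;'

# The hexadecimal spelling of the same reference, as seen in the question.
hex_ref = "&#x{:X};".format(ord(text))
print(hex_ref)  # &#x5206;

# Decoding the reference recovers the original character.
decoded = html.unescape(hex_ref)
print(decoded == text)  # True
```

Both spellings (decimal 20998 and hexadecimal 5206) refer to the same character; which one appears depends on the tool doing the conversion.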

Hello Hackaton,

Thank you so much for the feedback on Chinese language support! I have some questions. In bin/configure under "Internationalisation", under "Locale":
  1. What is the value of {UseLocale}?
  2. What is the value of {Site}{Locale}?
  3. What is the value of {Site}{CharSet}?
I have uploaded the latest versions of TinyMCEPlugin and WysiwygPlugin. These are the versions we will ship with Foswiki 1.1.3 when it is ready; could you please try them?

Also, there is a new option in TinyMCEPlugin, TINYMCEPLUGIN_ENTITY_ENCODING. Maybe you can try this:

   * Set TINYMCEPLUGIN_ENTITY_ENCODING = raw

Looking forward to your response, I hope you can provide us with more feedback in future.

-- PaulHarvey - 19 Jan 2011

The settings are all default:

{UseLocale} unchecked
{Site}{Locale} en_US.ISO-8859-1
{Site}{CharSet} blank

After reading your reply, I tried some different combinations and the following worked for me:
{UseLocale} checked
{Site}{Locale} en_US.ISO-8859-1
{Site}{CharSet} utf-8
Now everything worked for me.

As for the new setting TINYMCEPLUGIN_ENTITY_ENCODING, I did try setting "entity_encoding" : "raw" directly in TinyMCE, but it made the problem worse: as soon as you save the topic, it shows up as raw Unicode.

I will definitely try the new TinyMCEPlugin and WysiwygPlugin.

-- HackatonWong - 20 Jan 2011

Hello Hackaton, I'm glad to hear that it worked out okay. ENTITY_ENCODING will make it worse if the destination charset cannot handle the "native" characters. It might work out okay if you try UTF-8. Or have you already tried raw with UTF-8?

I'm wondering about your {Site}{Locale}. It really should match the {Site}{CharSet}, e.g. use en_US.utf-8.

-- PaulHarvey - 20 Jan 2011
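
The point about the destination charset can be illustrated outside Foswiki. This is a small Python sketch, assuming only that "raw" means the characters are stored as-is rather than as entities; the helper name can_store_raw is hypothetical and is not part of Foswiki or TinyMCE.

```python
def can_store_raw(text: str, charset: str) -> bool:
    """Return True if every character of text is representable in charset."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


chinese = "\u5206"  # one Chinese character, U+5206

# ISO-8859-1 covers only U+0000..U+00FF, so the raw character cannot be stored.
print(can_store_raw(chinese, "iso-8859-1"))  # False

# UTF-8 can encode any Unicode code point; this one takes three bytes.
print(can_store_raw(chinese, "utf-8"))  # True
print(chinese.encode("utf-8"))  # b'\xe5\x88\x86'
```

This is consistent with the report above that switching {Site}{CharSet} to utf-8 resolved the problem, although the sketch says nothing about how Foswiki itself performs the conversion.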

Setting {Site}{Locale} to en_US.utf-8 (my current setting) or en_US.ISO-8859-1 (the default) both worked for me.

A new problem with the latest TinyMCEPlugin 1.1.7 and WysiwygPlugin 1.1.1:

In the same topic, if I use the editor to bold an English word, it translates to *Bold* in "Wiki Text" mode, but if I bold a Chinese word it translates to <strong> </strong> instead. On the other hand, if I put two asterisks around an English word in Wiki Text mode, it shows as bold in WYSIWYG mode, but a Chinese word does not; the TinyMCEPlugin editor just shows the two asterisks.

-- HackatonWong - 25 Jan 2011

Is this a new bug introduced in TinyMCEPlugin 1.1.7 and WysiwygPlugin 1.1.1?

-- PaulHarvey - 26 Jan 2011

It is not a new bug.

It is rather a limitation of TML. A format indicator such as * _ __ = == must be preceded by a space; otherwise, it is treated as a normal character. This rule does not work well with some of the major Asian languages.

In English, words are composed from 26 letters, so spaces are added between words so that word boundaries can be told apart. It is a phonographic script.

In Chinese, every word is a unique character (it is a logographic script), so we don’t need to add spaces to tell words apart.

So if the following is a Chinese sentence, the phrase yyyy will not (and should not) be bolded by TML’s syntax:

xxxxxx*yyyy*xxxxxx

The only way to do that is to use HTML:

xxxxxx<strong>yyyy</strong>xxxxxx

-- HackatonWong - 27 Jan 2011
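
The whitespace dependency described above can be reproduced with a small Python sketch. The regular expression below is a simplified approximation written for illustration; it is an assumption, not the actual pattern used by Foswiki's renderer, and the function name render_bold is hypothetical.

```python
import re

# Assumed, simplified bold rule: the opening * must follow start-of-text,
# whitespace or "(", and the closing * must be followed by end-of-text,
# whitespace or punctuation.
BOLD = re.compile(r"(^|[\s(])\*(\S+?|\S[^\n]*?\S)\*(?=$|[\s,.;:!?)])")


def render_bold(text: str) -> str:
    """Replace *word* with <strong>word</strong> under the rule above."""
    return BOLD.sub(r"\1<strong>\2</strong>", text)


# English: spaces surround the marked word, so the rule fires.
print(render_bold("some *bold* word"))
# some <strong>bold</strong> word

# Chinese written without spaces: the * has no whitespace before it,
# so the text is left unchanged.
print(render_bold("这是*粗体*文字"))
# 这是*粗体*文字

# Adding spaces makes the rule fire, but changes the text itself.
print(render_bold("这是 *粗体* 文字"))
# 这是 <strong>粗体</strong> 文字
```

Under this assumed rule, the only ways to bold a phrase inside an unspaced sentence are to insert spaces that do not belong in the text or to fall back to <strong> tags, which matches the behaviour reported in this thread.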

Ah yes, now I know what you mean. Render.pm makes this assumption too. In your experience, have you seen other wiki software do a better job at this? I'm not sure how to change WikiSyntax parsing to modulate these styles on and off properly; the render logic really depends on spaces around words.

I guess I'm a little confused about when you do use spaces - for example to help line-wrapping - but maybe I should try to get my head around typesetting requirements for logographic languages. I have a friend who knows Kanji who might be able to help me.

-- PaulHarvey - 27 Jan 2011

I am also using Trac as Foswiki project. Trac handles this part OK. I don't need to add space.

As for space usage, logographic languages seldom need spaces. They do not have the line-wrapping problem that English has, since every character is complete by itself. You either have the character completely or not at all; you never get a partial character.

Most of our users are happy with the WYSIWYG editor without knowing that TML even exists, so it is just an inconvenience.

The biggest concern I have with Foswiki is that it cannot search for Chinese words (http://foswiki.org/Support/Question776), which makes the wiki less useful.

-- HackatonWong - 28 Jan 2011

QuestionForm

Subject Configuration
Extension TinyMCEPlugin
Version Foswiki 1.1.2
Status Answered
Topic revision: r11 - 28 Jan 2011, HackatonWong