Item10631: RTFContrib and non-English languages

Priority: Normal
Current State: New
Released In: n/a
Target Release: n/a
Applies To: Extension
Component: RtfContrib
Branches:
Reported By: StefanosKouzof
Waiting For: MichaelDaum
Last Change By: StefanosKouzof
When the RTFContrib plugin is given wiki pages that contain non-English letters, it copies them unchanged into the generated RTF document, but the document cannot display them because they are not written as escape sequences. To reproduce this, I installed the plugin, created the simplest possible .rtf template file (containing only %TEXT%), and used a topic with one word in English, one word in Greek and a link. The output was as follows (the editor automatically converted the characters to HTML entities; see the end of my post for the original escape sequence):
testing Δοκιμή http://qpmdemo.ergoq.local:8080/bin/rtf/Sandbox/TestTopic1?template=Sandbox.TestTopic1.qpm.rtf&filename=pr01.rtf

The contents of the created .rtf file are:
{\rtf1\ansi\deff0\adeflang1025
{\fonttbl{\f0\froman\fprq2\fcharset161 Times New Roman;}{\f1\froman\fprq2\fcharset161 Times New Roman;}{\f2\fswiss\fprq2\fcharset161 Arial;}{\f3\fnil\fprq2\fcharset161 Arial Unicode MS;}{\f4\fnil\fprq2\fcharset161 Mangal;}{\f5\fnil\fprq0\fcharset161 Mangal;}}
{\colortbl;\red0\green0\blue0;\red128\green128\blue128;}
{\stylesheet{\s1\cf0{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\rtlch\af4\afs24\lang1081\ltrch\dbch\af3\langfe2052\hich\f0\fs24\lang1032\loch\f0\fs24\lang1032\snext1 Normal;}
{\s2\sb240\sa120\keepn\cf0{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\rtlch\afs28\lang1081\ltrch\dbch\langfe2052\hich\f2\fs28\lang1032\loch\f2\fs28\lang1032\sbasedon1\snext3 Heading;}
{\s3\sa120\cf0{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\rtlch\af4\afs24\lang1081\ltrch\dbch\af3\langfe2052\hich\f0\fs24\lang1032\loch\f0\fs24\lang1032\sbasedon1\snext3 Body Text;}
{\s4\sa120\cf0{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\rtlch\af5\afs24\lang1081\ltrch\dbch\af3\langfe2052\hich\f0\fs24\lang1032\loch\f0\fs24\lang1032\sbasedon3\snext4 List;}
{\s5\sb120\sa120\cf0{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\rtlch\af5\afs24\lang1081\ai\ltrch\dbch\af3\langfe2052\hich\f0\fs24\lang1032\i\loch\f0\fs24\lang1032\i\sbasedon1\snext5 caption;}
{\s6\cf0{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\rtlch\af5\afs24\lang1081\ltrch\dbch\af3\langfe2052\hich\f0\fs24\lang1032\loch\f0\fs24\lang1032\sbasedon1\snext6 Index;}
}
{\info{\upr{\author ???????? ??????}{\*\ud{\author \u931\'3f\u964\'3f\u941\'3f\u966\'3f\u945\'3f\u957\'3f\u959\'3f\u962\'3f \u922\'3f\u959\'3f\u965\'3f\u950\'3f\u974\'3f\u966\'3f}}}{\creatim\yr2011\mo4\dy12\hr17\min20}{\revtim\yr0\mo0\dy0\hr0\min0}{\printim\yr0\mo0\dy0\hr0\min0}{\comment StarWriter}{\vern3300}}\deftab709
{\*\pgdsctbl
{\pgdsc0\pgdscuse195\pgwsxn11906\pghsxn16838\marglsxn1134\margrsxn1134\margtsxn1134\margbsxn1134\pgdscnxt0 Standard;}}
\paperh16838\paperw11906\margl1134\margr1134\margt1134\margb1134\sectd\sbknone\pgwsxn11906\pghsxn16838\marglsxn1134\margrsxn1134\margtsxn1134\margbsxn1134\ftnbj\ftnstart1\ftnrstcont\ftnnar\aenddoc\aftnrstcont\aftnstart1\aftnnrlc
\pard\plain \ltrpar\s1\cf0{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\rtlch\af4\afs24\lang1081\ltrch\dbch\af3\langfe2052\hich\f0\fs24\lang1033\loch\f0\fs24\lang1033 {\rtlch \ltrch\loch\f0\fs24\lang1033\i0\b0 testing 

Δοκιμή

http://qpmdemo.ergoq.local:8080/bin/rtf/Sandbox/TestTopic1?template=Sandbox.TestTopic1.qpm.rtf&filename=pr01.rtf
}
\par }

The Greek word was clearly not encoded. I created the same document with a word processor (OpenOffice 3.2) and saved it in .rtf format. The output is:
{\rtf1\ansi\deff0\adeflang1025
{\fonttbl{\f0\froman\fprq2\fcharset161 Times New Roman;}{\f1\froman\fprq2\fcharset161 Times New Roman;}{\f2\fswiss\fprq2\fcharset161 Arial;}{\f3\fnil\fprq2\fcharset161 Arial Unicode MS;}{\f4\fnil\fprq2\fcharset161 Mangal;}{\f5\fnil\fprq0\fcharset161 Mangal;}}
{\colortbl;\red0\green0\blue0;\red128\green128\blue128;}
{\stylesheet{\s1\cf0{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\rtlch\af4\afs24\lang1081\ltrch\dbch\af3\langfe2052\hich\f0\fs24\lang1032\loch\f0\fs24\lang1032\snext1 Normal;}
{\s2\sb240\sa120\keepn\cf0{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\rtlch\afs28\lang1081\ltrch\dbch\langfe2052\hich\f2\fs28\lang1032\loch\f2\fs28\lang1032\sbasedon1\snext3 Heading;}
{\s3\sa120\cf0{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\rtlch\af4\afs24\lang1081\ltrch\dbch\af3\langfe2052\hich\f0\fs24\lang1032\loch\f0\fs24\lang1032\sbasedon1\snext3 Body Text;}
{\s4\sa120\cf0{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\rtlch\af5\afs24\lang1081\ltrch\dbch\af3\langfe2052\hich\f0\fs24\lang1032\loch\f0\fs24\lang1032\sbasedon3\snext4 List;}
{\s5\sb120\sa120\cf0{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\rtlch\af5\afs24\lang1081\ai\ltrch\dbch\af3\langfe2052\hich\f0\fs24\lang1032\i\loch\f0\fs24\lang1032\i\sbasedon1\snext5 caption;}
{\s6\cf0{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\rtlch\af5\afs24\lang1081\ltrch\dbch\af3\langfe2052\hich\f0\fs24\lang1032\loch\f0\fs24\lang1032\sbasedon1\snext6 Index;}
}
{\info{\upr{\author ???????? ??????}{\*\ud{\author \u931\'3f\u964\'3f\u941\'3f\u966\'3f\u945\'3f\u957\'3f\u959\'3f\u962\'3f \u922\'3f\u959\'3f\u965\'3f\u950\'3f\u974\'3f\u966\'3f}}}{\creatim\yr2011\mo4\dy12\hr17\min26}{\revtim\yr0\mo0\dy0\hr0\min0}{\printim\yr0\mo0\dy0\hr0\min0}{\comment StarWriter}{\vern3300}}\deftab709
{\*\pgdsctbl
{\pgdsc0\pgdscuse195\pgwsxn11906\pghsxn16838\marglsxn1134\margrsxn1134\margtsxn1134\margbsxn1134\pgdscnxt0 Standard;}}
\paperh16838\paperw11906\margl1134\margr1134\margt1134\margb1134\sectd\sbknone\pgwsxn11906\pghsxn16838\marglsxn1134\margrsxn1134\margtsxn1134\margbsxn1134\ftnbj\ftnstart1\ftnrstcont\ftnnar\aenddoc\aftnrstcont\aftnstart1\aftnnrlc
\pard\plain \ltrpar\s1\cf0{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\rtlch\af4\afs24\lang1081\ltrch\dbch\af3\langfe2052\hich\f0\fs24\lang1033\loch\f0\fs24\lang1033 {\rtlch \ltrch\loch\f0\fs24\lang1033\i0\b0 testing \line \line \'c4\'ef\'ea\'e9\'ec\'de\line \line http://qpmdemo.ergoq.local:8080/bin/rtf/Sandbox/TestTopic1?template=Sandbox.TestTopic1.qpm.rtf&filename=pr01.rtf}
\par }

The clue is in the second line from the end, just before the hyperlink, where the Greek word appears as the escape sequence:
\'c4\'ef\'ea\'e9\'ec\'de
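
For illustration, the escape sequence above can be reproduced with a few lines of Python. This is only a sketch, not how RTFContrib works: the function name rtf_escape_codepage is made up, and the choice of the cp1253 (Windows Greek) code page is an assumption based on the \fcharset161 entries in the font table. Line breaks are not handled (the word processor output uses \line for those).

```python
def rtf_escape_codepage(text: str, codepage: str = "cp1253") -> str:
    """Escape text for an RTF body using \\'hh code-page escapes.

    Hypothetical helper for illustration only. The default code page
    (cp1253) is an assumption matching \\fcharset161 (Greek).
    """
    out = []
    for ch in text:
        if ch in "\\{}":
            # Backslash and braces are RTF syntax and must be escaped.
            out.append("\\" + ch)
        elif ord(ch) < 128:
            # Plain ASCII can be copied as-is.
            out.append(ch)
        else:
            # Each byte of the code-page encoding becomes \'hh.
            # Raises UnicodeEncodeError if the character is not
            # representable in the chosen code page.
            for byte in ch.encode(codepage):
                out.append("\\'%02x" % byte)
    return "".join(out)


if __name__ == "__main__":
    # The Greek test word from the topic.
    print(rtf_escape_codepage("\u0394\u03bf\u03ba\u03b9\u03bc\u03ae"))
```

An approach like this only works when every character fits the single code page declared in the document's font table, so it would not cover topics that mix several scripts.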

Is there a way for RTFContrib to encode non-English content, for example as escape sequences like the one above or as Unicode?

-- StefanosKouzof - 12 Apr 2011
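
As a possible code-page-independent alternative, RTF also has Unicode escapes of the form \uN followed by a fallback character. The \info group in both dumps already uses them (for example \u931\'3f in the author field). The sketch below shows how such escapes could be produced in Python. The function name rtf_escape_unicode is hypothetical, the fallback is fixed to \'3f ("?") as in the dumps, and it assumes the reader's default skip count of one fallback character per \uN.

```python
import struct


def rtf_escape_unicode(text: str) -> str:
    """Escape text for an RTF body using \\uN Unicode escapes.

    Hypothetical helper for illustration only. Every non-ASCII
    character is followed by the fallback \\'3f ("?"), which readers
    that understand \\uN skip (assuming the default skip count of 1).
    """
    out = []
    for ch in text:
        if ch in "\\{}":
            # Backslash and braces are RTF syntax and must be escaped.
            out.append("\\" + ch)
        elif ord(ch) < 128:
            # Plain ASCII can be copied as-is.
            out.append(ch)
        else:
            # \uN takes a signed 16-bit decimal value, so unpack the
            # UTF-16 code units as signed shorts. Characters outside
            # the BMP yield two units (a surrogate pair).
            data = ch.encode("utf-16-le")
            for (unit,) in struct.iter_unpack("<h", data):
                out.append("\\u%d\\'3f" % unit)
    return "".join(out)


if __name__ == "__main__":
    # The Greek test word from the topic.
    print(rtf_escape_unicode("\u0394\u03bf\u03ba\u03b9\u03bc\u03ae"))
```

Readers that do not understand \uN would show the "?" fallback instead, which matches the "????????" author string visible in the \upr group of the dumps.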

 

Topic revision: r1 - 12 Apr 2011, StefanosKouzof