
Item13331: Wrong encoding in "View WikiText" when utf8

Priority: Urgent
Current State: Closed
Released In: 2.0.0
Target Release: major
Applies To: Engine
Component: I18N
Branches: Item13331 master
Reported By: JozefMojzis
Waiting For:
Last Change By: JozefMojzis

Wrong encoding in "View WikiText" when utf8

How to reproduce

Screenshots

  • normalview.png

  • rawview.png

  • View source screenshot - sourceview.png

-- JozefMojzis - 26 Mar 2015

I can't reproduce this. There is a meta charset="utf-8" tag when I try this, and the utf8 text all displays correctly.

-- GeorgeClark - 27 Mar 2015

This error is partially solved by the other patches. However, the problem remains for one character: the Czech U+011B LATIN SMALL LETTER E WITH CARON. I'm attaching a file with a few Czech words for testing. Simply download the file, move it into some web and display it. The normal view is correct. Click "View WikiText" and all small e-caron characters are garbled.

-- JozefMojzis - 16 Apr 2015

The following patch appears to fix it. Without it, CGI::charset() reports that it is still using iso-8859-1:

diff --git a/core/lib/Foswiki/UI/View.pm b/core/lib/Foswiki/UI/View.pm
index 08295ae..fc39cd3 100644
--- a/core/lib/Foswiki/UI/View.pm
+++ b/core/lib/Foswiki/UI/View.pm
@@ -486,6 +486,7 @@ sub view {
         if ($raw) {
             if ($text) {
                 my $p = $session->{prefs};
+                CGI::charset($Foswiki::cfg{Site}{CharSet});
                 $page .= CGI::textarea(
                     -readonly => 'readonly',
                     -rows     => $p->getPreference('EDITBOXHEIGHT'),

-- GeorgeClark - 16 Apr 2015

Turns out the fix works for CGI 3.65, but fails on CGI 4.14.

-- GeorgeClark - 16 Apr 2015

Manually hacking together a textarea works fine:

diff --git a/core/lib/Foswiki/UI/View.pm b/core/lib/Foswiki/UI/View.pm
index 08295ae..7651d8b 100644
--- a/core/lib/Foswiki/UI/View.pm
+++ b/core/lib/Foswiki/UI/View.pm
@@ -486,15 +486,10 @@ sub view {
         if ($raw) {
             if ($text) {
                 my $p = $session->{prefs};
-                $page .= CGI::textarea(
-                    -readonly => 'readonly',
-                    -rows     => $p->getPreference('EDITBOXHEIGHT'),
-                    -cols     => $p->getPreference('EDITBOXWIDTH'),
-                    -style    => $p->getPreference('EDITBOXSTYLE'),
-                    -class    => 'foswikiTextarea foswikiTextareaRawView',
-                    -id       => 'topic',
-                    -default  => $text
-                );
+                my $ta = '<textarea name=""  rows="22" cols="70" charset="utf-8" class="foswikiTextarea foswikiTextareaRawView" id="topic" readonly="readonly" style="width:99%">';
+                $ta .= $text;
+                $ta .= '</textarea>';
+                $page .= $ta;
             }
         }
         else {

-- GeorgeClark - 16 Apr 2015

I've created an Item13331 branch to experiment on fixing this. Implemented a new Foswiki::Render::HTML::textarea() static method to replace CGI::textarea.

It appears to fix the textarea encoding issues.

-- GeorgeClark - 17 Apr 2015
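
For illustration only, a hand-built textarea helper along the lines discussed above might look like the sketch below. The names build_textarea and escape_html are hypothetical; this is not the code on the Item13331 branch. It assumes $text is already a UTF-8 byte string and escapes only the five HTML-significant ASCII characters, so bytes above 0x7F pass through unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper (not Foswiki::Render::HTML): escape only the five
# HTML-significant ASCII characters. Bytes above 0x7F, such as UTF-8
# continuation bytes, are left untouched.
sub escape_html {
    my ($s) = @_;
    my %map = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&#39;',
    );
    $s =~ s/([&<>"'])/$map{$1}/g;
    return $s;
}

# Hypothetical helper: build a <textarea> element from the text and a
# hash of attributes. Attributes are sorted so the output is stable.
sub build_textarea {
    my ( $text, %attrs ) = @_;
    my $attr_html = join '',
      map { ' ' . $_ . '="' . escape_html( $attrs{$_} ) . '"' }
      sort keys %attrs;
    return "<textarea$attr_html>" . escape_html($text) . '</textarea>';
}

# "\xC4\x9B" is the UTF-8 byte sequence for U+011B (e-caron).
my $html = build_textarea(
    "a < b \xC4\x9B",
    id       => 'topic',
    readonly => 'readonly',
    rows     => 22,
);
print "$html\n";
```

Unlike the manual hack shown earlier, this sketch escapes the topic text, so a literal </textarea> inside a topic cannot close the element early.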

Jozef, which version of CGI are you using? Note that 4.13 is busted. A fixed 4.14 is already available.

-- MichaelDaum - 17 Apr 2015

Michael - I'm using 4.14. See the irc-log about this issue.

-- JozefMojzis - 17 Apr 2015

George - The Item13331 branch is fantastic. The form's textareas and the WikiText view now finally work too, with all utf8 characters (tested locally with the same topic as http://trunk.foswiki.org/Sandbox/Jomo/JozefMojzisTestTopic).

-- JozefMojzis - 17 Apr 2015

Note that Foswiki::UI::View should use templates the same way as Foswiki::UI::Edit does to create the textareas properly. Here's an alternative fix which imho is better, as it moves HTML out of Perl and into the templating arena where it belongs: raw.patch

-- MichaelDaum - 17 Apr 2015

Yes, the template-based solution was George's first idea (see the irc-log). Of course, the template solution is much cleaner, but it "could" break some (user-developed) skins that don't include the default "view.tmpl", e.g. a view.myskin.tmpl that does NOT contain TMPL:INCLUDE{"view"}. For example: does NatSkin include it? :)

Also, the template-based patch needs to be applied to "Form/Textarea.pm" as well (see: http://foswiki.org/pub/Tasks/Item13331/raw.patch).

-- JozefMojzis - 17 Apr 2015

Michael, thanks for the template patch. I'm wondering if the way to do this is to use your code in the Render::HTML code I added, so that it's used in view, in form, and also in Preferences, where it is needed too. Those were the three places where I found calls to CGI::textarea.

-- GeorgeClark - 17 Apr 2015

Jozef, true.

-- MichaelDaum - 17 Apr 2015

Alternative fix:

diff --git a/core/lib/Foswiki/Form/Textarea.pm b/core/lib/Foswiki/Form/Textarea.pm
index 7c04c80..f516f42 100644
--- a/core/lib/Foswiki/Form/Textarea.pm
+++ b/core/lib/Foswiki/Form/Textarea.pm
@@ -4,6 +4,9 @@ package Foswiki::Form::Textarea;
 use strict;
 use warnings;
 
+use CGI ();
+$CGI::ENCODE_ENTITIES     = q{&<>"'};
+
 use Foswiki::Form::FieldDefinition ();
 our @ISA = ('Foswiki::Form::FieldDefinition');

The two extra characters that Lee left in the default setting for $CGI::ENCODE_ENTITIES are what break seemingly random unicode byte strings.

-- MichaelDaum - 20 Apr 2015
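
A minimal sketch of why e-caron in particular is affected. It assumes the two extra characters in the default are the bytes 0x8B and 0x9B (the comment above does not name them). The escape_chars function is a hypothetical stand-in for an entity encoder, not CGI.pm's own code. U+011B encodes to the UTF-8 bytes 0xC4 0x9B, so an encoder that treats 0x9B as unsafe rewrites the second byte and corrupts the character.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical stand-in for an entity encoder: every character of $text
# that also appears in $unsafe is replaced by a decimal numeric entity.
sub escape_chars {
    my ( $text, $unsafe ) = @_;
    my %is_unsafe = map { $_ => 1 } split //, $unsafe;
    $text =~ s/(.)/$is_unsafe{$1} ? '&#' . ord($1) . ';' : $1/gse;
    return $text;
}

# U+011B LATIN SMALL LETTER E WITH CARON as a UTF-8 byte string.
my $bytes = encode( 'UTF-8', "\x{11B}" );
print join( ' ', map { sprintf '%02X', ord } split //, $bytes ), "\n";  # C4 9B

# Assumed old default: the five ASCII characters plus bytes 0x8B and 0x9B.
my $with_extras = escape_chars( $bytes, qq{&<>"\x8b\x9b'} );

# The set used in the patch above: ASCII characters only.
my $without_extras = escape_chars( $bytes, q{&<>"'} );

print $with_extras eq $bytes    ? "extras: intact\n" : "extras: corrupted\n";
print $without_extras eq $bytes ? "ascii only: intact\n" : "ascii only: corrupted\n";
```

Under this assumption, any character whose UTF-8 encoding contains a 0x8B or 0x9B byte is hit, which would explain why the failures looked random.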

Forwarded this finding to https://github.com/leejo/CGI.pm/issues/157

-- MichaelDaum - 20 Apr 2015

Please test on the master branch now.

-- MichaelDaum - 20 Apr 2015

jomo, please change the status of this to "Waiting for Release" if the encoding issues with e-Caron and other characters are resolved. We are deferring the conversion to template based HTML to post 1.2.0. The change is too major for 1.2, and needs a feature proposal. Several deficiencies in the item branch implementation have been identified.

-- GeorgeClark - 20 Apr 2015

hm... the latest master brings back the textfield and textarea problems. I've attached a file with a Form definition and one Topic file that uses the form.

-- JozefMojzis - 20 Apr 2015

After the discussion on irc with GeorgeClark: the last error is a completely new issue, so I'm setting this to "Waiting for Release".

-- JozefMojzis - 20 Apr 2015
 

Attachment | Size | Date | Who | Comment
CzechE.txt | 7 K | 16 Apr 2015 - 20:50 | JozefMojzis | file with czech e-caron characters
cbox.tgz | 976 bytes | 19 Apr 2015 - 15:10 | JozefMojzis | checkbox test form
formtest.tgz | 1 K | 20 Apr 2015 - 20:48 | JozefMojzis | formtest with accented chars
normalview.png | 53 K | 27 Mar 2015 - 01:02 | JozefMojzis |
raw.patch | 1 K | 17 Apr 2015 - 11:11 | MichaelDaum |
rawview.png | 53 K | 27 Mar 2015 - 01:03 | JozefMojzis |
sourceview.png | 180 K | 27 Mar 2015 - 01:10 | JozefMojzis |
Topic revision: r31 - 03 Jan 2016, JozefMojzis
The copyright of the content on this website is held by the contributing authors, except where stated elsewhere.