Item13338: UTF8 image name shown incorrectly in the nat-editor

Priority: Urgent
Current State: Closed
Released In: 1.2.0
Target Release: minor
Applies To: Engine
Component:
Branches: master
Reported By: JozefMojzis
Waiting For:
Last Change By: CrawfordCurrie

UTF8 image name shown incorrectly in the nat-editor

Probably only a Mac OS X issue (NFC/NFD)?

How to reproduce

  • Create an image with a UTF8 name
  • Attach it to some topic (and add a link to it in the topic)
  • Edit with the WYSIWYG editor (everything works OK)
  • Switch to natedit
  • The image name is shown incorrectly (although after saving, the image name is OK)

Screenshots

  • attach1.png: in the TMCE
    attach1.png

-- JozefMojzis - 27 Mar 2015

Can you check this please, and close it unless there's a problem (in which case raise the priority to Urgent). Thanks.

-- CrawfordCurrie - 19 May 2015

Meanwhile, things have got worse for me. Now the linked image isn't shown even in the WYSIWYG editor. Screenshots attached.

But maybe this is an NFC/NFD problem, so I may be seeing problems while for the rest of the world (read: for non-Mac Foswiki installations) everything could be OK. And we agreed that the NFC/NFD problem isn't urgent now. (My Foswiki server runs on OS X, i.e. I'm not only a client using a Mac.)

So this needs to be tested by someone who is not a Mac user. A week ago (15 May) I added to Item13378 a short Perl script that creates a directory on your disk with TWO files (one with an NFD name and one with an NFC name). Could you please run it and try attaching both files to some topic, to see how a "common" (Linux) system behaves?

If there is a problem there, this is urgent; if not, it should remain Normal (because at some point we will need to fix the Mac problem too). :) By the way, the fix should be easy, one global editing command (at least by my current tests): change every Encode::decode_utf8 to Unicode::Normalize::NFC Encode::decode_utf8, and everything should work. (Of course this is asymmetric, so encode_utf8 doesn't need any change.) But I will test this when the current trunk has stabilised.

Also, you haven't implemented one central encode/decode routine; you call Encode::decode_utf8 directly instead of something like Foswiki::DecodeUTF8. That's a pity, because with a routine in Foswiki.pm such as
sub DecodeUTF8 { Encode::decode_utf8($_[0]); } any encoding/decoding change would be even easier. :)
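
The idea of one central decode routine that also normalises to NFC can be sketched as follows. This is Python rather than Foswiki's Perl, purely for illustration; decode_utf8_nfc is a hypothetical name, not an existing Foswiki function, and the byte strings are small made-up inputs.

```python
import unicodedata


def decode_utf8_nfc(octets):
    """Decode UTF-8 bytes and normalise the result to NFC.

    Hypothetical helper: a single choke point, so every caller gets
    the same normalisation policy and it can be changed in one place.
    """
    return unicodedata.normalize("NFC", octets.decode("utf-8"))


# "Čá" as an NFD client would send it (base letter + combining mark) ...
nfd_bytes = b"C\xcc\x8ca\xcc\x81"
# ... and as an NFC client would send it (precomposed letters).
nfc_bytes = b"\xc4\x8c\xc3\xa1"

# After the central routine, both inputs are the same string.
same = decode_utf8_nfc(nfd_bytes) == decode_utf8_nfc(nfc_bytes)
```

Encoding stays untouched in this sketch, which matches the asymmetry described above: only the decode side normalises.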

  • normal view:
    screenshot 76.png

  • wysiwyg edit:
    screenshot 77.png

  • After switching to natedit (note the filename):
    screenshot 75.png

-- JozefMojzis - 23 May 2015

Thanks for the report, but I really need to see what codepoints are being used for the characters. You have only attached images.

-- Main.CrawfordCurrie - 29 May 2015 - 07:00

Works fine on Linux (latest trunk; attachment name, as Unicode character codes: cc e6 109 105 1e41 113 144 e3; Linux client, Chrome). I have uprated it to Urgent per your request, and centralised character set handling to make it easier for you to experiment with normalisation.

-- Main.CrawfordCurrie - 29 May 2015 - 08:50

Regarding the exact codepoints:
  • testing the NFD problem with an NFC filename (cc e6 109 105 1e41 113 144 e3) doesn't seem very relevant to me... ;)
    • U+000CC LATIN CAPITAL LETTER I WITH GRAVE
    • U+000E6 LATIN SMALL LETTER AE
    • U+00109 LATIN SMALL LETTER C WITH CIRCUMFLEX
    • U+00105 LATIN SMALL LETTER A WITH OGONEK
    • U+01E41 LATIN SMALL LETTER M WITH DOT ABOVE
    • U+00113 LATIN SMALL LETTER E WITH MACRON
    • U+00144 LATIN SMALL LETTER N WITH ACUTE
    • U+000E3 LATIN SMALL LETTER A WITH TILDE
  • the code points in my example are the same as the ones you get from the script I mentioned.
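
A listing like the one above can be produced mechanically. The following is a minimal Python sketch (not the Perl test script attached to this item); describe is a hypothetical helper name, and the five-digit U+ format is chosen only to match the list above.

```python
import unicodedata


def describe(text):
    """Return one 'U+XXXXX NAME' line per code point in text."""
    return [
        "U+%05X %s" % (ord(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# The first two characters of the NFC attachment name quoted above.
nfc_lines = describe("\u00cc\u00e6")

# The same first character after NFD decomposition: base letter + combining mark.
nfd_lines = describe(unicodedata.normalize("NFD", "\u00cc"))
```

Comparing the two listings makes it visible at a glance whether a name is precomposed (NFC) or decomposed (NFD).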

Sorry, my English isn't good enough (not much better than Google Translate), so I will try to repeat myself in other words; I hope it will be clearer. Sorry again for this. :( So:
  • At http://foswiki.org/Tasks/Item13378 there is a short Perl script; I modified it a bit to cover EXACTLY this test's requirements. See the attachment here.
  • run it
  • it will create one randomly named directory with 4 files
  • check the code points of the file names
  • try uploading the NFD .png (the file with the longer name) to YOUR Foswiki. If you want, here are the code points: \N{U+0043}\N{U+030c}\N{U+0061}\N{U+0301}\N{U+0052}\N{U+030c}\N{U+0079}\N{U+0301}\N{U+002e}\N{U+0070}\N{U+006e}\N{U+0067}
  • add its name to the upload form's description field
  • check the "Create a link to the attached file" checkbox
  • upload
  • check the result...
    • the link content inserted into the text
    • try editing with WYSIWYG (the image isn't visible)

Unfortunately, my notebook runs OS X, so all filenames are FORCED to NFD. I can't simulate the "NFC" world, but you (on Linux) CAN do both tests with the help of the attached script.

However, if attachments work OK on Linux, we can ignore this OS X-specific problem (for now). On Linux probably nobody will craft NFD filenames. The problem (probably) happens only when an OS X user uploads an image with wide characters in its name.

-- JozefMojzis - 29 May 2015

Your English is fine, I just misunderstood your problem. On Linux, the NFD and NFC filenames are distinct (as you'd expect, since filenames are byte strings). When I unpack a readdir, I see this:

 c4  8c  c3  a1  c5  98  c3  bd  2e  74  78  74
 43  cc  8c  61  cc  81  52  cc  8c  79  cc  81  2e  74  78  74
 
 c4  8c  c3  a1  c5  98  c3  bd  2e  70  6e  67
 43  cc  8c  61  cc  81  52  cc  8c  79  cc  81  2e  70  6e  67

When uploading the files, I get the same upload for both .png files and the same upload for both .txt files, i.e. they are canonically equivalent. The links insert fine, are correct, and WYSIWYG works fine.
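
The canonical equivalence of the two .png names can be checked directly from the bytes in the dump above. This is a small Python sketch for illustration only; canonically_equivalent is a hypothetical helper name and is not part of Foswiki.

```python
import unicodedata

# The two .png names exactly as they appear in the readdir dump above.
nfc_name = bytes.fromhex("c4 8c c3 a1 c5 98 c3 bd 2e 70 6e 67").decode("utf-8")
nfd_name = bytes.fromhex(
    "43 cc 8c 61 cc 81 52 cc 8c 79 cc 81 2e 70 6e 67"
).decode("utf-8")


def canonically_equivalent(a, b):
    """True if a and b are the same string once both are normalised to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Different strings (and different byte sequences on disk) ...
differ_as_strings = nfc_name != nfd_name
# ... but the same text once normalised.
equivalent = canonically_equivalent(nfc_name, nfd_name)
```

This is why a byte-oriented filesystem keeps both files side by side, while a layer that normalises names treats them as one.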

-- CrawfordCurrie - 01 Jun 2015

After discussion on IRC I understand the problem a bit better, and it's related to the processing of the src attribute on the img tag. Easily fixed by removing the decoding.

-- CrawfordCurrie - 02 Jun 2015
 

ItemTemplate

Summary UTF8 image name shown incorrectly in the nat-editor
ReportedBy JozefMojzis
Codebase trunk
SVN Range
AppliesTo Engine
Component
Priority Urgent
CurrentState Closed
WaitingFor
Checkins distro:2ab35f8fe55e
TargetRelease minor
ReleasedIn 1.2.0
CheckinsOnBranches master
trunkCheckins
masterCheckins distro:2ab35f8fe55e
ItemBranchCheckins
Release01x01Checkins
Attachment          Size   Date                  Who          Comment
attach1.png         18 K   27 Mar 2015 - 00:29   JozefMojzis
attach3.png         26 K   27 Mar 2015 - 00:33   JozefMojzis
nfcdimg.pl.txt      4 K    29 May 2015 - 15:31   JozefMojzis  the test script
screenshot_75.png   38 K   23 May 2015 - 05:25   JozefMojzis  After to natedit switch (note the filename)
screenshot_76.png   47 K   23 May 2015 - 05:24   JozefMojzis  normal view
screenshot_77.png   28 K   23 May 2015 - 05:25   JozefMojzis  wysiwyg edit
Topic revision: r8 - 02 Jun 2015, CrawfordCurrie