Item13660: The SEARCH returns mixed NFC/NFD on OS X (causing problem for example in the TreePlugin)
Priority: Normal
Current State: Duplicate
Released In: 2.1.0
Target Release: minor
The problem
The following SEARCH returns mixed NFD/NFC strings on OS X:
%SEARCH{search=".*" web="Sandbox" format=" * $topic $parent" scope="topic" regex="on" nosearch="on" nototal="on" noempty="on"}%
The $topic is an NFD-normalised string, while the $parent is NFC. Because the TreePlugin calls %SEARCH internally, it gets confused by the results and therefore does not work correctly on OS X.
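To illustrate why mixed forms break a parent/child lookup, here is a minimal sketch using the core Unicode::Normalize module. The helper name same_topic and the variable names are hypothetical and are not taken from the TreePlugin; the sketch only shows that the same visible name in NFC and in NFD does not compare equal until both sides are normalised.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

# The same visible name "ČaČa" in both forms.
my $parent_nfc = "\x{10c}a\x{10c}a";      # precomposed U+010C
my $parent_nfd = "C\x{30c}aC\x{30c}a";    # "C" followed by combining caron U+030C

# Hypothetical helper: compare two topic names regardless of normalisation form.
sub same_topic {
    my ( $x, $y ) = @_;
    return NFC($x) eq NFC($y) ? 1 : 0;
}

print "raw eq:        ",
  ( $parent_nfc eq $parent_nfd ? "match" : "no match" ), "\n";
print "normalised eq: ",
  ( same_topic( $parent_nfc, $parent_nfd ) ? "match" : "no match" ), "\n";
```

A plugin that keys a hash of topics by name would miss the parent in the same way as the raw comparison does.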
Reproducing the TreePlugin issue (on OS X)
- create a topic "CaCa" (parent WebHome)
- create a topic "ZuZu" (parent CaCa)
- create a topic "ČaČa" (parent WebHome)
- create a topic "ŽuŽu" (parent ČaČa)
- use the %TREEVIEW{}% macro
The result (on OS X) will be something like:
4 WebHome
4.1 CaCa
4.1.1 ZuZu
4.2 ČaČa
5 ŽuŽu
So the parent/child relationship is shown correctly for the ASCII topics, but not for the Unicode topics.
Easy test for the SEARCH (on OS X)
0014360 151 076 040 074 154 151 076 040 132 314 214 165 132 314 214 165
i > < l i > Z ̌ ** u Z ̌ ** u
0014400 040 074 141 040 150 162 145 146 075 042 057 123 141 156 144 142
< a h r e f = " / S a n d b
0014420 157 170 057 045 143 064 045 070 143 141 045 143 064 045 070 143
o x / % c 4 % 8 c a % c 4 % 8 c
0014440 141 042 076 304 214 141 304 214 141 074 057 141 076 012 074 057
a " > Č ** a Č ** a < / a > \n < /
or, using noautolink:
0010760 154 151 076 040 132 314 214 165 132 314 214 165 040 304 214 141
l i > Z ̌ ** u Z ̌ ** u Č ** a
0011000 304 214 141 012 074 057 154 151 076 074 057 165 154 076 040 012
Č ** a \n < / l i > < / u l > \n
i.e. it is clearly visible that "ŽuŽu" is NFD encoded and its parent ČaČa is NFC.
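The bytes in the dumps can also be checked programmatically. The following sketch copies the octal values for the two names from the od output, decodes them as UTF-8, and reports which form each string is in. The helper form_of is hypothetical; it simply compares a string with its own NFC and NFD renderings.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Unicode::Normalize qw(NFC NFD);

# Octal byte values copied from the od output above.
my $topic_bytes  = pack 'C*', map { oct($_) } qw(132 314 214 165 132 314 214 165);
my $parent_bytes = pack 'C*', map { oct($_) } qw(304 214 141 304 214 141);

my $topic  = decode( 'UTF-8', $topic_bytes );
my $parent = decode( 'UTF-8', $parent_bytes );

# Hypothetical helper: report which normalisation form a string is in.
sub form_of {
    my ($s) = @_;
    my $is_nfc = $s eq NFC($s);
    my $is_nfd = $s eq NFD($s);
    return 'NFC and NFD' if $is_nfc && $is_nfd;
    return 'NFC'         if $is_nfc;
    return 'NFD'         if $is_nfd;
    return 'neither';
}

# Render code points as ASCII so the output is terminal-independent.
sub code_points {
    my ($s) = @_;
    return join ' ', map { sprintf 'U+%04X', ord($_) } split //, $s;
}

printf "topic : %-4s %s\n", form_of($topic),  code_points($topic);
printf "parent: %-4s %s\n", form_of($parent), code_points($parent);
```

The byte pair 314 214 (0xCC 0x8C) is the combining caron U+030C, while 304 214 (0xC4 0x8C) is the precomposed Č, U+010C.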
Quick & dirty fix for the TreePlugin
diff --git a/lib/Foswiki/Plugins/TreePlugin.pm b/lib/Foswiki/Plugins/TreePlugin.pm
index 5c30ca5..7829581 100644
--- a/lib/Foswiki/Plugins/TreePlugin.pm
+++ b/lib/Foswiki/Plugins/TreePlugin.pm
@@ -23,6 +23,7 @@ package Foswiki::Plugins::TreePlugin;
use strict;
use warnings;
+use Unicode::Normalize qw(NFC);
use Foswiki::Func;
@@ -530,7 +531,8 @@ sub doSEARCH {
"%SEARCH{search=\"$searchVal\" web=\"$searchWeb\" format=\"$searchTmpl\" scope=\"$searchScope\" regex=\"on\" nosearch=\"on\" nototal=\"on\" noempty=\"on\" excludetopic=\"$excludetopic\" topic=\"$includetopic\"}%";
&Foswiki::Func::writeDebug($search) if $debug;
- return Foswiki::Func::expandCommonVariables($search);
+ my $search_result = Foswiki::Func::expandCommonVariables($search);
+ return $Foswiki::UNICODE ? NFC($search_result) : $search_result;
}
=pod
Of course, the above isn't a correct solution. We need to fix the NFC/NFD problem at its roots, i.e. everywhere we currently do decode_utf8($string) we should do NFX(decode_utf8($string)), where NFX is NFD or NFC, whichever the core dev team reaches consensus on.
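As a rough sketch of that idea, the decode step could be wrapped in a single helper so that the normalisation form is chosen in one place. The name decode_normalized is hypothetical and is not an existing Foswiki function, and NFC is used here only as an assumption, since the form had not been agreed on.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);
use Unicode::Normalize qw(NFC);

# Hypothetical helper, not part of Foswiki: decode UTF-8 octets and
# normalise the resulting character string in one step.
# NFC is an assumption; swap in NFD if that form is preferred.
sub decode_normalized {
    my ($octets) = @_;
    return NFC( decode_utf8($octets) );
}

# "ŽuŽu" as raw UTF-8 octets, once decomposed and once precomposed.
my $nfd_octets = "Z\xcc\x8cuZ\xcc\x8cu";
my $nfc_octets = "\xc5\xbdu\xc5\xbdu";

my $from_nfd = decode_normalized($nfd_octets);
my $from_nfc = decode_normalized($nfc_octets);

print "equal after decode+normalise: ",
  ( $from_nfd eq $from_nfc ? "yes" : "no" ), "\n";
```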
--
JozefMojzis - 01 Sep 2015
Jomo, is this task fixed by the changes made for Item13405?
--
GeorgeClark - 24 Dec 2015
Unfortunately no. The SEARCH still returns a $search_result string like:
Sandbox|Z\x{30c}uZ\x{30c}u|\x{10c}a\x{10c}a|\$outnum [[Sandbox.Z\x{30c}uZ\x{30c}u][Z\x{30c}uZ\x{30c}u]] <br />";
i.e. the parent ČaČa is returned as NFC (\x{10c}a\x{10c}a), but the topic name ŽuŽu is NFD (Z\x{30c}uZ\x{30c}u).
Applying the above patch (NFC-ing the search result) works.
--
JozefMojzis - 26 Dec 2015
Just for the record: tested on "be33f8f40df93f37d796caac361a23f3a7aa2655".
--
JozefMojzis - 26 Dec 2015
Ahh... it works. NFCNormalizeFilenames was unset. SOOOORRY for the confusion.

I set it the first time and everything worked as I reported in Item13405. Later, when you asked about this TREEPLUGIN test, I forgot to set it again.
IMHO, having to set this cfg setting manually is a really bad idea.

Is configure (Foswiki.spec) really so dumb that it doesn't allow one simple condition as a default? Such as:
$Foswiki::cfg{NFCNormalizeFilenames} = 1 if $^O =~ /darwin/;
--
JozefMojzis - 30 Dec 2015
I wonder if there is some easy way to test if the file system is NFC or NFD. Rather than an OS test, which would miss remote file system situations.
--
GeorgeClark - 31 Dec 2015
I think the solution is a change in bootstrap. We create a file using an NFC filename in the data directory, and then read it back. If it changes, we are probably on an NFD system, and we can set the normalize flag correctly.
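A probe along those lines might look like the sketch below. It is not the actual bootstrap code: the function name probe_fs_normalization, the probe filename, and the use of a temporary directory for the demonstration are all assumptions made for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);
use File::Spec;
use File::Temp qw(tempdir);
use Unicode::Normalize qw(NFD);

# Hypothetical probe, not the actual Foswiki bootstrap code.
# Creates a file with an NFC name in $dir, reads the directory back and
# returns 'NFC' if the name is unchanged, 'NFD' if it came back
# decomposed, and 'unknown' otherwise.
sub probe_fs_normalization {
    my ($dir) = @_;
    my $name = "probe_\x{10c}a\x{10c}a.tmp";    # NFC form
    my $path = File::Spec->catfile( $dir, encode_utf8($name) );

    open( my $fh, '>', $path ) or return 'unknown';
    close $fh;

    opendir( my $dh, $dir ) or return 'unknown';
    my @found;
    for my $entry ( readdir $dh ) {
        my $decoded = decode_utf8($entry);
        push @found, $decoded if $decoded =~ /^probe_/;
    }
    closedir $dh;

    # Remove the probe file under whatever name the file system reported.
    unlink File::Spec->catfile( $dir, encode_utf8($_) ) for @found;

    return 'unknown' unless @found == 1;
    return 'NFC' if $found[0] eq $name;
    return 'NFD' if $found[0] eq NFD($name);
    return 'unknown';
}

# Demonstration against a scratch directory; a real check would target
# the data directory instead.
my $dir  = tempdir( CLEANUP => 1 );
my $form = probe_fs_normalization($dir);
print "filenames in $dir come back as: $form\n";
```

If the probe reports 'NFD', the normalize flag ($Foswiki::cfg{NFCNormalizeFilenames}) would be the setting to enable. The probe only describes the one directory it is given.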
--
GeorgeClark - 31 Dec 2015
If you want to be "politically correct",

the tests should be done per web and per directory. Imagine a DBI-based storage: the "data" is NFC but /pub could be NFD.
Or some /data/Some and/or /pub/Another could be symlinked to a remote file system...
So yes, you're right, the file creation test could help, but it isn't a bulletproof solution either. So, IMHO, the simple $^O match against darwin is enough (at least until someone reports a bug) :).
--
JozefMojzis - 31 Dec 2015
Well, I implemented the test against the data directory. I think trying to probe every directory under the data and pub trees is excessive. And if installed on an OSX system, then the simple probe of data should be sufficient. I'd hope.
--
GeorgeClark - 31 Dec 2015
Setting this task to duplicate. It's fixed in Item13405.
--
GeorgeClark - 31 Dec 2015
- this file is called ČáŘý - uploaded from OS X's NFD filesystem:
Error: (3) can't find %c4%8c%c3%a1%c5%98%c3%bd.png in Tasks