Item13997: Incorrect assumption about encodings in Foswiki::Store.
Priority: Normal
Current State: Closed
Released In: 2.1.1
Target Release: patch
The present Foswiki::Store implementation makes one very incorrect assumption: that the encoding of file contents is also the encoding of filenames. In other words, $Foswiki::cfg{Store}{Encoding} is applied to both the filename and the file content. While this has presumably been tolerated on most operating systems, on OS X it produces a rather strange result when the encoding is iso-8859-1: file and directory names are converted to %XX URL-style escapes. When a name is long enough (86 or more characters, if all of them are non-ASCII), it becomes three times longer after conversion and causes a 'File name too long' error on file or directory creation.
Here is a demo. The following script:
#!/usr/bin/env perl
use v5.14;
use utf8;
use strict;
use warnings;
use Encode;
use File::Path;

# 85 non-ASCII characters (code points 160 .. 244).
my $s =
  Encode::decode( 'iso-8859-1', join( '', map { chr($_) } ( 160 .. 244 ) ) );

# Encode the name the way the store does when Encoding is iso-8859-1.
my $n = Encode::encode( 'iso-8859-1', $s, Encode::FB_CROAK );

my $tempdir = "$ENV{HOME}/tmp/foswiki.del.me/$n";
File::Path::mkpath( $tempdir, 0, 0777 );
exit;
generates the following dir name:
$ ls ~/tmp/foswiki.del.me
%A0%A1%A2%A3%A4%A5%A6%A7%A8%A9%AA%AB%AC%AD%AE%AF%B0%B1%B2%B3%B4%B5%B6%B7%B8%B9%BA%BB%BC%BD%BE%BF%C0%C1%C2%C3%C4%C5%C6%C7%C8%C9%CA%CB%CC%CD%CE%CF%D0%D1%D2%D3%D4%D5%D6%D7%D8%D9%DA%DB%DC%DD%DE%DF%E0%E1%E2%E3%E4%E5%E6%E7%E8%E9%EA%EB%EC%ED%EE%EF%F0%F1%F2%F3%F4
Increasing the upper bound of the range to 245 results in 'File name too long', because it produces an 86-character directory name: the 85 characters above expand to exactly 255 characters after escaping, while 86 expand to 258.
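The arithmetic can be checked without touching the filesystem. The following is only a sketch: escaped_length is a hypothetical helper that imitates the %XX conversion seen in the listing above, and it makes no claim about how OS X performs the conversion internally.

```perl
use strict;
use warnings;

# Hypothetical helper: escape every non-ASCII byte as %XX, imitating the
# conversion visible in the directory listing, and return the new length.
sub escaped_length {
    my ($bytes) = @_;
    ( my $escaped = $bytes ) =~ s/([\x80-\xFF])/sprintf( '%%%02X', ord($1) )/ge;
    return length($escaped);
}

my $name_85 = join( '', map { chr($_) } 160 .. 244 );
my $name_86 = join( '', map { chr($_) } 160 .. 245 );

printf "%d -> %d\n", length($name_85), escaped_length($name_85);    # 85 -> 255
printf "%d -> %d\n", length($name_86), escaped_length($name_86);    # 86 -> 258
```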
Proposed solution
In addition to the Encoding configuration key, a FilenameEncoding key should be introduced. It would default to Encoding unless set manually to a different value. Foswiki::Store::encode would get one more optional parameter naming the key to be used, and might look like:
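sub encode {
    return $_[0] unless defined $_[0];
    my $s = $_[0];

    # Optional third argument: the {Store} key that names the encoding.
    my $encKey = $_[2] || 'Encoding';
    if ( $_[1] ) {

        # Strict mode: die on characters the encoding cannot represent.
        return Encode::encode( $Foswiki::cfg{Store}{$encKey} || 'utf-8',
            $s, Encode::FB_CROAK );
    }
    else {

        # Lenient mode: replace unrepresentable characters with HTML entities.
        return Encode::encode( $Foswiki::cfg{Store}{$encKey} || 'utf-8',
            $s, sub { HTML::Entities::encode_entities( chr(shift) ) } );
    }
}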
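Note that the sub above falls back to utf-8, not to Encoding, when the requested key is unset. If the default is not supplied elsewhere (for example by the configuration layer), the fallback chain could be made explicit. A minimal sketch, where store_encoding is a hypothetical helper and %cfg is a stand-in for the real configuration hash:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Foswiki): resolve the encoding name for a
# configuration key, falling back first to Encoding and then to utf-8.
sub store_encoding {
    my ( $cfg, $encKey ) = @_;
    $encKey ||= 'Encoding';
    return $cfg->{Store}{$encKey} || $cfg->{Store}{Encoding} || 'utf-8';
}

# Stand-in configuration, for illustration only.
my %cfg = ( Store => { Encoding => 'iso-8859-1' } );

# FilenameEncoding is unset, so the Encoding value is used.
print store_encoding( \%cfg, 'FilenameEncoding' ), "\n";

# Once FilenameEncoding is set, it takes precedence for filenames.
$cfg{Store}{FilenameEncoding} = 'utf-8';
print store_encoding( \%cfg, 'FilenameEncoding' ), "\n";
```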
Foswiki::Store::decode would get a similar adaptation, of course.
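For illustration, the decode counterpart might look roughly like this. It is a sketch only: the position of the new optional argument is an assumption, and the configuration values set here are stand-ins chosen for the demonstration.

```perl
use strict;
use warnings;
use Encode ();

# Stand-in configuration values, for illustration only.
$Foswiki::cfg{Store}{Encoding}         = 'iso-8859-1';
$Foswiki::cfg{Store}{FilenameEncoding} = 'utf-8';

# Sketch: the optional second argument names the {Store} key to consult
# (assumed here; it mirrors the extra parameter proposed for encode).
sub decode {
    return $_[0] unless defined $_[0];
    my $s      = $_[0];                  # copy, since Encode may alter its input
    my $encKey = $_[1] || 'Encoding';
    return Encode::decode( $Foswiki::cfg{Store}{$encKey} || 'utf-8',
        $s, Encode::FB_CROAK );
}

# File content is decoded with Encoding, file names with FilenameEncoding.
my $content = decode("\xE9");                            # iso-8859-1 byte
my $name    = decode( "\xC3\xA9", 'FilenameEncoding' );  # utf-8 bytes
print $content eq $name ? "same character\n" : "different\n";
```

Both calls yield the single character U+00E9, each reached through its own encoding.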
Foswiki::Store::PlainFile::_mkPathTo would have to call it in the following way:
# Make all directories above the path
sub _mkPathTo {
    my $file = _encode( shift, 1, 'FilenameEncoding' );
    ASSERT( File::Spec->file_name_is_absolute($file), $file ) if DEBUG;
    ...
}
The same change would be required for the numerous other calls to _encode throughout the PlainFile.pm module.
--
VadimBelman - 29 Feb 2016
A brief IRC brainstorming session produced a solution: block iso-8859 on OS X.
--
VadimBelman - 29 Feb 2016
I just do not understand why someone on OS X would even try to use iso-8859. OS X is fully Unicode by default (and moreover, the filesystem enforces it), so using iso-8859-1 on OS X is the same mistake as trying to use, for example, 7-bit ASCII on Linux with accented characters. (Or I don't understand something...)
--
JozefMojzis - 29 Feb 2016
The only possible scenario I can think of is migrating from another system. But then again, if you are brave enough to undertake the move, make encoding conversion part of it.
--
VadimBelman - 29 Feb 2016
Added a test in the Store Encoding checker. It reports an error unless the encoding is utf-8 / utf8.
--
GeorgeClark - 29 Feb 2016