
When I upload a badly (or "utf8-ly") named file in a fresh TYPO3 7.6 install, I get underscores instead of spelled out special characters.

E.g. the filename Bräm!.png is sanitized to Bra__m_.png. I would expect Braem.png.

The server locale looks fine:

LANG=de_CH.UTF-8
LC_CTYPE="de_CH.UTF-8"
LC_NUMERIC="de_CH.UTF-8"
LC_TIME="de_CH.UTF-8"
LC_COLLATE="de_CH.UTF-8"
LC_MONETARY="de_CH.UTF-8"
LC_MESSAGES="de_CH.UTF-8"
LC_PAPER="de_CH.UTF-8"
LC_NAME="de_CH.UTF-8"
LC_ADDRESS="de_CH.UTF-8"
LC_TELEPHONE="de_CH.UTF-8"
LC_MEASUREMENT="de_CH.UTF-8"
LC_IDENTIFICATION="de_CH.UTF-8"
LC_ALL=

In LocalConfiguration.php, we have

'systemLocale' => 'de_CH.UTF-8',

And even, in php.ini, I tried

intl.default_locale = de_CH.UTF-8

Still, there is no "proper" renaming of the kind I'd expect, turning the file name Bräm!.png into Braem.png or at least Braem_.png.

Where else could I look?

Urs

2 Answers

0

From what you describe, the name of the file is not encoded in UTF-8 but in a single-byte character set (ISO-8859-1, for example). \TYPO3\CMS\Core\Resource\Driver\LocalDriver::sanitizeFileName() works with UTF-8 if you use it in the backend (the same goes for the old file handling functions).

In that case the "ä" isn't a valid multi-byte UTF-8 character and is thus replaced by underscore characters.
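
To see how a single visible character can end up as two underscores, here is a minimal sketch. It assumes the name arrives in decomposed form ("a" followed by the combining diaeresis U+0308, which is how macOS commonly stores file names); that is an assumption, not something confirmed in this thread. The pattern is a simplified stand-in for a byte-wise replacement, not the expression the TYPO3 core actually uses.

```php
<?php
// Assumption: decomposed name, i.e. "a" followed by U+0308 (bytes 0xCC 0x88).
$decomposed = "Bra\xCC\x88m!.png";

// Without the /u modifier the pattern is applied byte by byte, so each of
// the two bytes of U+0308 is replaced on its own, and so is the "!".
$sanitized = preg_replace('/[^.A-Za-z0-9_-]/', '_', $decomposed);

echo $sanitized, PHP_EOL; // Bra__m_.png
```

With a precomposed "ä" (bytes 0xC3 0xA4) the same replacement would give Br__m_.png, without the "a".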

  • Oh! Strange though, as I've created it on yosemite. Maybe you could provide a file with a properly utf-8 encoded filename for comparison? Or the other way round, would you look at the file if I upload it somewhere? – Urs Oct 31 '16 at 08:57
  • Looking at the code, it seems that it's intended that each "special" character will be replaced by an `_` – no rewriting from e.g. `ä` to `ae`. Correct? Maybe what made me expect more detailed sanitizing is some memory from 4.5? ... because in 6.2 it's also just underscores – Urs Oct 31 '16 at 22:57
0

Make sure [SYS][UTF8filesystem] = true in your LocalConfiguration.php.
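
As a sketch, the same setting written out as PHP. Putting it in typo3conf/AdditionalConfiguration.php instead of editing LocalConfiguration.php directly is my assumption about the setup, not something stated here:

```php
<?php
// typo3conf/AdditionalConfiguration.php (assumed location for overrides)
// Keep UTF-8 characters in file names instead of replacing them.
$GLOBALS['TYPO3_CONF_VARS']['SYS']['UTF8filesystem'] = true;
```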

minifranske
  • I'd like to avoid utf-8 filenames, but would enjoy more explicit replacement (ä->ae, or ä->a). That's why I fiddled with locale in the first place – Urs Nov 04 '16 at 19:09
  • Then try disabling the utf8filesystem. The other settings you tried will not help here – minifranske Nov 04 '16 at 19:35
  • It's disabled. But I think my question is not well put, as I'm probably asking for something that's not there (character replacement like in realURL). Thanks anyway! – Urs Nov 04 '16 at 20:10
  • There is no solution in the core but a signal/slot in resourceStorage you can use to sanitize it the way you want before it is handled by the driver – minifranske Nov 05 '16 at 09:02
  • Cool, an incentive to make first steps with signal/slots! I've studied a few tutorials (https://usetypo3.com/signals-and-hooks-in-typo3.html, https://github.com/einpraegsam/powermailextended) and I think I got the basics somehow. I've included a class that extends resourceStorage and tried to call it from ext_localconf.php. Still, it breaks: the upload takes forever and nothing happens. Probably a lot of errors in the background. Maybe you could have a quick look at https://gist.github.com/ursbraem/3106b3605d368d76b2d63b826b0c1a11 ? – Urs Nov 05 '16 at 17:00
  • PS it's not supposed to do anything yet, just to see if I can pick up the signal – Urs Nov 05 '16 at 17:01
  • Here you can find some examples registering slots to use the signals triggered by the resourceStorage https://bitbucket.org/franssaris/fs_media_gallery/src/b42d5f842e639add1018f41b512717bfc69ca744/ext_localconf.php?at=master&fileviewer=file-view-default#ext_localconf.php-94 – minifranske Nov 05 '16 at 17:09
  • Have a look here: https://git.typo3.org/TYPO3CMS/Extensions/image_autoresize.git/blob/HEAD:/ext_localconf.php#l10 It is an example of exactly the slot you need – minifranske Nov 05 '16 at 17:16
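
Since, as noted above, the core has no ready-made solution for this, the slot would have to do the replacement itself. Here is a minimal sketch of what such a slot could call, in plain PHP so it runs without TYPO3. transliterateFileName() is a hypothetical helper, the mapping only covers German umlauts and ß, and the wiring to the resourceStorage signal is left out. Whether the slot receives the raw or the already sanitized name should be checked against the installed TYPO3 version.

```php
<?php
/**
 * Hypothetical helper, not part of the TYPO3 core: spell out German
 * umlauts, then replace every remaining unsafe byte with an underscore.
 */
function transliterateFileName($fileName)
{
    // Cover precomposed characters and decomposed ones (letter + U+0308).
    $map = [
        "\xC3\xA4" => 'ae', "a\xCC\x88" => 'ae',
        "\xC3\xB6" => 'oe', "o\xCC\x88" => 'oe',
        "\xC3\xBC" => 'ue', "u\xCC\x88" => 'ue',
        "\xC3\x84" => 'Ae', "A\xCC\x88" => 'Ae',
        "\xC3\x96" => 'Oe', "O\xCC\x88" => 'Oe',
        "\xC3\x9C" => 'Ue', "U\xCC\x88" => 'Ue',
        "\xC3\x9F" => 'ss',
    ];
    // strtr() with an array tries the longest keys first, so "a" + U+0308
    // is matched as a pair before anything else touches the plain "a".
    $fileName = strtr(trim($fileName), $map);

    // Anything still outside the safe set becomes "_", byte by byte.
    return rtrim(preg_replace('/[^.A-Za-z0-9_-]/', '_', $fileName), '.');
}

echo transliterateFileName("Br\xC3\xA4m!.png"), PHP_EOL;  // Braem_.png
echo transliterateFileName("Bra\xCC\x88m!.png"), PHP_EOL; // Braem_.png
```

The "!" still turns into an underscore, which gives the Braem_.png variant from the question; dropping such characters entirely would take one more replacement step.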