0

I get data from the database that is UTF-8 encoded, but some old data contains Latin-1 characters.

So this

$encod = mb_detect_encoding($string, 'UTF-8', true);

always is correct.

Is it safe to always use utf8_decode() to check for Latin-1 characters like 'äöüß'?

$string = utf8_decode($string);
$search = Array(" ", "ä", "ö", "ü", "ß", "."); //,"/Ä/","/Ö/","/Ü/");
$replace = Array("-", "ae", "oe", "ue", "ss", "-"); //,"Ae","Oe","Ue");
$string = str_replace($search, $replace, strtolower($string));

Regards

spankmaster79
  • 1
    how about `mb_detect_encoding($string, 'ISO-8859-1,UTF-8', true);` ? – ajreal Aug 24 '11 at 15:20
  • @ajreal the string I get from the database is 'äääää'. `mb_detect_encoding($this->_name, 'ISO-8859-1,UTF-8', true);` says 'ISO-8859-1' and `utf8_decode($string)` gives 'äääää'. What shall I do? – spankmaster79 Aug 24 '11 at 16:44
  • 1
    Try searching the PHP manual and SO for iconv and mb conversion; there should be a few earlier questions that discuss this. – ajreal Aug 25 '11 at 06:27
  • @ajreal OK, I'll read that. `mb_convert_encoding($this->_name, 'ISO-8859-1');` works for all characters except ß, so I'll try iconv now. – spankmaster79 Aug 25 '11 at 08:59
  • @ajreal `mb_detect_encoding($string,'auto',true)` did work for me. But the main problem in the conversion was `strtolower($string)`, as it is not safe to convert special characters to lowercase with this function. – spankmaster79 Aug 26 '11 at 08:12
  • 1
    how about this ? http://php.net/manual/en/function.mb-strtolower.php – ajreal Aug 26 '11 at 08:14
  • @ajreal I use mb_strtolower now and it works. – spankmaster79 Aug 31 '11 at 09:50
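
The fix the comments arrive at (mb_strtolower instead of strtolower) can be sketched roughly as follows. This is only a minimal sketch: the function name `slugify` is made up for illustration, the input is assumed to already be valid UTF-8, the source file is assumed to be saved as UTF-8, and the mbstring extension must be loaded.

```php
<?php
// Hypothetical helper, not from the thread. Assumes $string is valid UTF-8
// and that this file itself is saved as UTF-8.
function slugify(string $string): string
{
    // mb_strtolower() is multibyte-aware, so 'Ä', 'Ö', 'Ü' become 'ä', 'ö', 'ü'.
    $string = mb_strtolower($string, 'UTF-8');

    $search  = array(' ', 'ä', 'ö', 'ü', 'ß', '.');
    $replace = array('-', 'ae', 'oe', 'ue', 'ss', '-');

    // str_replace() compares raw bytes, which works as long as the needles
    // and the haystack use the same encoding (UTF-8 here).
    return str_replace($search, $replace, $string);
}

echo slugify('Äpfel und Öl.'), "\n"; // prints "aepfel-und-oel-"
```

Lowercasing first also means the commented-out uppercase entries ("Ä", "Ö", "Ü") in the original arrays are not needed.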

2 Answers

0

It seems to work without the utf8_decode() call:

<?php
   $string = "äöüß";
   $search = Array(" ", "ä", "ö", "ü", "ß", "."); //,"/Ä/","/Ö/","/Ü/");
   $replace = Array("-", "ae", "oe", "ue", "ss", "-"); //,"Ae","Oe","Ue");
   $string = str_replace($search, $replace, strtolower($string));
   echo $string;
?>

DEMO: http://codepad.org/HGTyHkBU

Naftali
  • not from me ;-), but this is also not a really good answer, as the string you put into $string depends on the character encoding of the file you save the code to. My data comes from the database and is UTF-8 encoded, but it contains Latin-1 characters like 'äääää', which after utf8_decode() is 'äääää'. – spankmaster79 Aug 24 '11 at 16:38
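
The situation described in this comment (text that was UTF-8, got read as Latin-1, and was encoded to UTF-8 a second time) could be handled along these lines. This is a sketch under assumptions, not something the thread confirms: the function name `repair_double_utf8` is hypothetical, it undoes only one round of double encoding, and it uses mb_convert_encoding() rather than utf8_decode(), which is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper, not from the thread: undo one round of
// "UTF-8 bytes misread as ISO-8859-1 and encoded as UTF-8 again".
function repair_double_utf8(string $s): string
{
    // Only strings made up entirely of code points U+0000..U+00FF can be
    // the result of that mistake; leave everything else (or invalid UTF-8) alone.
    if (preg_match('/\A[\x{00}-\x{FF}]*\z/u', $s) !== 1) {
        return $s;
    }

    // Map each code point back to a single byte.
    $candidate = mb_convert_encoding($s, 'ISO-8859-1', 'UTF-8');

    // Accept the result only if it changed something and is itself valid UTF-8.
    if ($candidate !== $s && mb_check_encoding($candidate, 'UTF-8')) {
        return $candidate;
    }
    return $s;
}

// "\xC3\x83\xC2\xA4" is the double-encoded form of 'ä' ("\xC3\xA4").
echo repair_double_utf8("\xC3\x83\xC2\xA4"), "\n"; // prints "ä"
```

A correctly encoded 'ä' passes through unchanged, because its single-byte form (0xE4) is not valid UTF-8 on its own. If the data was misread as Windows-1252 rather than ISO-8859-1 (an assumption, but it would fit ß being the one character that fails), 'ISO-8859-1' and the U+00FF guard would both need adjusting.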
-2

Use htmlspecialchars(); it is safer. More info:

http://php.net/manual/en/function.htmlspecialchars.php

Akos