3

The problem looks simple, but it's taking time to figure out. I need to remove en dash characters from some strings in a project: not the HTML entity &ndash;, but the actual character (–). Neither str_replace() nor preg_replace() worked.

Already tried:

$new_str = str_replace('–', '', $str_with_ndash_char);     

Also tried:

$new_str = preg_replace('/–/', '', $str_with_ndash_char);  

Also, it's a legacy project. Some parts of it are iso-8859-1 encoded, and a few others are utf-8 encoded. I noticed that my editor (Komodo Edit) complains about the ndash character when a PHP file is iso-8859-1, losing the character when I save the file, like this:

$new_str = str_replace('?', '', $str_with_ndash_char);

Converting everything to utf-8 results in a lot of garbage characters (same for the other way around, converting everything to iso-8859-1), so I'm avoiding doing it unless it's really, really necessary.

Edited: removed double $ signs (bad CTRL+V).

Emerson

3 Answers

1

I just tried what you are doing and it worked fine. Make sure the string really contains en dashes and not em dashes.

I tried replacing both types and found no issues.

$str = str_replace('—', '', '–test—');   
echo $str . '<br>';
$str = str_replace('–', '', $str);     
echo $str;

This gives me the result:

–test
test

A more concrete example would help as well, such as the actual strings you are trying to change and not just the variables.

Tobias
  • For example, this string: "Tributário – Trabalhista – Previdenciário – Tempo de Guarda de Documentos" – Emerson Nov 24 '17 at 13:20
  • Running `$str = str_replace('–', '', "Tributário – Trabalhista – Previdenciário – Tempo de Guarda de Documentos");` works just fine for me – Tobias Nov 24 '17 at 14:23
  • What if you tried both my methods I posted on the same string and echoed them out between takes? Any difference? – Tobias Nov 24 '17 at 14:25
  • No difference. The string comes from an iso-8859-1 encoded page. Could it be changed somehow when I try to replace it in a utf-8 encoded module? – Emerson Nov 24 '17 at 17:36
  • You tried to convert the string to UTF-8 you said, what method are you using? – Tobias Nov 24 '17 at 22:37
  • At first I passed the string to the utf8_encode function, then I also tried mb_convert_encoding. Neither had any effect. – Emerson Nov 28 '17 at 09:50
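
One possible explanation for the behaviour discussed in these comments, offered as a sketch and not a confirmed diagnosis: the en dash (U+2013) is stored as different bytes depending on the encoding. In UTF-8 it is the three bytes E2 80 93. Strict ISO-8859-1 has no en dash at all, but pages labelled iso-8859-1 are commonly treated as Windows-1252, where the en dash is the single byte 96. A UTF-8 needle therefore never matches a string coming from such a page. utf8_encode() would not help either, since it maps byte 96 to the control character U+0096, not to U+2013. The helper name strip_ndash and the sample strings below are hypothetical, and the Windows-1252 assumption about the legacy pages is exactly that, an assumption.

```php
<?php
// The en dash (U+2013) as explicit bytes, so the encoding of this source file does not matter.
$ndash_utf8   = "\xE2\x80\x93"; // three bytes in UTF-8
$ndash_cp1252 = "\x96";         // one byte in Windows-1252 (assumed encoding of the legacy pages)

// Hypothetical helper: strip the en dash when the encoding of $s is already known.
// Do not apply the "\x96" needle to UTF-8 text: byte 96 can be part of another character there.
function strip_ndash($s, $is_utf8)
{
    $needle = $is_utf8 ? "\xE2\x80\x93" : "\x96";
    return str_replace($needle, '', $s);
}

// The same text as it might arrive from each kind of page.
$from_utf8_page   = "Tribut\xC3\xA1rio \xE2\x80\x93 Trabalhista";
$from_legacy_page = "Tribut\xE1rio \x96 Trabalhista";

// A UTF-8 needle never matches the legacy bytes, so nothing is replaced here.
$unchanged = str_replace($ndash_utf8, '', $from_legacy_page);

$clean_utf8   = strip_ndash($from_utf8_page, true);
$clean_legacy = strip_ndash($from_legacy_page, false);

// Inspecting the raw bytes shows which form of the dash a given string really contains.
echo bin2hex($from_legacy_page), "\n";
```

Running bin2hex() on the real string from the project would show whether it contains e28093, 96, or something else entirely, which tells you which needle to use.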
1

It seems the error is the redundant $ sign. A variable name should be preceded by exactly one dollar sign.
So the line $new_str = str_replace('–', '', $str_with_ndash_char); should work fine.

But if the code in your project is already correct, then you should check out this answer

Also, try switching the error reporting level to E_ALL. Place error_reporting(E_ALL); at the top of your script

Nestor Yanchuk
0

Finally solved it:

$new_str = preg_replace("/[^[:alnum:]]/", '', $old_str);

Be warned that this will remove everything that's not alphanumeric, not just the ndash character. In my case, I don't need any character other than the alphanumeric ones.
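
A short sketch of how this pattern behaves on the example string from the comments, assuming the input is valid UTF-8 (an assumption; the real strings may be in the legacy encoding). Without the u modifier the pattern works byte by byte, so under the default C locale it is expected to drop the bytes of accented letters such as "á" along with the dash. The second pattern is a swapped-in, Unicode-aware variant and not the one from the answer: it keeps letters and digits of any script and removes everything else.

```php
<?php
// "Tributário – Trabalhista" written as explicit UTF-8 bytes.
$old_str = "Tribut\xC3\xA1rio \xE2\x80\x93 Trabalhista";

// Pattern from the answer: byte-oriented, so multi-byte characters are not seen as letters.
$bytewise = preg_replace('/[^[:alnum:]]/', '', $old_str);

// Unicode-aware variant (requires valid UTF-8 input): \p{L} is any letter, \p{N} any number.
$unicode = preg_replace('/[^\p{L}\p{N}]/u', '', $old_str);

echo $bytewise, "\n";
echo $unicode, "\n";
```

If the accented letters matter, the second form is probably the safer choice, but only once the string is known to be UTF-8; with the u modifier, preg_replace() returns null on invalid UTF-8 input.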

Emerson