14

I've looked at other solutions here and here, but it's not working for me.

Code

$s1clean = 'ALIEN - FILM - MOVIE – PSP – Sony - Boxed & Complete';
echo $s1clean;
echo "<br><br>";

// Remove dash
$s1clean = str_replace('-', '', $s1clean);

// Remove em dash
$em_dash = html_entity_decode('&#x2013;', ENT_COMPAT, 'UTF-8');
$s1clean = str_replace($em_dash, '', $s1clean);

$em_dash2 = html_entity_decode('&#8212;', ENT_COMPAT, 'UTF-8');
$s1clean = str_replace($em_dash2, '', $s1clean);

$s1clean = str_replace('\u2014', '', $s1clean);

echo $s1clean;
echo "<br><br>";

Output

"ALIEN FILM MOVIE – PSP – Sony Boxed & Complete"

How do I remove this character?

Script47
user3314053
  • More explanation about the types of dashes: the en dash is the short one, the em dash is the long one. "Do not mistake the em dash (—) for the slightly narrower en dash (–) or the even narrower hyphen (-). Those marks serve different purposes and are further explained in other sections." [link](http://www.thepunctuationguide.com/em-dash.html) – Mousey Aug 14 '15 at 22:48

4 Answers

13

This specifies an array of possible removals,

$s1clean = 'ALIEN - FILM - MOVIE – PSP – Sony - Boxed & Complete';

$s1clean = str_replace(["-", "–"], '', $s1clean);

echo $s1clean;

When run,

Output

ALIEN FILM MOVIE PSP Sony Boxed & Complete

I simply copied the odd-looking dash from the string (it is an en dash, U+2013) and added it to the array alongside the regular hyphen, and it worked.
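Pasting the character only works if the script file itself is saved as UTF-8. Below is a minimal sketch of the same replacement with the code points spelled out instead. It assumes PHP 7.0 or later, for the `\u{...}` escape in double-quoted strings, and the `$dashes` name is purely illustrative:

```php
<?php
// Assumes PHP 7.0+: "\u{XXXX}" in a double-quoted string yields the UTF-8 bytes.
$s1clean = "ALIEN - FILM - MOVIE \u{2013} PSP \u{2013} Sony - Boxed & Complete";

$dashes = [
    '-',         // HYPHEN-MINUS (U+002D)
    "\u{2013}",  // EN DASH
    "\u{2014}",  // EM DASH
];

// Remove every dash variant in one pass.
$s1clean = str_replace($dashes, '', $s1clean);

echo $s1clean;
```

The spaces on either side of each removed dash are left in place, so the result contains double spaces (a browser collapses them when rendering). Also note that `'\u2014'` in single quotes, as in the question, is just the literal characters backslash, `u`, `2`, `0`, `1`, `4`, so it never matches a dash.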

Reading Material

str_replace

Script47
5

The above didn't work for me but this did:

$s1clean = str_replace(chr(151), '', $s1clean); // emdash

Note: for an en dash, use

$s1clean = str_replace(chr(150), '', $s1clean); // endash

from Jay: http://php.net/manual/en/function.str-replace.php#102465
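Whether `chr(150)` and `chr(151)` match depends on how the string is encoded. They are the single bytes 0x96 and 0x97, which are the en dash and em dash in Windows-1252. In UTF-8 the same characters take three bytes each. A small sketch to illustrate, using made-up sample strings rather than the one from the question:

```php
<?php
// EN DASH encoded two ways (byte escapes, so the file's own encoding doesn't matter).
$cp1252EnDash = chr(150);        // 0x96 in Windows-1252
$utf8EnDash   = "\xE2\x80\x93";  // U+2013 in UTF-8

echo bin2hex($cp1252EnDash), "\n"; // 96
echo bin2hex($utf8EnDash), "\n";   // e28093

// In a Windows-1252 string, chr(150) removes the dash.
$legacy      = "PSP " . $cp1252EnDash . " Sony";
$legacyClean = str_replace(chr(150), '', $legacy);

// In this UTF-8 string, chr(150) finds nothing to replace...
$utf8          = "PSP " . $utf8EnDash . " Sony";
$utf8Unchanged = str_replace(chr(150), '', $utf8);

// ...so the UTF-8 byte sequence has to be used instead.
$utf8Clean = str_replace($utf8EnDash, '', $utf8);
```

On UTF-8 text the byte 0x96 can also occur inside other characters (for example, "Ö" is encoded as 0xC3 0x96), so a `chr(150)` replacement there may corrupt unrelated characters.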

Aba
2

Your dashes are a mix of long dashes and hyphen-minus characters (short dashes). If you view your code and the title in a different font, you will see the difference.

There are 2 short dashes at the start that your code removes, and some long dashes later that it doesn't remove.

Adding this will fix it (this is a different dash even if it doesn't look like one):

$s1clean = str_replace('–', '', $s1clean);

Edit

Alternatively duplicate the 2013 code line but use the hyphen-minus's code 002D instead of 2013:

 $em_dash = html_entity_decode('&#x002D;', ENT_COMPAT, 'UTF-8'); 

If you edit in a fixed width font both appear the same, but are not.
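When two dashes look alike, one way to tell them apart is to dump the bytes of every non-ASCII character in the string. A rough sketch, assuming the input is valid UTF-8 (the `/u` modifier requires it); `$sample` and `$found` are made-up names:

```php
<?php
// Sample with a hyphen-minus, an EN DASH (U+2013) and an EM DASH (U+2014).
$sample = "MOVIE - PSP \xE2\x80\x93 Sony \xE2\x80\x94 Boxed";

// Collect every character outside the ASCII range.
preg_match_all('/[^\x00-\x7F]/u', $sample, $matches);

$found = [];
foreach (array_unique($matches[0]) as $char) {
    // bin2hex shows the raw UTF-8 bytes of the character.
    $found[] = bin2hex($char);
}

print_r($found);
```

Here `e28093` is the UTF-8 encoding of the en dash and `e28094` that of the em dash. The plain hyphen-minus is ASCII (0x2D), so it is not listed.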

Mousey
  • `Alternatively duplicate the 2013 code line but use the other dash's code 2014 instead of 2013`. What makes you assume the given character '–' is the `&mdash;` entity? This is a wrong assumption. – Halim Qarroum Aug 14 '15 at 22:33
  • @HalimQarroum it's not an assumption, see the link I posted with the Hex and Decimal codes. – Mousey Aug 14 '15 at 22:42
  • No. Interpreted as `UTF-8`, the character posted by the OP is the character 'EN DASH' (U+2013), known as the `&ndash;` entity, or `&#x2013;`. You can compare them here http://www.fileformat.info/info/unicode/char/2013/browsertest.htm. – Halim Qarroum Aug 14 '15 at 22:50
  • Yes, absolutely. But I was referring to the character in the initial string literal `$s1clean`. You are right that the OP's code is confusing and misleading, and it also happens to be incorrect, since he is not consistently replacing the same character (`\u2014` is different from `EN DASH` (U+2013)). – Halim Qarroum Aug 14 '15 at 23:11
  • Where do you see an `EM DASH`? – Halim Qarroum Aug 14 '15 at 23:51
  • No, that's an `EN DASH`. Try something like this [UTF-8 resolver](http://www.cogsci.ed.ac.uk/~richard/utf-8.cgi) by pasting one of the "long dashes", you will see that it'll resolve an `EN DASH`. [See the result here](http://www.cogsci.ed.ac.uk/~richard/utf-8.cgi?input=%E2%80%93&mode=char) – Halim Qarroum Aug 15 '15 at 00:56
  • @HalimQarroum it was a `HYPHEN-MINUS` rather than a mix of `EN DASH` and `EM DASH`. Thanks for pointing out the issue (although at first I thought you were saying both dashes were the same character). I've updated my answer. Deleting some old comments here to avoid chat. – Mousey Aug 15 '15 at 01:24
2

This one works for me

$title = "Hunting, Tactical & Outdoor Optics eCommerce Store – $595,000 — SOLD";
$title = str_replace(html_entity_decode('&ndash;', ENT_COMPAT, 'UTF-8'), '-', $title);
$title = str_replace(html_entity_decode('&mdash;', ENT_COMPAT, 'UTF-8'), '-', $title);
Aminah Nuraini