
I need to remove the [LS] character. It only shows up when the text is pasted into Notepad++: the data was inserted with this character hidden, and it is visible only in an editor using UTF-8 encoding. I also need to remove icon characters such as the phone, email, and fax symbols.

I used the code below:

string.replaceAll("\\p{Cntrl}", "").replaceAll("[^\\p{Print}]", "");

but it also removes the Chinese characters, which should be kept. Is there any way to remove hidden characters and icon characters without removing language characters?

currarpickt
  • Could you add more examples? – Ethan Dec 08 '16 at 04:26
  • More information please. Was Unicode text pasted into Notepad++ and not displayed properly? Where do the highlighted LS characters in the image come from, and how were they produced? Possibly related: [How can I edit Unicode text in Notepad++?](http://superuser.com/questions/21135/how-can-i-edit-unicode-text-in-notepad) – traktor Dec 08 '16 at 05:02
  • The information came from an email: the user copied the whole email, pasted it into the application, and it was saved successfully. But when retrieving the data we got an exception. I then found that when I copy the data into Notepad++ with UTF-8 encoding, there is a strange character that is not visible in the database or in Notepad. So I guess there could be other characters aside from [LS]. – John Edward Delos Reyes Dec 08 '16 at 05:51

1 Answer


JavaScript or Java? Well, you said

Remove hidden character and special character (Java / JavaScript)

so I suppose a JavaScript solution is acceptable too. You can achieve it with a simple regex:

string.replace(/[\xa0\x00-\x09\x0b\x0c\x0e-\x1f\x7f]/g, '');

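It removes the non-breaking space and the ASCII control characters (other than line feed and carriage return), but not letters, digits, etc. Note that it does not match control or separator characters outside that range, such as the Unicode line separator U+2028, which is what Notepad++ displays as LS.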
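To also cover the U+2028 line separator and icon-like symbols while keeping Chinese text, one option is a regex built on Unicode property escapes. The sketch below is only an illustration under a few assumptions: the engine supports ES2018 `\p{...}` escapes with the `u` flag, the "iconic" characters fall in the general category So (Other Symbol), and tab, line feed and carriage return should be kept. The name `stripHiddenAndIcons` is made up for this example.

```javascript
// Hypothetical helper; the name and the chosen categories are assumptions.
// Cc = control characters, Cf = format characters (zero-width space etc.),
// Zl = line separator (U+2028), Zp = paragraph separator (U+2029),
// So = other symbols (e.g. U+260E telephone, U+2709 envelope).
// The negative lookahead keeps tab, line feed and carriage return.
const HIDDEN_OR_ICON = /(?![\t\n\r])[\p{Cc}\p{Cf}\p{Zl}\p{Zp}\p{So}]/gu;

function stripHiddenAndIcons(text) {
  // Letters (including CJK ideographs), digits, ordinary spaces and
  // punctuation are in none of the categories above, so they are kept.
  return text.replace(HIDDEN_OR_ICON, '');
}

// Usage: U+2028, the two symbols and the NUL character are dropped.
const cleaned = stripHiddenAndIcons('电话\u2028\u260E 123\u0000\n中文\u2709');
console.log(JSON.stringify(cleaned));
```

Whether So is the right set depends on the data: it also removes characters such as © and °, and it does not remove a variation selector (U+FE0F) that may follow an emoji-style symbol.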

  • @JohnEdwardDelosReyes. How did it not work? Are you trying to do it in JavaScript or Java? Did it throw any errors? Also, which character codes (in Unicode) do you want to remove exactly? –  Dec 08 '16 at 15:02