-1

When parsing a string read from a file on Windows 10, I end up with two leading characters that whitespace trims and similar calls do not remove.


Here is evidence of the culprit.

This breaks my regex ^(\w+) because the string apparently starts with a whitespace-like character. When I copy the value of the string (screenshot) into RegExr, for example, I see that a whitespace has been added, and that is why my regex will not work.


I already googled for "-1 -2 in UTF-8 string" but was not able to find anything, so I am quite confused by this.

xetra11
  • These screenshots are useless unless you're okay with a wild goose chase. Copy+paste the problematic string into your question. – MonkeyZeus Dec 07 '20 at 19:03

1 Answer

2

Your debugger is being silly for showing them as -1 and -2 (those are the bytes 0xFF and 0xFE displayed as signed 8-bit values), but it's clear enough that you're dealing with the UTF-16 BOM (not UTF-8 as you claim in the question; the UTF-8 BOM is a completely different 3-byte marker, EF BB BF).

Feel free to check for their presence and remove them if you encounter them at the beginning of a file, though ideally you should save your file without the BOM in the first place.
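Since the question does not say which language is in use, here is a minimal sketch in Python of what "check for their presence and remove them" could look like. The helper names strip_utf16_bom and as_signed are made up for illustration, and the sample data is synthetic, not taken from your file. It also shows why a whitespace trim does not help: the BOM decodes to U+FEFF, which is not treated as whitespace.

```python
import codecs
import re


def strip_utf16_bom(raw: bytes) -> bytes:
    """Return raw without a leading UTF-16 BOM, if one is present."""
    for bom in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE):
        if raw.startswith(bom):
            return raw[len(bom):]
    return raw


def as_signed(byte: int) -> int:
    """Show an unsigned byte (0..255) the way a signed-byte debugger would."""
    return byte - 256 if byte > 127 else byte


# Synthetic stand-in for the file contents: UTF-16 LE BOM followed by "hello".
raw = codecs.BOM_UTF16_LE + "hello".encode("utf-16-le")

# 0xFF and 0xFE viewed as signed bytes are -1 and -2.
signed_view = [as_signed(b) for b in raw[:2]]

# Decoding without handling the BOM leaves U+FEFF at the start of the string.
# strip() does not remove it, and \w does not match it.
with_bom = raw.decode("utf-16-le")
match_with_bom = re.match(r"^(\w+)", with_bom.strip())

# After dropping the BOM bytes, the regex from the question matches again.
cleaned = strip_utf16_bom(raw).decode("utf-16-le")
match_cleaned = re.match(r"^(\w+)", cleaned)
```

If your runtime offers a BOM-aware decoder (in Python, decoding with "utf-16" instead of "utf-16-le" consumes the BOM), that is usually simpler than stripping the bytes by hand.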

Blindy