We need to import CSV files into MySQL that contain incorrectly written umlauts. For example, instead of Ü (ASCII 154), someone with a non-German keyboard entered U (ASCII 85) and added the two dots on top using ASCII 249, which looked the same to him. MySQL writes this to the DB as U?.

That is why we want PHP to detect invalid non-ASCII character combinations, such as this pairing of a printable ASCII character with an extended ASCII character, which does not occur in real text, at least not in the major languages.
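To make the target concrete, here is a minimal sketch of the narrowest version of the check: a vowel immediately followed by byte 249. It assumes the files use a single-byte DOS code page in which byte 154 is Ü and byte 249 is a standalone diaeresis (the numbers above match code page 850, but that is an assumption). The function name and the byte map are hypothetical, and the sketch does not cover other invalid combinations.

```php
<?php
// Hypothetical helper. Assumes single-byte input (e.g. CP850), NOT UTF-8.
function fix_fake_umlauts(string $bytes): string
{
    // Base letter => single-byte umlaut under the assumed code page.
    $map = [
        'A' => "\x8E", 'O' => "\x99", 'U' => "\x9A",
        'a' => "\x84", 'o' => "\x94", 'u' => "\x81",
    ];

    // No /u modifier on purpose: the pattern has to match raw bytes,
    // and \xF9 (decimal 249) is not valid UTF-8 on its own.
    return preg_replace_callback(
        '/([AOUaou])\xF9/',
        function (array $m) use ($map): string {
            return $map[$m[1]];
        },
        $bytes
    );
}
```

Because the pattern only matches a vowel followed by byte 249, a correctly entered single-byte umlaut is left untouched. The function would have to run on the raw file contents before any charset conversion.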
The preg_replace patterns we have tried either fail to detect this or also match valid umlauts. Is there any chance of succeeding with preg_replace, or is there another way?