We ran into a problem when someone pasted a string, copied from somewhere else, into one of our metadata XML files. The string contained the bytes 239 (0xEF), 191 (0xBF) and 189 (0xBD), which are outside the ASCII range.
We fixed the problem in the known file, but I would like to check whether a similar problem exists in any of the other XML files. The following command did not find anything:
grep '[^[:print:]]' <filename>
The following command looks promising, but it also matches other characters such as "<" and "/":
grep -e "\W" <filename>
Since these are XML files and the string in question is element text, I cannot use grep's -v option.
grep $'\xef' <filename>
The above command does mark the character, but it is too specific to be useful for going through some 30,000 files to find the problem.
Is there any way I can use grep to find problematic characters like the ones above? Most of the affected entries are business names, which are very unlikely to contain characters outside plain ASCII.
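
To make the goal concrete, here is a minimal sketch of the kind of search I have in mind. It assumes GNU grep (for -P, -r and --include) and treats any byte in the range 0x80 to 0xFF as suspicious, which is my own assumption and may be too broad if some files legitimately contain non-ASCII text. The function name find_non_ascii and its directory argument are hypothetical.

```shell
# Hypothetical helper: list the XML files under a directory that contain
# at least one non-ASCII byte. Assumes GNU grep (for -P and --include).
find_non_ascii() {
    dir="$1"
    # LC_ALL=C makes grep compare raw bytes, so 0xEF, 0xBF and 0xBD are
    # each matched by the byte range instead of being decoded as UTF-8.
    # -r recurses, -l prints only the names of matching files.
    LC_ALL=C grep -rlP '[\x80-\xFF]' --include='*.xml' "$dir"
}
```

Calling find_non_ascii on the top-level metadata directory should print one file name per line. Swapping -l for -n would print the matching lines with their line numbers instead, which may be easier to review.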