
We faced an issue when someone pasted a string copied from somewhere into one of the metadata XML files. The string contained the bytes 239 (0xEF), 191 (0xBF), 189 (0xBD), which are outside the ASCII range.

We fixed the problem in the known file, but I would like to check whether a similar problem exists in any of the other XML files. The following command did not find anything:

grep '[^[:print:]]' <filename>

The following command looks promising, but it also matches other characters such as "<" and "/":

grep -e "\W" <filename>

Since these are XML files and the string in question is element text, I cannot use grep's -v option.

grep $'\xef' <filename>

The above command does highlight the character, but it is too specific to run across 30,000-odd files to find the problem.

Is there any way I can use grep to find problematic characters like the ones above? Most of the affected entries are business names, which are very unlikely to contain unusual non-ASCII characters.
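
To make the goal concrete, here is a rough sketch of the kind of search I am after. It assumes GNU grep (the -r, -a and --include options are GNU extensions), and the function names find_non_ascii and find_replacement_char are made up for illustration. The idea is that forcing LC_ALL=C makes grep work on raw bytes, so [:print:] should cover only printable ASCII and any byte from 0x80 upwards should be reported. I have not verified this on every grep version.

```shell
#!/bin/sh

# Hypothetical helper: print file:line:text for every line in *.xml files
# under directory $1 that contains a byte which is neither printable ASCII
# nor ASCII whitespace.
# LC_ALL=C makes grep treat the input as single bytes, so bytes >= 0x80
# fall outside [:print:]. -a stops grep from reporting "Binary file matches".
find_non_ascii() {
    LC_ALL=C grep -rna --include='*.xml' '[^[:print:][:space:]]' "$1"
}

# Narrower variant: list only the names of *.xml files under directory $1
# that contain the exact byte sequence EF BF BD (octal 357 277 275).
find_replacement_char() {
    LC_ALL=C grep -rlF --include='*.xml' "$(printf '\357\277\275')" "$1"
}
```

Something like `find_non_ascii /path/to/metadata` would then list candidate lines, where /path/to/metadata is a placeholder for the real directory. Legitimate accented names would show up too, so the output would still need a manual review.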

user871199
