3

Inside a shell script, I am grepping the output of a command and storing the result in a variable.

There is a rare corner case where this variable might contain non-ASCII characters because of the parsing logic used by grep.

Question: How do I remove these non-ascii characters from this variable inside the shell script, so that I can use the variable in the subsequent commands?

Sudar

2 Answers

9

If you're using bash, and if your variable is called var, then

"${var//[^[:ascii:]]/}"

will expand to var with all non-ascii characters removed. So:

var_non_ascii=${var//[^[:ascii:]]/}

should do. (This is probably the most efficient method: bash performs the substitution itself, so there is no subshell and no fork to an external process.)
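As a quick check of that expansion, here is a minimal sketch. The sample value and the names `var` and `var_ascii` are made up for illustration, and the value is built with `$'...'` quoting (UTF-8 bytes for "résumé.txt") so the script itself stays pure ASCII:

```bash
#!/usr/bin/env bash
# Hypothetical sample: "résumé.txt" encoded as UTF-8 bytes (é = 0xC3 0xA9).
var=$'r\xc3\xa9sum\xc3\xa9.txt'

# Delete every character that is not in bash's [:ascii:] class.
# The double slash replaces all matches; the empty replacement removes them.
var_ascii=${var//[^[:ascii:]]/}

printf '%s\n' "$var_ascii"
```

This should print rsum.txt. Keep in mind that `${var//pattern/}` and the `[:ascii:]` class are bash features, so this is not expected to carry over to a plain POSIX sh.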

gniourf_gniourf
4

Assuming your variable is var, try this:

var=$(printf '%s' "$var" | sed 's/[^\x00-\x7F]//g')

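This should remove the non-ASCII characters. Note that the \xHH escapes are a GNU sed extension, so this may not work with other sed implementations.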
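If GNU sed is not available, one possible alternative (a swapped-in technique using tr, not the command above) is to delete every byte outside the 0-127 range. This is only a sketch; the sample value is hypothetical and again built from UTF-8 bytes:

```bash
#!/usr/bin/env bash
# Hypothetical sample: "résumé.txt" encoded as UTF-8 bytes.
var=$'r\xc3\xa9sum\xc3\xa9.txt'

# LC_ALL=C makes tr treat the input as single bytes.
# -c complements the set \000-\177 (the ASCII range), -d deletes what matches,
# so every byte above 0x7F is dropped.
var=$(printf '%s' "$var" | LC_ALL=C tr -cd '\000-\177')

printf '%s\n' "$var"
```

Like the sed version, this still forks an external process, so it mainly makes sense when the script has to run under a shell other than bash.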

Guru
  • I am selecting the other answer since that doesn't involve an external process. Upvoted though :) – Sudar Dec 21 '12 at 13:58