13

I have a string with some non-printable ASCII characters in it, something like:

"ABCD\x09\x05\r\n"

I want to replace these characters with an ASCII string representation of their hex codes, so I get something like this:

"ABCD[09][05][0D][0A]"

What's the best way to do this? Can a regex be used?

Kevin Panko
user380689

3 Answers

24

The pattern \p{Cc} matches any control character, so

Regex.Replace(input,
              @"\p{Cc}", 
              a=>string.Format("[{0:X2}]", (byte)a.Value[0])
            );

would also replace control characters.
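
For anyone who wants to paste this into a project, here is a minimal self-contained sketch of the same idea. The class and method names (ControlCharEscaper, Escape) are hypothetical and not part of the answer above. The cast uses int rather than byte, which should give the same output for every character \p{Cc} matches (U+0000 to U+001F and U+007F to U+009F).

```csharp
using System;
using System.Text.RegularExpressions;

public static class ControlCharEscaper
{
    // \p{Cc} is the Unicode "Control" general category.
    private static readonly Regex ControlChars = new Regex(@"\p{Cc}");

    // Replaces each control character with "[XX]", its code as two hex digits.
    public static string Escape(string input)
    {
        return ControlChars.Replace(
            input,
            m => string.Format("[{0:X2}]", (int)m.Value[0]));
    }

    public static void Main()
    {
        Console.WriteLine(Escape("ABCD\x09\x05\r\n")); // ABCD[09][05][0D][0A]
    }
}
```

Keeping the Regex in a static field avoids re-parsing the pattern on every call; whether that matters depends on how often the method runs.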

drf
  • nice. simple and easy to follow. – user380689 Aug 12 '11 at 04:03
  • Note that you can't easily reverse this back into the original message, as "ABCD[09]\x09\x05\r\n" would result in "ABCD[09][09][05][0D][0A]". You'd want to escape the opening bracket (typically as "[["), so that my example would become "ABCD[[09][09][05][0D][0A]". – Tom West Dec 19 '16 at 21:27
  • @Tom West: That doesn't solve the problem. To reverse your escaped test, you'd also have a problem if the source contains "[[09]]", for example. There is no algorithm I can think of that would allow for a forward-character string replacement as well as a reverse-to-source process using a single string. – Jazimov Jul 22 '19 at 15:16
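
On the reversibility question raised in these comments, one scheme that seems to round-trip is sketched below. It is not something proposed in the thread: it treats the literal '[' as one more character to encode (as "[5B]"), so every '[' left in the output begins an escape. The names ReversibleEscaper, Encode and Decode are hypothetical, and Decode assumes its input was produced by Encode.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class ReversibleEscaper
{
    // Encode control characters and the literal '[' itself as "[XX]".
    public static string Encode(string input)
    {
        return Regex.Replace(
            input,
            @"[\p{Cc}\[]",
            m => string.Format("[{0:X2}]", (int)m.Value[0]));
    }

    // In encoded text every '[' starts a "[XX]" escape, so each match
    // can be turned back into the single character it stands for.
    public static string Decode(string encoded)
    {
        return Regex.Replace(
            encoded,
            @"\[([0-9A-F]{2})\]",
            m => ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber)).ToString());
    }

    public static void Main()
    {
        string original = "ABCD[09]\x09\x05\r\n";
        string encoded = Encode(original);
        Console.WriteLine(encoded);                     // ABCD[5B]09][09][05][0D][0A]
        Console.WriteLine(Decode(encoded) == original); // True
    }
}
```

The cost is that bracket-heavy text gets longer and less readable, which may or may not be acceptable for a logging or display use case.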
8
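// Requires: using System.Linq; using System.Text;
string s = "ABCD\x09\x05\r\n";
var replace = 
    // Wrap the two-digit hex code in brackets to match the requested "[09]" format.
    s.Select(c => Char.IsControl(c) ? "[" + ((int)c).ToString("X2") + "]" : c.ToString())
     .Aggregate(new StringBuilder(), (sb, t) => sb.Append(t))
     .ToString();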

Sorry, no compiler handy, but I think this compiles and does the job.

Also, this looks like it walks the string twice (once to project each character to a hex replacement or a string, and then again to aggregate), but because Select is lazily evaluated, each character is streamed through both steps in a single pass. The real overhead is a small temporary string per character. You can avoid it by lumping the projection into the call to Enumerable.Aggregate, but this version is clearer, and it probably doesn't matter unless this is performance-critical.
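
The variant that lumps the projection into Enumerable.Aggregate might look like the following sketch. It uses the bracketed "[XX]" format from the question, appends ordinary characters directly to the StringBuilder, and uses hypothetical names (AggregateEscaper, EscapeWithAggregate) that do not appear in the answer.

```csharp
using System;
using System.Linq;
using System.Text;

public static class AggregateEscaper
{
    public static string EscapeWithAggregate(string s)
    {
        // The accumulator is the StringBuilder itself; each step appends
        // either the character or its bracketed two-digit hex code.
        return s.Aggregate(
                    new StringBuilder(),
                    (sb, c) => char.IsControl(c)
                        ? sb.AppendFormat("[{0:X2}]", (int)c)
                        : sb.Append(c))
                .ToString();
    }

    public static void Main()
    {
        Console.WriteLine(EscapeWithAggregate("ABCD\x09\x05\r\n")); // ABCD[09][05][0D][0A]
    }
}
```

A plain foreach loop over the string with the same StringBuilder would do the same work and may be easier to read; the Aggregate form is shown only because the answer mentions it.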

jason
4

Inspired by Jason's example, but a bit simpler. I'm not sure which performs better, and don't have the time to benchmark it right now, but it should do everything in just one pass:

string s = "ABCD\x09\x05\r\n";
string replace = String.Concat(s.Select(c => Char.IsControl(c) ?
                                             String.Format("[{0:X2}]", (int)c) :
                                             c.ToString()));

I've tested this for functionality.

Frank Szczerba