Background: I have a log file containing hex dumps that I want to convert with xxd to get the ASCII column that shows possible strings in the binary data.

The log file format looks like this:

My interesting hex dump:
00 53 00 6f 00 6d 00 65 00 20 00 74 00 65 00 78
00 74 00 20 00 65 00 78 00 61 00 6d 00 70 00 6c
00 65 00 20 00 75 00 73 00 69 00 6e 00 67 00 20
00 55 00 54 00 46 00 2d 00 31 00 36 00 20 00 69
00 6e 00 20 00 6f 00 72 00 64 00 65 00 72 00 20
00 74 00 6f 00 20 00 67 00 65 00 74 00 20 00 30
00 78 00 30 00 30 00 20 00 62 00 79 00 74 00 65
00 73 00 2e

Visually selecting the hex dump and running xxd -r -p on it, followed by xxd -g1 on the result, does exactly what I'm aiming for. However, since there are quite a few dumps to convert, I would rather automate the process. So I'm using the following substitute command to do the conversion:

:%s/\(\x\{2\} \?\)\{16\}\_.*/\=system('xxd -g1',system('xxd -r -p',submatch(0)))

The expression matches the entire hex dump in the log file. The match is sent to xxd -r -p as stdin and its output is used as stdin for xxd -g1. Well, that's the idea at least.

The thing is that the above almost works. It produces the following result:

My interesting hex dump:
00000000: 01 53 01 6f 01 6d 01 65 01 20 01 74 01 65 01 78  .S.o.m.e. .t.e.x
00000010: 01 74 01 20 01 65 01 78 01 61 01 6d 01 70 01 6c  .t. .e.x.a.m.p.l
00000020: 01 65 01 20 01 75 01 73 01 69 01 6e 01 67 01 20  .e. .u.s.i.n.g. 
00000030: 01 55 01 54 01 46 01 2d 01 31 01 36 01 20 01 69  .U.T.F.-.1.6. .i
00000040: 01 6e 01 20 01 6f 01 72 01 64 01 65 01 72 01 20  .n. .o.r.d.e.r. 
00000050: 01 74 01 6f 01 20 01 67 01 65 01 74 01 20 01 30  .t.o. .g.e.t. .0
00000060: 01 78 01 30 01 30 01 20 01 62 01 79 01 74 01 65  .x.0.0. .b.y.t.e
00000070: 01 73 01 2e                                      .s..

All 00 bytes have mysteriously transformed into 01. It should have produced the following:

My interesting hex dump:
00000000: 00 53 00 6f 00 6d 00 65 00 20 00 74 00 65 00 78  .S.o.m.e. .t.e.x
00000010: 00 74 00 20 00 65 00 78 00 61 00 6d 00 70 00 6c  .t. .e.x.a.m.p.l
00000020: 00 65 00 20 00 75 00 73 00 69 00 6e 00 67 00 20  .e. .u.s.i.n.g. 
00000030: 00 55 00 54 00 46 00 2d 00 31 00 36 00 20 00 69  .U.T.F.-.1.6. .i
00000040: 00 6e 00 20 00 6f 00 72 00 64 00 65 00 72 00 20  .n. .o.r.d.e.r. 
00000050: 00 74 00 6f 00 20 00 67 00 65 00 74 00 20 00 30  .t.o. .g.e.t. .0
00000060: 00 78 00 30 00 30 00 20 00 62 00 79 00 74 00 65  .x.0.0. .b.y.t.e
00000070: 00 73 00 2e                                      .s..

What am I not getting here?

Of course I can use macros and other ways of doing this, but I want to understand why my substitution command doesn't do what I expect.

Edit:

For anyone who wants to achieve the same thing, here is the substitution expression that works on an entire file. The expression above was only for testing with the example log file shown earlier. The one below performs a correct conversion and is based on the information Kent provided in his answer.

:%s/\(\(\x\{2\} \)\{16\}\_.\)\+/\=system('xxd -p -r | xxd -g1',submatch(0))
Roger

1 Answer

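Very likely the problem is the string conversion in system(). Vim converts the input to a string, and the output of your first xxd command is converted to a string as well. A Vim string cannot contain NUL bytes, and :h system() notes that NUL characters in the result are replaced with SOH (0x01), which matches the 01 bytes you see.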
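
To make the effect of that conversion concrete, here is a small shell sketch that runs outside Vim. It assumes, per :h system(), that NUL bytes in the result are replaced with SOH (0x01). The tr call only stands in for that replacement and is not what Vim runs internally, and the variable names are made up for illustration.

```shell
# Two UTF-16BE characters, "So": the bytes 00 53 00 6f.
# od prints the bytes as hex; the trailing tr strips od's spacing.
raw=$(printf '\000S\000o' | od -An -tx1 | tr -d ' \n')

# tr stands in for the NUL -> SOH replacement (an assumption about
# what happens when command output is stored in a Vim string).
converted=$(printf '\000S\000o' | tr '\000' '\001' | od -An -tx1 | tr -d ' \n')

echo "raw:       $raw"        # 0053006f
echo "converted: $converted"  # 0153016f
```

The second value shows the same 00 to 01 change as the unexpected dump in the question, while every other byte is untouched.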

You can try extracting the hex part into a file and then running:

xxd -r -p theFile|vim -

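If you then call system('xxd -g1', alltext) on the text in that buffer, you will get something other than 00 as well.

This doesn't work the same way as a shell pipe (xxd ... | xxd ...), where the bytes go directly from one process to the next. Note that system() passes the command to the shell, so a pipeline inside the command string works too, as the edit to the question shows.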

If you want to fix your :s command, you can call systemlist() for the first xxd call so that the data comes back as a list of lines rather than a single string, then pass that list to the second xxd:

:%s/\(\x\{2\} \?\)\{16\}\_.*/\=system('xxd -g1',systemlist('xxd -r -p',submatch(0)))

The command above will produce the 00 bytes, since there is no string conversion.

However, when working with data that is not plain text, it may be simpler to use a filter instead of calling system(). For your example:

2,$!xxd -r -p|xxd -g1
Kent
  • Yep, I'm also quite certain the string conversion is to blame. `:h system` says that if the input is given as a string it is written to a file. I'm not clear on whether that happens in this case or not and, if it does, whether the file is binary or not. Operating on the range and doing `!xxd -r -p|xxd -g1` works as you suggested. I'm still curious what happens behind the scenes with the "string conversion" though. – Roger Jun 11 '20 at 09:34
  • Didn't notice you edited your answer. Using `systemlist` works like a charm. Actually, using `system('xxd -r -p|xxd -g1')` also works, even though I was under the impression it should not (Vim ver 8.0 on Xubuntu 18.04). Thanks a lot for the most valuable input! – Roger Jun 11 '20 at 09:57