Awk, little endian order and 4 hex digits

Question

Suppose I have a decimal number, e.g. 97254, which I convert to its hex value 00017BE6 using:

echo "" | awk '{printf("%08X", 97254)}'

Now I want to split the hex number (00017BE6 in this case) into 4 groups of 2 digits (the input has at most 8 digits), in little-endian order and CSV format, i.e. (target):

E6,7B,01,00

How can I do this using only Awk (for example, with an ad hoc function that returns the value)?

With:

awk '{for (i=1;i<=7;i=i+2) array[i]=substr($1,i,2)} END{for(a in array) print array[a]}'

I have:

Using:

awk '{for (i=1;i<=7;i=i+2) array[i]=substr($1,i,2)} END{for(a in array) print array[a]}' ORS=","

I have:

00,01,7B,E6,

But how do I remove the last comma and convert it to little-endian order?

How do I go about this?

Any ideas?

At the top you ask for `E6,7B,01,00` as the desired output, which is consistent with the _little_ endianness mentioned in the title; at the bottom you ask about removing the trailing `,` from `00,01,7B,E6,` which implies _big_ endianness - which do you want? — mklement0, May 04 '15 at 03:07
Please note that using `for(a in array)` will _not_ enumerate the array indices in numerical order, because all arrays in `awk` are _associative_ arrays, and enumeration happens unpredictably by the internal _hash_ values of the indices (keys). For arrays with numerical keys, use `for (i=1; i<=length(array); ++i)`. — mklement0, May 04 '15 at 06:51
Well, thank you for your advice. It had entirely slipped my mind! — mikilinux, May 04 '15 at 08:39

John1024 · Accepted Answer · 2015-05-04T05:54:35.353

$ echo 00017BE6 | awk '{for (i=7;i>=1;i=i-2) printf "%s%s",substr($1,i,2),(i>1?",":"\n")}'
E6,7B,01,00

Using sprintf, we can start with the decimal number:

$ echo 97254 | awk '{hex=sprintf("%08X",$1); for (i=7;i>=1;i=i-2) printf "%s%s",substr(hex,i,2),(i>1?",":"\n");}'
E6,7B,01,00

How it works

for (i=7;i>=1;i=i-2)

This starts a loop over index i in which we count down from 7 to 1 in steps of 2, so i takes the values 7, 5, 3 and 1.

printf "%s%s",substr($1,i,2),(i>1?",":"\n")

This prints the desired substring followed by a comma or a newline. The construct i>1?",":"\n" is awk's ternary (conditional) expression. It returns a comma if i>1 and a newline otherwise.
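
Since the question mentions an ad hoc function with a return value, the same loop can also be wrapped in a user-defined awk function that builds and returns the string instead of printing it piece by piece. The following is only a minimal sketch: the names `le_bytes` (shell wrapper) and `hex2le` (awk function) are made up for illustration, and it assumes a non-negative decimal input that fits in 8 hex digits.

```shell
# le_bytes is a hypothetical shell wrapper, used here only to make the
# awk program easy to call; it is not part of the original answer.
le_bytes() {
  # $1: non-negative decimal number that fits in 8 hex digits (assumption)
  awk -v n="$1" '
    # hex2le (hypothetical name) returns the 2-digit pairs of an
    # 8-digit hex string in reverse order, joined by commas.
    # i and out are declared as extra parameters to make them local.
    function hex2le(hex,    i, out) {
      out = ""
      for (i = 7; i >= 1; i -= 2)
        out = out substr(hex, i, 2) (i > 1 ? "," : "")
      return out
    }
    BEGIN { print hex2le(sprintf("%08X", n)) }
  '
}

le_bytes 97254   # prints E6,7B,01,00
```

Because the function returns the string, the result can be stored in a variable or combined with other output before printing.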

edited May 04 '15 at 05:54

answered May 03 '15 at 22:57

I am not 100% sure, but `sprintf` should work on most `awk` implementations. It works at least on my old `busybox awk` (`BusyBox v1.19.4`). – Jotne May 04 '15 at 05:52
@Jotne Thanks for that. I just checked and [`sprintf` is POSIX](http://pubs.opengroup.org/onlinepubs/009695399/utilities/awk.html#tag_04_06_13_13). I updated the answer to remove the equivocation. – John1024 May 04 '15 at 05:56
Thank you very much everyone for your suggestion. – mikilinux May 04 '15 at 08:42

mklement0 · Answer 2 · 2015-05-04T06:13:52.170

You've asked for an awk command, but consider this generic bash function, which uses printf, sed, and tac / tail -r internally, and works on both BSD (including OSX) and Linux systems:

# SYNOPSIS
#   toHexBytes num [numBytes [littleEndian]]
# DESCRIPTION
#   Prints the bytes that num is composed of in hex. format separated by commas.
#   NUM can be in decimal, hexadecimal, or octal format.
#   NUMBYTES specifies the minimum number of *bytes* to output - defaults to *4*.
#   Specify 0 to only output as many bytes as needed to represent NUM, '' to 
#   represent the default when also specifying LITTLEENDIAN.
#   By default, the bytes are printed in BIG-endian order; if LITTLEENDIAN is nonzero,
#   the bytes are printed in LITTLE-endian order.
# PLATFORM SUPPORT
#   BSD and Linux platforms
# EXAMPLES
#   toHexBytes 256      # -> '00,00,01,00'
#   toHexBytes 256 '' 1 # -> '00,01,00,00'
#   toHexBytes 0x100 0  # -> '01,00'
toHexBytes() {
  local numIn=$1 numBytes=${2:-4} littleEndian=${3:-0} numHex revCmd padCount
  # Convert to hex.
  printf -v numHex '%X' "$numIn"
  # Determine number of 0s that must be prepended.
  padCount=$(( numBytes * 2 - ${#numHex} ))
  (( padCount < 0 && ${#numHex} % 2 )) && padCount=1
  # Apply 0-padding, if needed.
  (( padCount )) && printf -v numHex "%0$(( padCount + ${#numHex} ))X" "0x$numHex" 
  if (( $littleEndian )); then # LITTLE-endianness
    # Determine command to use for reversing lines.
    [[ $(command -v tac) ]] && revCmd='tac' || revCmd='tail -r'
    # Insert a newline after every 2 digits, except for the last,
    # then reverse the resulting lines,
    # then read all resulting lines and replace all but the last newline
    # with ','.
    sed 's/../&\'$'\n''/g; s/\n$//' <<<"$numHex" | 
      $revCmd |
        sed -e ':a' -e '$!{N;ba' -e '}; s/\n/,/g'
  else # BIG-endianness
    # Insert ',' after every 2 digits, except for the last pair.
    sed 's/../&,/g; s/,$//' <<<"$numHex"
  fi
}

Applied to your example number:

$ toHexBytes 97254 4 1 # 4 bytes, LITTLE-endian
E6,7B,01,00

$ toHexBytes 97254 # 4 bytes, BIG-endian
00,01,7B,E6
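
For comparison, roughly the same behavior can be sketched with awk alone, using arithmetic (remainder and integer division by 256) instead of string slicing and external tools. This is an illustration under stated assumptions, not a drop-in replacement for the function above: `to_hex_bytes` is a hypothetical name, the input must be a non-negative decimal integer (hex or octal input is not handled), and it must be small enough to be represented exactly by awk's floating-point numbers.

```shell
# to_hex_bytes is a hypothetical name; arguments mirror toHexBytes:
#   $1: non-negative decimal integer (decimal only, by assumption)
#   $2: minimum number of bytes to print (default 4, '' also means default)
#   $3: non-zero for little-endian output (default 0 = big-endian)
to_hex_bytes() {
  awk -v n="$1" -v min="${2:-4}" -v le="${3:-0}" '
    BEGIN {
      out = ""; count = 0
      # Peel off the low-order byte until the number is used up and
      # the minimum byte count is reached; always print at least one byte.
      while (n > 0 || count < min || count == 0) {
        byte = sprintf("%02X", n % 256)
        if (count == 0)  out = byte
        else if (le)     out = out "," byte   # low-order byte first
        else             out = byte "," out   # high-order byte first
        n = int(n / 256)
        count++
      }
      print out
    }
  '
}

to_hex_bytes 97254 4 1   # prints E6,7B,01,00
to_hex_bytes 97254       # prints 00,01,7B,E6
```

Working byte by byte means no padding step is needed: the loop simply keeps emitting 00 bytes until the minimum count is reached.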
