If the strings all have the same number of digits and the same letter case, as is true in your sample data, you can just sort as-is. The default lexical sort will put them in the right order because the hex digits 0-9a-f occur in that order in standard character sets.
If you might have some numbers with different numbers of digits, or if you have mixed upper and lower case on the alphabetic digits, then your best bet is probably to convert to decimal, sort numerically, and convert back. If you have the GNU version of awk, you can use it to do the conversion:
awk -v{,O}FS=, '{$1=strtonum("0x"$1)}1' $filename |
sort -t, -n -k1,1 |
awk -v{,O}FS=, '{$1=sprintf("%x",$1)}1' >$new_filename
Run on your sample input, I get this output:
570e2e5c,1460539517,SOM3-String-123,08-5a-0c-59
570e2e5e,1460539518,SOM3-String-123,08-00-0c-59
570e2e60,1460539519,SOM3-String-123,09-00-68-22
570e2e81,1460539520,SOM3-String-123,08-00-0c-59
570e2e87,1460539521,SOM3-String-123,09-5e-6b-22
570e2e90,1460539522,SOM3-String-123,08-00-0c-59
570e2e95,1460539523,SOM3-String-123,09-00-67-22
Explanation:
-v name=value
tells awk
to set a variable that will exist in the context of the program; this is a handy way to inject values without having to deal with the quoting complexities that accompany string interpolation into code. But some variable names are special; FS
tells awk
what (F)ield (S)eparator to use to split the input lines into fields, and OFS
tells it what (O)utput field separator to use when printing lines back out. The -v{,O}FS=,
sequence is just a shortcut to set both variables to the same value; it gets expanded by the shell's brace-expansion into -vFS=, -vOFS=,
. So within the awk program, each line will be pre-split on commas into fields, with the individual field values in the variables $1
, $2
, and so on. When those variables are changed, they will be joined back together with commas when printed back out.
The code in the string passed to awk
after the -v
options is the awk
program to run. Such a program is a sequence of (condition, block) pairs; each condition is checked against each line of input, and then the block is evaluated if the condition is true. If a block appears without a condition, it is run on every line; if a condition appears without a block, it causes the current line – or the fields of the current line joined by OFS, if other blocks have made modifications – to be printed out.
The programs above use both defaults; each starts with a code block that has no condition, so it is run on every line, and ends with a condition that has no code block: 1, which is always true, so every line is printed back out after the code block has made its changes.
The first awk
program uses the strtonum
function to convert the first field to decimal. This is why this solution only works in GNU awk; BSD awk (which is also the one that ships on macOS) does not have strtonum
. We have to prepend "0x" to the field value so that strtonum
will know to treat it as hexadecimal, but once it is a numeric value, it will be printed out as decimal. So the output of the awk
is the same as the input except with the first field converted to decimal.
We feed that to sort
, tell it to sort numerically (-n
) by that first (-k1,1
) comma-separated (-t,
) field, and then feed the sorted output into the second awk
, which uses the sprintf
function to convert the numeric first field back to hex.