$ foo="1,2,3,6,7,8,11,13,14,15,16,17"

In shell, how can I group the numbers in $foo into ranges, so that the result is 1-3,6-8,11,13-17?

anubhava
rodee
  • Generally speaking, one-liners compromise something. Maybe that something is readability, maybe it's correctness, maybe it's corner-case handling -- but there's pretty much always a tradeoff. – Charles Duffy Sep 19 '17 at 14:58
  • BTW -- have you tried doing this yourself? Do you have any code reflecting that attempt? Where did you get stuck? – Charles Duffy Sep 19 '17 at 14:59
  • I'm not good at shell; I tried to convert https://stackoverflow.com/questions/15867557/finding-gaps-sequential-numbers and failed :( – rodee Sep 19 '17 at 15:03
  • Huh. The code given there will emit a range of size 2 as 6-7 instead of 6,7. Which of those behaviors do you prefer? – Charles Duffy Sep 19 '17 at 15:16
  • 6-7 is preferred – rodee Sep 19 '17 at 15:19
  • Ahh. I was actually going out of my way to make it `6,7` in that case (since I think that makes more sense -- why use span logic when it doesn't save any characters?), but that's not essential. – Charles Duffy Sep 19 '17 at 15:21
  • ...that said, if you don't want that logic, you can just comment out the `elif` condition that creates it in `end_range`. – Charles Duffy Sep 19 '17 at 15:22

3 Answers


Given the following function:

build_range() {
  local range_start= range_end=
  local -a result

  end_range() {
      : range_start="$range_start" range_end="$range_end"
      [[ $range_start ]] || return
      if (( range_end == range_start )); then
        # single number; just add it directly
        result+=( "$range_start" )
      elif (( range_end == (range_start + 1) )); then
        # emit 6,7 instead of 6-7
        result+=( "$range_start" "$range_end" )
      else
        # larger span than 2; emit as start-end
        result+=( "$range_start-$range_end" )
      fi
      range_start= range_end=
  }

  # start with an empty range and an empty result list;
  # the first number seen in the loop initializes both values
  range_start= range_end=
  result=( )
  for number; do
    : number="$number"
    if ! [[ $range_start ]]; then
      range_start=$number
      range_end=$number
      continue
    elif (( number == (range_end + 1) )); then
      (( range_end += 1 ))
      continue
    else
      end_range
      range_start=$number
      range_end=$number
    fi
  done
  end_range
  (IFS=,; printf '%s\n' "${result[*]}")
}

...called as follows:

# convert your string into an array
IFS=, read -r -a numbers <<<"$foo"

build_range "${numbers[@]}"

...we get the output:

1-3,6-8,11,13-17
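
If the 6-7 form is preferred for a run of exactly two numbers, as the asker says in the comments on the question, the special case can be dropped. The following is only a minimal sketch under that assumption, not the function above: `build_range_dash` is a hypothetical name, and the logic is condensed so that every run of two or more numbers is emitted as start-end.

```shell
#!/usr/bin/env bash
# Hypothetical variant (name and layout are assumptions for illustration):
# any run of two or more consecutive numbers is emitted as start-end,
# so 6,7 becomes 6-7.
build_range_dash() {
  local start= end= number
  local -a result=()

  for number; do
    if [[ -z $start ]]; then
      start=$number end=$number          # first number opens a run
    elif (( number == end + 1 )); then
      end=$number                        # extend the current run
    else
      # close the current run, then open a new one
      if (( start == end )); then
        result+=( "$start" )
      else
        result+=( "$start-$end" )
      fi
      start=$number end=$number
    fi
  done

  # flush the final run, if any input was given
  if [[ -n $start ]]; then
    if (( start == end )); then
      result+=( "$start" )
    else
      result+=( "$start-$end" )
    fi
  fi

  # join the pieces with commas in a subshell so IFS is not changed globally
  (IFS=,; printf '%s\n' "${result[*]}")
}

IFS=, read -r -a demo_numbers <<<"1,2,3,6,7,11"
build_range_dash "${demo_numbers[@]}"
```

Like the original, this assumes the input is already sorted and contains only integers.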
Charles Duffy
  • Why do you use `:` in front of assignments? – Benjamin W. Sep 19 '17 at 15:28
  • @BenjaminW., those are debug statements for use with `set -x`, so one can see the values when tracing the code. – Charles Duffy Sep 19 '17 at 15:43
  • I'm not sure I understand. I see that you need that for, say, `: ${param='value'}` so you don't execute the expansion as a command, but for debugging? I can see what the line does also without `:` when I run with `set -x`. – Benjamin W. Sep 19 '17 at 16:09
  • @BenjaminW., it does *nothing* without `set -x` -- that's the point. It's a noop, not an assignment, until you need to see variables' values in your trace. The `:` is intended to make that clear to the reader (that it's a statement with no side effects, which could be commented out or removed without effect). – Charles Duffy Sep 19 '17 at 16:13
  • Oh, I get it now. I use bashdb ;) – Benjamin W. Sep 19 '17 at 16:45
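
The `:` idiom from the comment thread above can be seen in isolation in a small sketch. `demo_trace` and `count` are hypothetical names used only for illustration. Because `:` is the command and `count="$count"` is merely an argument to it, nothing is assigned; the expanded value only becomes visible in the `set -x` trace on stderr.

```shell
#!/usr/bin/env bash
# demo_trace is a hypothetical function, for illustration only.
demo_trace() {
  local count=3
  : count="$count"      # no-op: the argument is expanded, then discarded
  echo "count is still $count"
}

demo_trace              # without tracing, the ":" line produces no output
( set -x; demo_trace )  # with tracing, the expanded ":" line appears on stderr
```

Running the traced call in a subshell keeps `set -x` from staying enabled for the rest of the script.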

awk solution for an extended sample:

foo="1,2,3,6,7,8,11,13,14,15,16,17,19,20,33,34,35"

awk -F',' '{
                r = nxt = 0; 
                for (i=1; i<=NF; i++) 
                    if ($i+1 == $(i+1)){ if (!r) r = $i"-"; nxt = $(i+1) } 
                    else { printf "%s%s", (r)? r nxt : $i, (i == NF)? ORS : FS; r = 0 }
           }' <<<"$foo"

The output:

1-3,6-8,11,13-17,19-20,33-35
RomanPerekhrest

As an alternative, you can use this awk command:

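$> cat series.awk
# print the pending range (s, or s-p), followed by the given delimiter
function prnt(delim) {
   printf "%s%s", s, (p > s ? "-" p : "") delim
}
BEGIN {
   RS=","
}
NR==1 {
   s = $1
}
# a gap before the current number closes the pending range;
# NR>1 keeps the still-unset p from triggering this on the first record
# (otherwise input such as 3,4,5 would print 3,3-5)
NR>1 && p < $1-1 {
   prnt(RS)
   s = $1
}
{
   p = $1
}
END {
   prnt(ORS)
}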

Now run it as:

$> foo="1,2,3,6,7,8,11,13,14,15,16,17"
$> awk -f series.awk <<< "$foo"
1-3,6-8,11,13-17

$> foo="1,3,6,7,8,11,13,14,15,16,17"
$> awk -f series.awk <<< "$foo"
1,3,6-8,11,13-17

$> foo="1,3,6,7,8,11,13,14,15,16,17,20"
$> awk -f series.awk <<< "$foo"
1,3,6-8,11,13-17,20

Here is a one-liner that does the same:

awk 'function prnt(delim){printf "%s%s", s, (p > s ? "-" p : "") delim}
BEGIN{RS=","} NR==1{s = $1} NR>1 && p < $1-1{prnt(RS); s = $1} {p = $1} END{prnt(ORS)}' <<< "$foo"

In this awk command we keep two variables:

  1. p stores the number from the previous record
  2. s stores the start of the range that still needs to be printed

How it works:

  1. RS="," makes each comma-separated number its own record. When NR==1, we set s to the first number.
  2. When p is less than $1-1 (the current number minus one), there is a break in the sequence and the pending range must be printed. This comparison is only meaningful from the second number onward, since p is still unset while the first number is read.
  3. Printing is done by the function prnt, which takes a single argument: the delimiter to print after the range. When prnt is called from the rule that tests p < $1-1, we pass RS (a comma); when it is called from the END block, we pass ORS (a newline).
  4. In that same rule, after printing, we reset s (the start of the range) to $1.
  5. After processing each record, we store $1 in p.
  6. prnt uses printf for formatted output. It always prints the starting number s first, then checks whether p > s and, if so, prints a hyphen followed by p.
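
For comparison, the steps above can be sketched in plain bash with the same two variables. This is an illustrative translation under stated assumptions, not part of the awk answer: `group_ranges` is a hypothetical name, and the input is assumed to be a sorted, comma-separated list of integers passed as one argument.

```shell
#!/usr/bin/env bash
# group_ranges is a hypothetical name; s and p play the same roles
# as in the awk script (start of pending range, previous number).
group_ranges() {
  local s= p= n out=
  local -a nums
  IFS=, read -r -a nums <<<"$1"

  for n in "${nums[@]}"; do
    if [[ -z $s ]]; then
      s=$n                          # first number starts the first range
    elif (( p < n - 1 )); then
      # break in the sequence: emit the pending range, then start a new one
      out+=$s
      if (( p > s )); then out+="-$p"; fi
      out+=","
      s=$n
    fi
    p=$n                            # remember the previous number
  done

  # emit the last pending range, if there was any input
  if [[ -n $s ]]; then
    out+=$s
    if (( p > s )); then out+="-$p"; fi
  fi
  printf '%s\n' "$out"
}

group_ranges "1,2,3,6,7,8,11,13,14,15,16,17"
```

Testing whether s is empty plays the role of the NR==1 rule, so the gap check never runs against an unset p.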
anubhava
  • If you prefer one liner then use: `awk 'function prnt(delim){printf "%s%s", s, (p > s ? "-" p : "") delim} BEGIN{RS=","} NR==1{s = $1} p < $1-1{prnt(RS); s = $1} {p = $1}END {prnt(ORS)}' <<< "$foo"` – anubhava Sep 19 '17 at 17:14
  • can you please add this oneliner also to the solution section and explain? thanks, this helps. – rodee Sep 19 '17 at 18:21
  • thanks a lot, also see if you can answer the other relevant question: https://stackoverflow.com/questions/46288292/how-to-compute-the-range-in-makefile/46289768#46289768 – rodee Sep 19 '17 at 18:33
  • Ok I have added explanation, now looking at your other question. – anubhava Sep 19 '17 at 18:45