5

I have a series of commands chained together with pipes:

should_create_one_line | expects_one_line

The first command should_create_one_line should produce an output that only has one line, but under strange circumstances it is possible for the output to be multiline or empty.

I would like to add a step in between these two, validate_one_line:

should_create_one_line | validate_one_line | expects_one_line

If its input contains exactly 1 line then validate_one_line will simply output its input. If its input contains more than 1 line or is empty then validate_one_line should cause the whole sequence of steps to stop and return an error code.

What command can I use for validate_one_line?

mushroom
  • 2
    Do you need to prevent `expects_one_line` from being run *at all* if the input is wrong? Because you can't do that with a single pipeline (as Mad Physicist's answer indicates). – Etan Reisner Jan 06 '16 at 15:43

3 Answers

4

Use read. Here's a shell function that meets your specs:

exactly_one_line() {
    local line # Use to echo the line
    read -r line || return # Guarantee at least one line is read
    read && return 1 # Indicate failure if another line is successfully read
    echo "$line"
}

Notes

  1. "One line" means a single line terminated by a newline. If your input can contain data with no trailing newline, this function fails, because read returns non-zero at EOF even though it did read the data.
  2. Given a pipeline like a|b, a cannot prevent b from running. At a minimum, b needs to handle when a produces no output.

Demo:

$ wc -l empty oneline twolines 
       0 empty
       1 oneline
       2 twolines
       3 total
$ exactly_one_line < empty; echo $?
1
$ exactly_one_line < oneline; echo $?
oneline
0
$ exactly_one_line < twolines; echo $?
1
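Note 1 can be relaxed if unterminated input should count as a line. The following is only a sketch under that assumption; the name exactly_one_line_lenient is made up for illustration, and `local` is assumed to be available (bash, dash and most other shells support it):

```shell
# Hypothetical variant of exactly_one_line: also accepts a single line
# that is not terminated by a newline.
exactly_one_line_lenient() {
    local line extra
    # read returns non-zero at EOF, but still stores any data it saw,
    # so a non-empty $line means an unterminated line was read.
    { IFS= read -r line || [ -n "$line" ]; } || return 1
    # Any further data, terminated or not, means more than one line.
    if IFS= read -r extra || [ -n "$extra" ]; then
        return 1
    fi
    printf '%s\n' "$line"
}
```

Like the original, it cannot return until its input reaches EOF, so the output is delayed for as long as the producer keeps the pipe open.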
kojiro
  • best solution i think – 123 Jan 06 '16 at 14:45
  • Just for the record this will "hang" on the second `read` if standard input to `exactly_one_line` doesn't get closed/see an EOF. Also this won't prevent `expects_one_line` from being run if there isn't a line of output. This will just convert the multiple lines case to the no lines case (in terms of what `exactly_one_line` will see). – Etan Reisner Jan 06 '16 at 15:41
  • Any piped version of `validate_one_line` will have the same issue. Validation needs to be done in the script that uses the data. – Mad Physicist Jan 06 '16 at 15:43
  • @EtanReisner I could set a timeout to prevent the hang. Do you suggest any better solution to that? As for the rest of the pipeline, in any pipeline `a|b` there isn't any way to prevent `b` from running if all you control is `a`. @mad-physicist is correct about that, but I didn't want to just assimilate that answer. – kojiro Jan 06 '16 at 15:53
  • No. I don't think you can do better than a timeout for that. You might, in theory, want to use `{ read -r line || [ -n "$line" ]; } || return` in case the output doesn't contain a newline but does contain data (though I don't know if that matches the OPs expectations). – Etan Reisner Jan 06 '16 at 16:04
  • @EtanReisner actually, I realized I don't know what you mean by the second `read` hanging. I just tried `yes | exactly_one_line` and it failed correctly. I guess you don't _just_ mean it needs an EOF, but also an EOF without any further newlines. – kojiro Jan 06 '16 at 16:10
  • I meant that pipelines traditionally process input/output in "real time". With that second `read` that isn't true if the initial process doesn't terminate within a reasonable time. Try `{ echo foo; sleep 30; } | exactly_one_line` and see when you get your output printed (as compared to, for example, `{ echo foo; sleep 30; } | cat`). (That's why I put "hang" in quotes: nothing is stuck, just delayed.) – Etan Reisner Jan 06 '16 at 16:25
  • @EtanReisner yeah, so I guess the requirements question is "do you want a failure or a delay if the EOF doesn't come immediately after the first line?" – kojiro Jan 06 '16 at 17:04
  • Right. Two questions. Does `expects_one_line` need to not run at all and what does the behavior of `should_create_one_line` look like. (e.g. if it can produce two lines of output with a time delay of hours between them then waiting is the only option, etc.) – Etan Reisner Jan 06 '16 at 17:16
1

First off, you should seriously consider adding the validation code to expects_one_line. According to this post, each command in a pipeline starts in its own subshell, so even if validate_one_line fails, expects_one_line still runs and will probably report an error of its own because it gets no usable input. That being said, here is a bash one-liner that you can insert into your pipe to validate:

should_create_one_line.sh | ( var="$(cat)"; [ -n "$var" ] && [ $(printf '%s\n' "$var" | wc -l) -eq 1 ] || exit 1; printf '%s\n' "$var" ) | expects_one_line.sh

The problem here is that when the validation subshell exits with status 1, expects_one_line.sh still runs; it simply sees empty input. If this works for you, then great. If not, it would be better to just put the following at the beginning of expects_one_line.sh:

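input="$(cat)"   # read all of stdin (trailing newlines are stripped)
# Fail on empty input or on anything other than exactly one line
[ -z "$input" ] || [ $(printf '%s\n' "$input" | wc -l) -ne 1 ] && exit 1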

This guarantees that expects_one_line.sh fails properly when it does not get exactly one line, without having to wonder how the script reacts to whatever a separate validation step passes along.

You may find this post helpful: How to read mutliline input from stdin into variable and how to print one out in shell(sh,bash)?

Community
Mad Physicist
  • Like kojiro's answer this will "hang" as long as the input stream doesn't terminate (which may or may not be a problem for the OP) but does make this potentially unusable if `should_create_one_line` runs for a long time after generating its single line of output and `expects_one_line` doesn't want to wait for it to finish. – Etan Reisner Jan 06 '16 at 16:05
  • I think that the idea of "one line" can only be defined in terms of an EOF arriving within a finite amount of time. I do not think that the OP is considering cases where things take a while for that reason. – Mad Physicist Jan 06 '16 at 16:15
  • Entirely possible. I'd just added the info to the other answer so thought I should be diligent and add it here as well. – Etan Reisner Jan 06 '16 at 16:21
0

You can use a small shell script that checks the incoming data and calls the other command only when the input is exactly one line.

The following code starts cat only when it is fed exactly one line:

sh -c 'while read -r CMD; do [ ! -z "$LINE" ] && exit 1; LINE=$CMD; done; [ -z "$LINE" ] && exit 1; printf "%s\n" "$LINE" | "$0" "$@"' cat

How this works

  1. Try reading a line, if failed go to step 5
  2. If variable $LINE is NOT empty, goto step 6
  3. Save line inside variable $LINE
  4. Goto step 1
  5. If $LINE is NOT empty, goto step 7
  6. Exit the program with status code 1
  7. Call our program and pass $LINE to it using printf
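The steps above can also be written as a multi-line function, which may be easier to read than the one-liner. This is only a sketch: the name run_if_one_line and the seen flag are made up for illustration. Unlike the one-liner, it counts lines with the flag instead of testing whether $LINE is empty, so a blank first line is still noticed:

```shell
# Hypothetical helper: run "$@" with the single input line on its stdin.
# Plain POSIX sh, so the variables are not function-local.
run_if_one_line() {
    line=''
    seen=0
    while IFS= read -r current; do
        # A second line arrived: stop with an error (step 6).
        [ "$seen" -eq 1 ] && return 1
        line=$current
        seen=1
    done
    # No complete line was read at all (step 6).
    [ "$seen" -eq 0 ] && return 1
    # Exactly one line: pass it to the command (step 7).
    printf '%s\n' "$line" | "$@"
}
```

It would be used as should_create_one_line | run_if_one_line expects_one_line, so the second command is started only after the input has been checked.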

Example usage:

Printing out only if grep found 1 match:

 grep .... | sh -c 'while read -r CMD; do [ ! -z "$LINE" ] && exit 1; LINE=$CMD; done; [ -z "$LINE" ] && exit 1; printf "%s\n" "$LINE" | "$0" "$@"' cat

Applied to the pipeline from the question:

 should_create_one_line | sh -c 'while read -r CMD; do [ ! -z "$LINE" ] && exit 1; LINE=$CMD; done; [ -z "$LINE" ] && exit 1; printf "%s\n" "$LINE" | "$0" "$@"' expects_one_line
Ferrybig