2

I'm relatively new to Github Workflows, so I'm learning the syntax as I go. I have a section where I want to check if start_date and end_date have been filled, if not, fill them with the default value. Then check if they are in the correct format, if not, exit the run. For some reason, when I try to set the environment variables of start_date and end_date either they aren't being set correctly or retrieved correctly as I keep getting empty strings instead of a date.

Here's my code:

      - name: Set up dates
        id: set_dates
        run: |
          if ! [[ -z "${{ inputs.start_date }}" && -z "${{ inputs.end_date }}" ]]; then
            echo "start_date=$(date -u -d 'yesterday' +%Y-%m-%d)" >> $GITHUB_ENV
            echo "end_date=$(date -u +%Y-%m-%d)" >> $GITHUB_ENV
            echo "replaced dates with defaults"
          else
            echo "start_date=${{ inputs.start_date }}" >> $GITHUB_ENV
            echo "end_date=${{ inputs.end_date }}" >> $GITHUB_ENV
            echo "set dates to inputs"
          fi

      - name: Validate dates
        id: validate_dates
        run: |
          if [[ "${{ env.start_date }}" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
            echo "start_date of ${{ env.start_date }} is valid"
          else
            echo "start_date of ${{ env.start_date }} is invalid"
            exit 1
          fi
          if [[ "${{ env.end_date }}" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
            echo "end_date of ${{ env.end_date }} is valid"
          else
            echo "end_date of ${{ env.end_date }} is invalid"
            exit 1
          fi

I was expecting start_date and end_date to be something like start_date=2023-05-22 end_date=2023-05-23 but they are both "" (as in: start_date of "" is invalid). What am I doing wrong?

Shubadecka
  • 23
  • 2
  • I believe your logic is backwards: `! [[ -z "${{ inputs.start_date }}" && -z "${{ inputs.end_date }}" ]]` is true when either is non empty. My guess is you meant `! [[ -n "${{ inputs.start_date }}" && -n "${{ inputs.end_date }}" ]]` which would be true if either was empty. Right now, when both are empty, you go into the `else` block, which clearly expects them to be set. – joanis May 24 '23 at 00:04
  • I think this is a good remark what @joanis commented. I've edited my answer to extend it with a discussion of the code and left some remarks and an example with suggested changes. – hakre May 24 '23 at 09:40
  • Thank you both, I've got my code running the way it should now! Quick follow up: what is the difference between -z and -n? Does -z evaluate to False if the string is non-empty and -n evaluates to True if the string is non-empty? – Shubadecka May 24 '23 at 16:06
  • @Shubadecka: Yes, both pair up `-z` is always `! -n` (and vice-versa). Here is some specification about those: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html - and which shell are you using? Its documentation also has this and often better to read than specs. – hakre May 24 '23 at 16:15
  • I'm using ubuntu-latest. I'll take a look at their docs. – Shubadecka May 26 '23 at 16:20

1 Answers1

1

Analysis

As the output of

echo "start_date of ${{ env.start_date }} is invalid"

which is

start_date of "" is invalid

and which shows that env.start_date parameter is "" (two times double quotes).

from the excerpt of the workflow you've posted in your question, it is set in the previous step.

as it is not null (zero length string, -z test), this is the set operation:

echo "start_date=${{ inputs.start_date }}" >> $GITHUB_ENV

Therefore, we can only say that it must be that inputs.start_date is also "".

Discussion and Recommendations

Handling input can be a delicate thing: It can be optional with default values, it needs to be verified so that the processing has it in the required form and the output of it can be stable and expected.

Your scripts already do something useful here to help with that: first validate the input, then process it. So you have already divided the overall problem into smaller problems which are more easy and stable to solve (this also refers to a strategy called divide and conquer, even the name of an algorithm in computer science¹).

However, we've distributed them over two steps. This needless division makes the input processing more complicated. The operation belongs together ("Set up dates" and "Validate dates" is always in a chain in that order, a strong sign) but they are divided across two step-s.

This creates a needless distance and distance makes things harder. For example, there is no need to store the environment variables unless they are valid. But separating the test from the input processing by putting the test in a later step already requires storing the environment variables as otherwise, the next step could not test them, but also if they are flawed. Now we have introduced data into the system we actually don't want there to be, so it will lead to extra work/handling (even if it doesn't make dirt, it will catch dust).

Now let's take all this, not change the overall order too much, just re-arrange it a little bit differently to ease it having it in one step. See the following plan:

  1. input (inputs context)
  2. processing (defaults, date-processing, validating, testing the format)
  3. output (printing to $GITHUB_ENV)

Keep it in memory: IPO - Input-Process-Output ².

Now see it in the following example code, with focus on the start_date parameter as the end_date parameter would be the same so left out for brevity at two places. This code is for reading, less for copy and pasting; it may also have other errors, I've never run it.

      - name: Set up dates
        id: set_dates
        run: |
          # [1/3] inputs and defaults 
          # start date:
          start_date="$(date -u -d "${{ inputs.start_date || "yesterday" }}" +%Y-%m-%d)"
          # end_date: ...
          # [2/3] validate format
          if [[ ! "$start_date" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
            printf "start_date: input: \"%s\" is invalid date: %s\n" "${{ inputs.start_date }}" "$start_date"
            exit 1
          fi
          # if [[ ! "$end_date ...
          # [3/3] write environment
          printf "start_date=%s\n" "$start_date" >> $GITHUB_ENV
          printf "end_date=%s\n" "$end_date" >> $GITHUB_ENV

or with just the shell script with the Microsoft Github Action Expressions ${{ ... }} to ease reading:

# [1/3] inputs and defaults 
# start date:
start_date="$(date -u -d "${{ inputs.start_date || "yesterday" }}" +%Y-%m-%d)"
# end_date: ...
# [2/3] validate format
if [[ ! "$start_date" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
    printf "start_date: input: \"%s\" is invalid date: %s\n" "${{ inputs.start_date }}" "$start_date"
    exit 1
fi
# if [[ ! "$end_date ...
# [3/3] write environment
printf "start_date=%s\n" "$start_date" >> $GITHUB_ENV
printf "end_date=%s\n" "$end_date" >> $GITHUB_ENV

One, two, three and done. No more setup of the environment if there is an error with the date inputs.

This is also called "exiting early"; that is the moment to know about a problem, at the place in the code, exit already with a non-zero status code (e.g. the number one in "exit 1" after printing the diagnostic message. We had this already earlier, only in the later, second step. This follows the principle to fail early or fail fast³, a useful principle/mnemonic/design to stabilize the functionality faster as errors are required to be handled earlier by the user. That is, it reduces learning times because you get feedback faster. Work with the errors, not against them (and be proud if you found some). This part is also more important now, as this is one step created from two, so more to do at once (but hopefully as little as necessary).

Some Remarks

It was already much text, but some additional remarks as I naturally write code differently than you, and I should comment on some place to ease reading and explain a bit why I wrote it this way:

  • "${{ inputs.start_date || "yesterday" }}" - this is having the input with a default value (here the word "yesterday" as string). Cf. "Example: Use of the or operator" in section Configuration of the Docker Registry in my answer to Docker invalid reference format (Tag Name) in Github Action for a description of the or (||) operator in Microsoft Github Action Expressions ${{ ... }}.

  • Every input is passed through the date(1) utility, not only for the default values "yesterday" or "today". This can make the intent more clear, but for me, the major benefit is that we treat all inputs equal. Every date needs to pass through date(1) so we have streamlined the processing and with it also the validation (and also more informative messages considering the inputs' error case of two double quotes: date: invalid date ‘""’) and also the formatting. Technically we are using date(1) as a filter, if we need to cover a similar functionality with the shell script alone, we could, but it would require a lot of work to achieve the same (if we are even capable, who knows, right?). And you already had the clever format for your dates, %Y-%m-%d, now we have them always, regardless of the user-input.

  • I'm using printf instead of echo as it allows me to use placeholders (e.g. %s for string parameters) and some escape sequences (e.g. \n for newline) in outputs. The reason is not that echo wouldn't work, this is just my current preference. Both do work here, use the one that works best for you, and any command always according to the docs. Stay curious.

And a Final Takeaway

Last but not least:

Reduce the number of paths the code can take. E.g. combine exiting early as discussed above with the "rule" to "never" use else or elif. When we prevent if/else and only use if, there still is conditional code (if we can get rid of that, golden but not always possible), but it is less cumbersome to treat one against the other, it is just "if" not "if or what if else and then what else?". Which code do you want to read when the build pipeline exits early next time unplanned and the diagnostic message looks unclear? Truly a rhetoric question.

hakre
  • 193,403
  • 52
  • 435
  • 836