Analysis
As the output of
echo "start_date of ${{ env.start_date }} is invalid"
which is
start_date of "" is invalid
and which shows that env.start_date
parameter is ""
(two times double quotes).
from the excerpt of the workflow you've posted in your question, it is set in the previous step.
as it is not null (zero length string, -z
test), this is the set operation:
echo "start_date=${{ inputs.start_date }}" >> $GITHUB_ENV
Therefore, we can only say that it must be that inputs.start_date
is also ""
.
Discussion and Recommendations
Handling input can be a delicate thing: It can be optional with default values, it needs to be verified so that the processing has it in the required form and the output of it can be stable and expected.
Your scripts already do something useful here to help with that: first validate the input, then process it. So you have already divided the overall problem into smaller problems which are more easy and stable to solve (this also refers to a strategy called divide and conquer, even the name of an algorithm in computer science¹).
However, we've distributed them over two steps. This needless division makes the input processing more complicated. The operation belongs together ("Set up dates" and "Validate dates" is always in a chain in that order, a strong sign) but they are divided across two step
-s.
This creates a needless distance and distance makes things harder. For example, there is no need to store the environment variables unless they are valid. But separating the test from the input processing by putting the test in a later step already requires storing the environment variables as otherwise, the next step could not test them, but also if they are flawed. Now we have introduced data into the system we actually don't want there to be, so it will lead to extra work/handling (even if it doesn't make dirt, it will catch dust).
Now let's take all this, not change the overall order too much, just re-arrange it a little bit differently to ease it having it in one step. See the following plan:
- input (
inputs
context)
- processing (defaults,
date
-processing, validating, testing the format)
- output (printing to
$GITHUB_ENV
)
Keep it in memory: IPO - Input-Process-Output ².
Now see it in the following example code, with focus on the start_date
parameter as the end_date
parameter would be the same so left out for brevity at two places. This code is for reading, less for copy and pasting; it may also have other errors, I've never run it.
- name: Set up dates
id: set_dates
run: |
# [1/3] inputs and defaults
# start date:
start_date="$(date -u -d "${{ inputs.start_date || "yesterday" }}" +%Y-%m-%d)"
# end_date: ...
# [2/3] validate format
if [[ ! "$start_date" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
printf "start_date: input: \"%s\" is invalid date: %s\n" "${{ inputs.start_date }}" "$start_date"
exit 1
fi
# if [[ ! "$end_date ...
# [3/3] write environment
printf "start_date=%s\n" "$start_date" >> $GITHUB_ENV
printf "end_date=%s\n" "$end_date" >> $GITHUB_ENV
or with just the shell script with the Microsoft Github Action Expressions ${{ ... }}
to ease reading:
# [1/3] inputs and defaults
# start date:
start_date="$(date -u -d "${{ inputs.start_date || "yesterday" }}" +%Y-%m-%d)"
# end_date: ...
# [2/3] validate format
if [[ ! "$start_date" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
printf "start_date: input: \"%s\" is invalid date: %s\n" "${{ inputs.start_date }}" "$start_date"
exit 1
fi
# if [[ ! "$end_date ...
# [3/3] write environment
printf "start_date=%s\n" "$start_date" >> $GITHUB_ENV
printf "end_date=%s\n" "$end_date" >> $GITHUB_ENV
One, two, three and done. No more setup of the environment if there is an error with the date inputs.
This is also called "exiting early"; that is the moment to know about a problem, at the place in the code, exit already with a non-zero status code (e.g. the number one in "exit 1
" after printing the diagnostic message. We had this already earlier, only in the later, second step. This follows the principle to fail early or fail fast³, a useful principle/mnemonic/design to stabilize the functionality faster as errors are required to be handled earlier by the user. That is, it reduces learning times because you get feedback faster. Work with the errors, not against them (and be proud if you found some). This part is also more important now, as this is one step created from two, so more to do at once (but hopefully as little as necessary).
Some Remarks
It was already much text, but some additional remarks as I naturally write code differently than you, and I should comment on some place to ease reading and explain a bit why I wrote it this way:
"${{ inputs.start_date || "yesterday" }}"
- this is having the input with a default value (here the word "yesterday
" as string). Cf. "Example: Use of the or operator" in section Configuration of the Docker Registry in my answer to Docker invalid reference format (Tag Name) in Github Action for a description of the or (||
) operator in Microsoft Github Action Expressions ${{ ... }}
.
Every input is passed through the date(1)
utility, not only for the default values "yesterday
" or "today
". This can make the intent more clear, but for me, the major benefit is that we treat all inputs equal. Every date needs to pass through date(1)
so we have streamlined the processing and with it also the validation (and also more informative messages considering the inputs' error case of two double quotes: date: invalid date ‘""’
) and also the formatting. Technically we are using date(1)
as a filter, if we need to cover a similar functionality with the shell script alone, we could, but it would require a lot of work to achieve the same (if we are even capable, who knows, right?). And you already had the clever format for your dates, %Y-%m-%d
, now we have them always, regardless of the user-input.
I'm using printf
instead of echo
as it allows me to use placeholders (e.g. %s
for string parameters) and some escape sequences (e.g. \n
for newline) in outputs. The reason is not that echo
wouldn't work, this is just my current preference. Both do work here, use the one that works best for you, and any command always according to the docs. Stay curious.
And a Final Takeaway
Last but not least:
Reduce the number of paths the code can take. E.g. combine exiting early as discussed above with the "rule" to "never" use else
or elif
. When we prevent if/else and only use if, there still is conditional code (if we can get rid of that, golden but not always possible), but it is less cumbersome to treat one against the other, it is just "if" not "if or what if else and then what else?". Which code do you want to read when the build pipeline exits early next time unplanned and the diagnostic message looks unclear? Truly a rhetoric question.