
I have an AWS CDK application in TypeScript and a pretty simple GitLab CI/CD pipeline with two stages that takes care of the deployment:

image: node:latest

stages:
  - dependencies
  - deploy

dependencies:
  stage: dependencies
  only:
    refs:
      - master
    changes:
      - package-lock.json
  script:
    - npm install
    - rm -rf node_modules/sharp
    - SHARP_IGNORE_GLOBAL_LIBVIPS=1 npm install --arch=x64 --platform=linux --libc=glibc sharp
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules
    policy: push

deploy:
  stage: deploy
  only:
    - master
  script:
    - npm run deploy
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules
    policy: pull

npm run deploy is just a wrapper for the cdk command.

But for some reason it sometimes happens that the node_modules cache (probably) expires - the deploy stage simply cannot fetch it and therefore fails:

Restoring cache
Checking cache for ***-protected...
WARNING: file does not exist                       
Failed to extract cache

I checked that the cache key is the same as the one built in the last pipeline run that executed the dependencies stage.

I suppose it happens because this CI/CD often doesn't run for multiple weeks, since I contribute to that repo rarely. I tried to find the root cause but failed miserably. I understand that the cache can expire after some time (30 days by default, from what I found), but I would expect the pipeline to recover from that by running the dependencies stage even though package-lock.json wasn't updated.

So my question is simply: what am I missing? Is my understanding of caching in GitLab CI/CD completely wrong? Do I have to turn on some feature switch?

Basically, my ultimate goal is to skip building node_modules as often as possible, without failing on a non-existent cache even if I don't run the pipeline for multiple months.


1 Answer


A cache is only a performance optimization and is not guaranteed to always be available. Your suspicion that the cache has expired is most likely correct, so you'll need a fallback for the case where the cache cannot be restored.

One thing you could do is change your dependencies job to:

  • Always run (not only when package-lock.json changes)
  • Both push & pull the cache
  • Short-circuit the job if the cache was found.

E.g. something like this:

dependencies:
  stage: dependencies
  only:
    refs:
      - master
  script:
    - |
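      # node_modules was restored from the cache, nothing to install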
      if [[ -d node_modules ]]; then
        exit 0
      fi
    - npm install
    - rm -rf node_modules/sharp
    - SHARP_IGNORE_GLOBAL_LIBVIPS=1 npm install --arch=x64 --platform=linux --libc=glibc sharp
  cache:
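    # no policy specified: the default pull-push applies, so this job both restores and updates the cache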
    key:
      files:
        - package-lock.json
    paths:
      - node_modules

See also this related question.

If you want to avoid spinning up unnecessary jobs, you could also consider merging the dependencies & deploy jobs and taking a similar approach as above in the combined job.
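As a rough sketch (reusing the install steps from your dependencies job, and assuming the runner's shell is bash), such a combined job could look like this:

deploy:
  stage: deploy
  only:
    - master
  script:
    - |
      # rebuild node_modules only if it was not restored from the cache
      if [[ ! -d node_modules ]]; then
        npm install
        rm -rf node_modules/sharp
        SHARP_IGNORE_GLOBAL_LIBVIPS=1 npm install --arch=x64 --platform=linux --libc=glibc sharp
      fi
    - npm run deploy
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules

With no policy specified, the default pull-push applies, so the job restores the cache when it exists and pushes it again at the end.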
