With GitLab CI I am using a simple .yml file in which I have defined several stages that run one after another. I have set up a cache for node_modules, but the problem is that this cache is actually slowing the process down. The cache is required to keep node_modules the same across all stages. (Each stage automatically clears /node_modules for some reason.)

When building locally, this whole process takes less than 2 minutes, but on the CI machine it takes between 20 and 25 minutes. Looking into how GitLab CI works internally, I've learned that it zips the node_modules files (about 36K small files), and that step is extremely slow.

tl;dr: What is the proper way to handle node_modules caching with GitLab CI without uploading node_modules as artifacts? I would like to avoid uploading artifacts that are over 400 MB.

See configuration below:

cache:
  untracked: true
  key: "%CI_COMMIT_REF_NAME%"
  paths:
    - node_modules

stages:
  - install
  - eslint-check
  - eslint
  - prettier
  - test
  - dist

# install dependencies
install:
  stage: install
  script:
    - yarn install
  environment:
    name: development

# run eslint-check
eslint-check:
  stage: eslint-check
  script:
    - yarn eslint-check
  environment:
    name: development

# Other scripts below
Perfection
  • Using [untracked: true](https://docs.gitlab.com/ee/ci/yaml/#cache-untracked) would cache *every* untracked file in the build directory, not only `node_modules`, I believe. If you remove `untracked: true`, it might have fewer files to cache? – Rekovni Jul 16 '18 at 15:01
  • @Rekovni I had overlooked that it applies to every untracked file rather than only the paths I specify. There shouldn't be many untracked files after an install, but it's certainly worth investigating. – Perfection Jul 16 '18 at 15:28
  • Unfortunately it made no difference to performance, as untracked files are rather well controlled in this project. – Perfection Jul 16 '18 at 15:32
  • Does your `node_modules` change every time you run it, or is it mostly the same? Also, do you know what machine you're running on, or is it different every run? – Rekovni Jul 16 '18 at 16:12
  • We try to keep module versions up to date so that we find migration issues early rather than late. We have a GitLab runner on a VM server that runs these tasks on a Windows machine. – Perfection Jul 17 '18 at 15:02
  • Do you have one or more runners building this? If you only have one runner there could be a hacky way of working around using the cache/artifacts... – Rekovni Jul 17 '18 at 16:25
  • We have 2 runners but we're looking at getting more. – Perfection Jul 18 '18 at 07:32
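
For reference, the change suggested in the first comment can be sketched as the cache block from the question with `untracked: true` removed, so that only the listed path is archived. The key is copied unchanged from the question. As noted in the comments, this made no measurable difference for this project.

```yaml
# Sketch of the comment's suggestion: cache only the listed path,
# not every untracked file in the build directory.
cache:
  key: "%CI_COMMIT_REF_NAME%"   # copied unchanged from the question
  paths:
    - node_modules
```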

1 Answer

It appears that a fix is planned: the issue linked below has been under discussion for almost two years, and a milestone has now been set, so it should be resolved eventually.

https://gitlab.com/gitlab-org/gitlab-runner/issues/1797
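
Until that issue is resolved, one possible mitigation (an assumption on my part, not something taken from the linked issue) is to let only the install job upload the cache and have later jobs download it without re-archiving, using `cache:policy`. The sketch below reuses the job names and key from the question. It assumes a GitLab version that supports `cache:policy`, and that later jobs run on a runner that already holds the cache (or that a shared cache is configured), which may not hold with two independent runners.

```yaml
# Sketch only, not a verified fix for the slow archiving step.
cache:
  key: "%CI_COMMIT_REF_NAME%"   # copied unchanged from the question
  paths:
    - node_modules

install:
  stage: install
  script:
    - yarn install
  # Uses the global cache with the default policy (pull-push):
  # restore the cache first, then archive and upload it after the job.

eslint-check:
  stage: eslint-check
  # A job-level cache replaces the global one entirely,
  # so key and paths are repeated here.
  cache:
    key: "%CI_COMMIT_REF_NAME%"
    paths:
      - node_modules
    policy: pull   # download only; skip zipping and uploading after the job
  script:
    - yarn eslint-check
```

With this layout the archive is created once per pipeline instead of once per job. Each later job still has to extract it, so how much time this saves would need to be measured.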

Perfection