My Docker image fails to build in GitLab CI, and it fails silently, without any error to work with. The same image builds locally without problems, so the cause must be something non-obvious in the CI environment. From what I've read, the best approach is to SSH into the CI server and poke around; in particular, I've learned that I can get a log of the last layer before the build fails to see why it might be failing. However, GitLab doesn't support a direct SSH connection to the CI server. It only supports fixed SSH commands executed against the server from the build environment (.gitlab-ci.yml), which doesn't help much because I need SSH access to inspect the image's build layers.

What other options do I have for debugging or analyzing an image during a build in CI? Any feedback is much appreciated.

Dockerfile:

###########
# BUILDER #
###########

# base image
FROM node:11.12.0-alpine as builder

# set working directory
WORKDIR /usr/src/app

RUN apk add --no-cache --virtual .gyp python make g++

# install app dependencies
ENV PATH /usr/src/app/node_modules/.bin:$PATH
COPY package.json /usr/src/app/package.json
COPY package-lock.json /usr/src/app/package-lock.json
RUN npm install --no-optional
RUN npm install react-scripts@2.1.8 -g --silent --no-optional

# set environment variables
ARG REACT_APP_USERS_SERVICE_URL
ENV REACT_APP_USERS_SERVICE_URL $REACT_APP_USERS_SERVICE_URL
ARG NODE_ENV
ENV NODE_ENV $NODE_ENV

# create build
COPY . /usr/src/app
RUN npm run build


#########
# FINAL #
#########

# base image
FROM nginx:1.15.9-alpine

# update nginx conf
RUN rm -rf /etc/nginx/conf.d

COPY conf /etc/nginx
# copy static files
COPY --from=builder /usr/src/app/build /usr/share/nginx/html

# expose port
EXPOSE 80

# run nginx
CMD ["nginx", "-g", "daemon off;"]

.gitlab-ci.yml file:

...
...

after_script:
  - bash ./docker-push.sh
  - docker-compose down

docker-push.sh script that builds the image for pushing into ECR on AWS:

    echo "building the client image ..."
    docker -D build $CLIENT_REPO -t $CLIENT:$COMMIT -f Dockerfile-prod --build-arg REACT_APP_USERS_SERVICE_URL=""  # this line is failing
    if [ $? -ne 0 ]; then
      echo "Failure. Exiting now..."
      exit 1
    fi
    docker -D tag $CLIENT:$COMMIT $REPO/$CLIENT:$TAG
    docker -D push $REPO/$CLIENT:$TAG

    docker build $USERS_REPO -t $USERS:$COMMIT -f Dockerfile-$DOCKER_ENV
    docker tag $USERS:$COMMIT $REPO/$USERS:$TAG
    docker push $REPO/$USERS:$TAG

    docker build $USERS_DB_REPO -t $USERS_DB:$COMMIT -f Dockerfile
    docker tag $USERS_DB:$COMMIT $REPO/$USERS_DB:$TAG
    docker push $REPO/$USERS_DB:$TAG

    docker build $SWAGGER_REPO -t $SWAGGER:$COMMIT -f Dockerfile-$DOCKER_ENV
    docker tag $SWAGGER:$COMMIT $REPO/$SWAGGER:$TAG
    docker push $REPO/$SWAGGER:$TAG
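
One way to make the failing step less silent is to keep the full output and the exit status of each command. The following is a minimal sketch, not part of the original script: `run_logged` and `build-client.log` are hypothetical names, and it assumes the script runs under a POSIX shell.

```shell
# run_logged LOG_FILE COMMAND [ARGS...]
# Runs COMMAND, stores stdout and stderr in LOG_FILE, replays the log to the
# job output, and returns COMMAND's own exit status.
run_logged() {
  log_file=$1
  shift
  "$@" >"$log_file" 2>&1
  rc=$?
  cat "$log_file"
  echo "command '$1' exited with status $rc"
  return "$rc"
}

# Hypothetical usage with the failing line from docker-push.sh:
#   run_logged build-client.log docker -D build $CLIENT_REPO -t $CLIENT:$COMMIT \
#     -f Dockerfile-prod --build-arg REACT_APP_USERS_SERVICE_URL="" || exit 1
```

The output is replayed only after the command finishes, so this trades live streaming for a log file that survives on disk and could be kept for later inspection.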

Job log from GitLab CI (relevant part only):

Login Succeeded
building the client image ...
time="2020-04-14T08:54:23Z" level=debug msg="Skipping excluded path: .dockerignore"
time="2020-04-14T08:54:23Z" level=debug msg="Skipping excluded path: Dockerfile"
time="2020-04-14T08:54:23Z" level=debug msg="Skipping excluded path: Dockerfile-prod"
time="2020-04-14T08:54:23Z" level=debug msg="Skipping excluded path: Dockerfile-stage"
time="2020-04-14T08:54:23Z" level=debug msg="Skipping excluded path: .dockerignore"
time="2020-04-14T08:54:23Z" level=debug msg="Skipping excluded path: Dockerfile-prod"
time="2020-04-14T08:54:23Z" level=debug msg="Skipping excluded path: Dockerfile"
time="2020-04-14T08:54:23Z" level=debug msg="Skipping excluded path: Dockerfile-stage"
Step 1/25 : FROM node:11.12.0-alpine as builder
 ---> 09084e4ff58d
Step 2/25 : WORKDIR /usr/src/app
 ---> Using cache
 ---> 9c6639a8a785
Step 3/25 : RUN apk add --no-cache --virtual .gyp python make g++
 ---> Using cache
 ---> 0d5320ee514b
Step 4/25 : ENV PATH /usr/src/app/node_modules/.bin:$PATH
 ---> Using cache
 ---> c041f8c64b34
Step 5/25 : COPY package.json /usr/src/app/package.json
 ---> 02d18d67a517
Step 6/25 : COPY package-lock.json /usr/src/app/package-lock.json
 ---> 2d94e8e8fb6c
Step 7/25 : RUN npm install --no-optional
 ---> Running in 59660215041e
> cypress@4.1.0 postinstall /usr/src/app/node_modules/cypress
> node index.js --exec install
Installing Cypress (version: 4.1.0)
[08:55:20]  Downloading Cypress     [started]
[08:55:20]  Downloading Cypress      0% 0s [title changed]
[08:55:20]  Downloading Cypress      2% 5s [title changed]
...
...
[08:55:39]  Unzipping Cypress        9% 167s [title changed]
[08:55:39]  Unzipping Cypress        100% 0s [title changed]
[08:55:39]  Unzipped Cypress        [title changed]
[08:55:39]  Unzipped Cypress        [completed]
[08:55:39]  Finishing Installation  [started]
[08:55:40]  Finished Installation   /root/.cache/Cypress/4.1.0 [title changed]
[08:55:40]  Finished Installation   /root/.cache/Cypress/4.1.0 [completed]
You can now open Cypress by running: node_modules/.bin/cypress open
https://on.cypress.io/installing-cypress
added 2034 packages from 768 contributors and audited 38602 packages in 77.201s
found 1073 vulnerabilities (1058 low, 14 moderate, 1 high)
  run `npm audit fix` to fix them, or `npm audit` for details
Saving cache
00:02
Uploading artifacts for successful job
00:02
Job succeeded
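
The idea of inspecting the last layer before the failure can be approximated without SSH, because a log in this format prints the ID of each intermediate image as ` ---> <id>`. The sketch below assumes the build output was saved to a file (the name `build-client.log` is hypothetical) and that the classic builder is in use, as the `Step N/25` lines suggest; `last_layer_id` is an illustrative helper, not a Docker command.

```shell
# last_layer_id LOG_FILE
# Prints the last 12-character image ID found on a " ---> <id>" line,
# skipping lines such as " ---> Using cache" and " ---> Running in <id>".
last_layer_id() {
  grep -E '^ *---> [0-9a-f]{12}$' "$1" | tail -n 1 | awk '{print $2}'
}

# Hypothetical usage in the CI script (needs Docker and a saved build log):
#   id=$(last_layer_id build-client.log)
#   docker run --rm "$id" sh -c 'node --version; npm --version; df -h; ls -la /usr/src/app'
```

For the log above this would select `2d94e8e8fb6c`, the image produced by step 6, which is the state right before `npm install` runs.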
mr_incredible
  • Can you please add your Dockerfile, the part of the .gitlab-ci.yml responsible for building, and the output from the log? A docker build does not normally fail silently, so the error should be in the job log. When something fails in GitLab CI, all I normally do is add extra script statements (like an echo) to test hypotheses or check whether settings are correct. So I cannot help you with the main question, but I might be able to help you with your problem nevertheless. – Dennis van de Hoef - Xiotin Apr 14 '20 at 14:48
  • Yes, I know it is fairly rare for a Docker build to fail without giving the programmer any clue. I added the files. Just for the record, the other images you see in the file build and push to ECR just fine. The client image is the mysterious one. – mr_incredible Apr 14 '20 at 16:10
  • I marked the line where the build fails with a comment, and as you can see, the job log gives no clue as to why. The script builds 4 images, but the client image is the one that fails. – mr_incredible Apr 14 '20 at 16:16
  • I don't know if you've spotted this: when the client image fails to build and push, all the other images are skipped. But when I move the code for building the client image to the bottom, the other 3 images are built and pushed, and the client image still fails without explanation. – mr_incredible Apr 14 '20 at 16:30
  • I've decided to raise an issue on GitLab. This is clearly a CI issue. – mr_incredible Apr 14 '20 at 20:32
  • Can you please also add the link to the issue for future visitors? I noticed that the log only shows build steps up to 7/25 and nothing further. Is this due to your cropping of the log? Nevertheless, I don't know Node that well, so I looked up the --no-optional flag and found this post where someone uses it only to speed things up, which is not needed in Docker. In case you also added it for speed, you can check this: https://stackoverflow.com/questions/43790807/speed-up-npm-install-in-docker-container and consider switching to yarn. – Dennis van de Hoef - Xiotin Apr 15 '20 at 06:34
  • Issue raised here: https://forum.gitlab.com/t/runner-exits-out-as-job-succeeded-before-finishing-a-job/36394 I did crop the output, but only because it was printing the unzipping and building of Cypress, which spans about 60 lines, so I shortened it and marked the omission with dots. The `--no-optional` flag is there to avoid including OSX-specific libraries. Some of those libraries are deprecated anyway and not safe to use. I'm on Ubuntu, so I use the flag to avoid installing unnecessary libraries. – mr_incredible Apr 15 '20 at 07:25
