Puppeteer version: 5.5
Platform / OS version: Docker container deployed on Heroku
Node.js version: latest
I have set up puppeteer scripts that live in a docker container, so that they can be run with Xvfb, a simulated UI so that the scripts can be run in headless:false
mode. The site they crawl causes this to be a requirement.
The docker container builds fine, and the image runs locally. I can run the scripts and the Xfvb simulated display launches and allows the browser to launch as intended. The scripts perform as expected in my local docker image.
Ive deployed this to Heroku and spun up a dyno, and have tried to run the scripts on the dyno. The dyno runs fine, the server launches and listens for requests. When I try to run the node scripts, It seems that the browser isnt able to connect to or launch the simulated display.
Xvfb: https://www.x.org/releases/X11R7.6/doc/man/man1/Xvfb.1.xhtml
Error when trying to run scripts on the heroku dyno
steveszumski@Steves-MacBook-Pro crawler % heroku run node crawler.js
Running node crawler.js on ⬢ suresale-crawler... up, run.3678 (Free)
Running getReport Headless Script
Error: Failed to launch the browser process!
Fontconfig warning: "/etc/fonts/fonts.conf", line 100: unknown element "blank"
[15:15:1211/184429.509801:ERROR:browser_main_loop.cc(1439)] Unable to open X display.
[1211/184429.534086:ERROR:nacl_helper_linux.cc(307)] NaCl helper process running without a sandbox!
Most likely you need to configure your SUID sandbox correctly
TROUBLESHOOTING: https://github.com/puppeteer/puppeteer/blob/main/docs/troubleshooting.md
at onClose (/app/node_modules/puppeteer/lib/cjs/puppeteer/node/BrowserRunner.js:193:20)
at ChildProcess.<anonymous> (/app/node_modules/puppeteer/lib/cjs/puppeteer/node/BrowserRunner.js:184:79)
at ChildProcess.emit (node:events:388:22)
at Process.ChildProcess._handle.onexit (node:internal/child_process:284:12)
Section of crawler.js that errors
try {
const browser = await puppeteer.launch({
headless: false,
args: [
'--no-sandbox'
]
});
const page = await browser.newPage();
Dockerfile
FROM node:latest
# update and add all the steps for running with xvfb
RUN apt-get update &&\
apt-get install -yq gconf-service libasound2 libatk1.0-0 libc6 libcairo2 libcups2 libdbus-1-3 \
libexpat1 libfontconfig1 libgcc1 libgconf-2-4 libgdk-pixbuf2.0-0 libglib2.0-0 libgtk-3-0 libnspr4 \
libpango-1.0-0 libpangocairo-1.0-0 libstdc++6 libx11-6 libx11-xcb1 libxcb1 libxcomposite1 \
libxcursor1 libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 libxss1 libxtst6 \
ca-certificates fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils wget \
xvfb x11vnc x11-xkb-utils xfonts-100dpi xfonts-75dpi xfonts-scalable xfonts-cyrillic x11-apps \
&& apt-get install -y wget gnupg \
&& wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
&& sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
&& apt-get update \
&& apt-get install -y google-chrome-stable fonts-ipafont-gothic fonts-wqy-zenhei fonts-thai-tlwg fonts-kacst fonts-freefont-ttf libxss1 \
--no-install-recommends \
&& rm -rf /var/lib/apt/lists/*
RUN apt-get install bash
RUN apt-get install python
RUN apt-get install curl
RUN apt-get install openssh-client
#remove zombie images
ADD https://github.com/Yelp/dumb-init/releases/download/v1.2.2/dumb-init_1.2.2_x86_64 /usr/local/bin/dumb-init
RUN chmod +x /usr/local/bin/dumb-init
ENTRYPOINT ["dumb-init", "--"]
WORKDIR /app
# using /tmp in order to read/write to container fs as non-root user
# WORKDIR /tmp
# add the required dependencies - this breaks the build process for some reason, manually npm install below works
# COPY node_modules /app/node_modules
RUN npm install puppeteer \
# Add user so we don't need --no-sandbox.
# same layer as npm install to keep re-chowned files from using up several hundred MBs more space
&& groupadd -r pptruser && useradd -r -g pptruser -G audio,video pptruser \
&& mkdir -p /home/pptruser/Downloads \
&& chown -R pptruser:pptruser /home/pptruser \
&& npm install express \
&& npm install basic-ftp
# && chown -R pptruser:pptruser /node_modules
# For Heroku exec
RUN rm /bin/sh && ln -s /bin/bash /bin/sh
ADD ./.profile.d /app/.profile.d
EXPOSE 3000
# Run everything after as non-privileged user.
USER pptruser
# USER root
# Finally copy the build application
COPY . .
# make sure we can run without a UI
ENV DISPLAY :99
CMD Xvfb :99 -screen 0 1024x768x16 -nolisten unix & node server.js