I am running Puppeteer on a server in Kubernetes to generate images of HTML pages stored in the backend, exposed as a REST API. I initialize a browser once and reuse it for every request. The requests to this microservice come from another microservice, which awaits each image generation call and runs at regular intervals. This works fine most of the time, except that the memory used by Chromium keeps growing and eventually the pod is restarted.
Here is the image generation code:
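const { execSync } = require('child_process');
const puppeteer = require('puppeteer');

let browser; // shared instance, assigned in getBrowser()

// checking if chrome is running or not
const isRunning = (query) => {
    const platform = process.platform;
    let cmd = '';
    switch (platform) {
        case 'win32': cmd = `tasklist`; break;
        // note: on darwin the grep process itself also contains the query string
        case 'darwin': cmd = `ps -ax | grep ${query}`; break;
        case 'linux': cmd = `ps -A`; break;
        default: break;
    }
    return execSync(cmd).toString('ascii').toLowerCase().indexOf(query.toLowerCase()) > -1;
};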
// single browser instance for reuse and avoid new spawns
// returns false once the browser is up, true if the launch should be retried
async function getBrowser() {
    try {
        browser = await puppeteer.launch({ headless: true, args: ['--no-sandbox', '--disable-setuid-sandbox', '--single-process', '--no-zygote', '--disable-gpu', '--disable-dev-shm-usage'] });
        console.log('Browser launched successfully');
        return false;
    } catch (error) {
        console.log('retrying launching chrome');
        return true;
    }
}
// wait till a browser is successfully launched to avoid timeout
async function waitTillBrowser() {
    while (await getBrowser());
}
// main code for image generation
.
.
.
if (!isRunning('chrome')) {
    console.log('browser was not obtained. retrying...');
    await waitTillBrowser();
}
const page = await browser.newPage();
await page.setJavaScriptEnabled(false);
await page.setViewport({ width: CONFIG.IMAGE_PARAMS.VIEWPORT.WIDTH, height: CONFIG.IMAGE_PARAMS.VIEWPORT.HEIGHT });
await page.setContent(data.html);
image = await page.screenshot({ type: CONFIG.IMAGE_PARAMS.ENCODING, quality: CONFIG.IMAGE_PARAMS.QUALITY });
await page.close();
.
.
.
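One thing the snippet above leaves open is what happens when `setContent` or `screenshot` throws: `page.close()` is then skipped, and the page stays alive inside the shared browser. I have not confirmed that this is the cause of the growth. A minimal sketch of guarding against it, assuming a hypothetical helper named `withPage` (not part of Puppeteer) that relies only on `browser.newPage()` and `page.close()`:

```javascript
// Hypothetical helper (not a Puppeteer API): runs `work(page)` on a fresh page
// and closes that page whether `work` succeeds or throws.
async function withPage(browser, work) {
  const page = await browser.newPage();
  try {
    return await work(page);
  } finally {
    // Executed on both success and error, so a failed render cannot leak the page.
    await page.close();
  }
}
```

The image generation steps would then go inside the callback, for example `image = await withPage(browser, async (page) => { await page.setContent(data.html); return page.screenshot(); });`.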
I didn't consider pushing the requests to a queue and consuming them from there, because the image generation API isn't exposed to the user, so the number of requests can be controlled. I also didn't consider other libraries (like Playwright), as they basically do the same thing, so I suspect I would run into the same problems there; the same goes for puppeteer-cluster.
I am considering running a script that checks whether the memory consumed by Chrome is above a certain limit and, if so, kills the process. This works for my case, but it doesn't seem like the right way to do it. Are there any other approaches?
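A related variant, sketched under assumptions: instead of watching Chrome's memory from outside, the service could relaunch the browser after a fixed number of requests. Note that this swaps the memory threshold for a request count. `createRecycler`, `launch`, and `maxUses` are hypothetical names, not Puppeteer APIs; the only thing assumed about the browser object is an async `close()` method, which Puppeteer's `Browser` provides. It also assumes requests arrive one at a time, as described above, since closing a browser while another request is still using it would fail.

```javascript
// Hypothetical recycler: hands out a shared browser and replaces it
// once it has served `maxUses` requests.
function createRecycler({ launch, maxUses }) {
  let browser = null;
  let uses = 0;

  return async function acquire() {
    // Budget used up: close the old instance so its memory is released.
    if (browser && uses >= maxUses) {
      await browser.close();
      browser = null;
    }
    // First call, or just recycled: start a fresh instance.
    if (!browser) {
      browser = await launch();
      uses = 0;
    }
    uses += 1;
    return browser;
  };
}
```

With Puppeteer this could be wired up as `const acquire = createRecycler({ launch: () => puppeteer.launch({ headless: true }), maxUses: 100 });`, where 100 is an arbitrary placeholder rather than a measured value.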