I am doing load testing and I don't want the server to crash, which it was doing earlier when I launched separate Puppeteer instances and ran two queries, each fetching ~100 MB of data from a MySQL DB. With a single Puppeteer instance it ran fine, so I researched and came across puppeteer-cluster. Now I basically need to run a cron job every minute, and there is no limit on the number of requests per cron tick, so what is the best practice here?
Option 1:
// Launch the cluster once, outside the cron callback
const cluster = await Cluster.launch({
  concurrency: Cluster.CONCURRENCY_CONTEXT, // one shared browser, one incognito context per worker
  maxConcurrency: 4, // cluster with four workers
  timeout: 3600000 // 1 hour - worst-case scenario
});

nodeCron.schedule("* * * * *", () => {
  for (const item of items) {
    // Queue the item-specific URL; the task callback receives it as `data`
    cluster.queue(item.url, async ({ page, data: url }) => {
      await page.goto(url, { timeout: 3600000 });
    });
  }
});
Option 2:
nodeCron.schedule("* * * * *", async () => {
  // Launch a fresh cluster inside the cron callback
  const cluster = await Cluster.launch({
    concurrency: Cluster.CONCURRENCY_CONTEXT, // one shared browser, one incognito context per worker
    maxConcurrency: 4, // cluster with four workers
  });
  for (const item of items) {
    // Queue the item-specific URL; the task callback receives it as `data`
    cluster.queue(item.url, async ({ page, data: url }) => {
      await page.goto(url, { timeout: 3600000 });
    });
  }
  await cluster.idle();
  await cluster.close();
});
I am presently following Option 2. What I am having trouble understanding is: if I switch to Option 1, will I have to keep the browser open indefinitely, i.e. when would I ever close it?
- Note: there can be any number of items every minute with no upper limit, and each item can take up to 1 hour, a worst-case limit I put there since a request can load huge amounts of data (~250 MB). It should be fetched in chunks, but I can't change the present architecture, so this sometimes gives no response for larger DB tables.
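Since there is no upper limit on items per minute and a batch can run up to an hour, one related thing I am considering is guarding against overlapping cron ticks (a new tick firing while the previous batch is still in flight). A minimal sketch of that guard, independent of puppeteer-cluster — here `processBatch` is just a hypothetical stand-in for the real queue-and-wait work:

```javascript
// Guard so a new cron tick does not start while the previous batch is running.
// `processBatch` is a placeholder for the actual cluster.queue(...) work.
let running = false;

async function tick(processBatch) {
  if (running) return "skipped"; // previous batch still in flight
  running = true;
  try {
    await processBatch();
    return "done";
  } finally {
    running = false;
  }
}
```

The flag is set synchronously before the first `await`, so a tick that fires mid-batch sees `running === true` and bails out immediately instead of piling more work onto the cluster.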