I need to get a list of items from an API endpoint (/products), but the results are paginated (max of 200 items per page).

I need a loop that gets 200 products, pushes them to an array, and increases the page number so it can request the next 200 products. It should stop when there's a 404 error (page does not exist), meaning that I have all the products.

I'm using Axios for the requests, which is Promise-based, but I can't get it to work. I've tried several things, even creating my own Promises, but the results are the same:

  • Can't make it wait for all the pages to be requested
  • It will always request the same page because the page increment is inside .then of the promise (to be sure that I won't go beyond the last page)

I know that the idea of Promises is to be async, but I'm trying to find a way to make it work.

Does anyone have an idea of what logic I could use to do this? I just want to get all the items before moving on. Maybe I'm overcomplicating things; some clarification would help a lot.

EDIT:

I tried making it recursive, but the result is always returned before the requests have finished:

module.exports = {
  sync(req, res) {
    // Get all products page by page
    var products = module.exports.getProductsByPage()
    res.status(200).send(products)
  },

  getProductsByPage(page = 1, products = []) {
    nuvemshop.get(`products?page=${page}&per_page=200`)
    .then(res => {
        console.log('GET Products PAGE ' + page)
        products.push(res.data)
        arguments.callee(++page, products)
    })
    .catch(e => {
        if(e.response.status === 404) {
            console.log('LAST PAGE REACHED')
            return products
        } else 
            return e
    })
  },
}
Deeh
  • you could execute the same function from inside `.then` block and give it the next url, I don't have an api with that much data to test it and give an example code but I hope you understood my point – Mohd_PH Feb 05 '18 at 20:13
  • I tried making it recursive but the result always returns before all executions – Deeh Feb 05 '18 at 20:15
  • Added a code sample – Deeh Feb 05 '18 at 20:23
  • I am going to throw out some ideas; this may or may not be the issue, I am not sure. Maybe add a property named `products` to your exported object, then in the `.then` block call it like `this.products.push(res.data)` – Mohd_PH Feb 05 '18 at 20:30
  • Ok now I saw this error: 'caller', 'callee', and 'arguments' properties may not be accessed on strict mode functions or the arguments objects for calls to them. What should I use then? – Deeh Feb 05 '18 at 20:32
  • I changed it to the function name itself, it starts getting all the products correctly but still it returns before. – Deeh Feb 05 '18 at 20:35
  • I guess you can call it like `this.getProductsByPage(++page, products)` – Mohd_PH Feb 05 '18 at 20:35
  • Did that now, as I said before: module.exports.getProductsByPage(++page, products). I have to use module.exports because I'm exporting it. But as I said, it is returning right away empty. – Deeh Feb 05 '18 at 20:38

1 Answer

Does the following work or does that give errors/unexpected results?

const getProductsByPage = (page = 1, products = []) => 
  //you need to return the promise here, arrow without {} block
  //  returns the single statement (x=>y = function(x){return y;})
  nuvemshop.get(`products?page=${page}&per_page=200`)
  .then(res => {
    console.log('GET Products PAGE ' + page);
    //you need to return the promise here
    //  call recursively
    return getProductsByPage(
      page+1,
      products.concat(res.data)
    );
  })
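  .catch(e => {
    //a 404 means we went past the last page, so we have everything
    //  (e.response is missing on network errors, so check it first)
    if (e.response && e.response.status === 404) {
      console.log('LAST PAGE REACHED');
      return products;
    }
    //rethrow anything else so the caller's .catch sees it;
    //  returning e would resolve the promise with the error
    throw e;
  });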

const sync = (req, res) => {
  // Get all products page by page
  getProductsByPage()
  .then(
    products=>
      res.status(200).send(products)
  ).catch(
    err => 
      res.status(500).send(err)
  );
};


module.exports = {
  sync
}
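
Since the question asks for a loop, the same one-page-at-a-time logic can also be sketched with async/await, which lets the page counter live in an ordinary `for` loop. This is only a sketch under assumptions: it needs a Node version with async/await support, `getAllProducts` is a name made up for this example, and `fakeClient` is a hypothetical stand-in for the `nuvemshop` Axios instance so the snippet can run on its own. It assumes, as the question describes, that the API answers 404 once the page number goes past the last page.

```javascript
// Hypothetical stand-in for the nuvemshop Axios instance: it serves 3 pages,
// then rejects with an Axios-style error that carries response.status 404.
const fakeClient = {
  get(url) {
    const page = Number(
      new URL(url, 'http://example.invalid').searchParams.get('page')
    );
    const pages = { 1: [{ id: 1 }, { id: 2 }], 2: [{ id: 3 }], 3: [{ id: 4 }] };
    if (pages[page]) return Promise.resolve({ data: pages[page] });
    const err = new Error('Request failed with status code 404');
    err.response = { status: 404 };
    return Promise.reject(err);
  },
};

// Requests page 1, 2, 3, ... and resolves with all products once a 404 arrives.
const getAllProducts = async (client, perPage = 200) => {
  const products = [];
  for (let page = 1; ; page++) {
    try {
      const res = await client.get(`products?page=${page}&per_page=${perPage}`);
      products.push(...res.data); // append this page's items
    } catch (e) {
      // 404: we went past the last page, so everything has been collected
      if (e.response && e.response.status === 404) return products;
      throw e; // any other failure is passed on to the caller
    }
  }
};

getAllProducts(fakeClient).then(all => console.log(all.length));
```

In the real handler, `sync` would call `getAllProducts(nuvemshop)` and send the result from its `.then`, exactly as it does with `getProductsByPage()` above.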

The following version fetches 10 pages at a time instead of one by one. A request that fails produces a Fail result. Fail results for 404 responses are removed from the final array, but failures for any other reason are kept in it:

const Fail = function(reason){this.reason = reason;};
const isFail = x=>(x&&x.constructor)===Fail;
const isNotFail = x=>!isFail(x);
const getProductsByPage = (pagesInSet=10) => {
  //set up max 3 requests per second
  //const max3 = throttlePeriod(3,1000);
  const recur = (startAt,products) =>
    Promise.all(
      Array.from(new Array(pagesInSet),(_,index)=>index+startAt)
      .map(
        page=>
          //for throttled
          //max3(nuvemshop.get.bind(nuvemshop))(`products?page=${page}&per_page=200`)
          nuvemshop.get(`products?page=${page}&per_page=200`)
          .then(
            //keep only the response body, as in the one-by-one version
            res=>res.data
          )
          .catch(
            //Fail takes a single reason, so pass error and page together
            e=>new Fail([e,page])
          )
      )
    ).then(
      resultSet=>{
        //a Fail is a 404 when its error has a response with status 404
        const is404 = fail=>{
          const [e] = fail.reason;
          return Boolean(e.response) && e.response.status === 404;
        };
        //check if any results came back with 404
        const needStop = resultSet.some(
          result=>isFail(result) && is404(result)
        );
        //drop the 404 Fails, keep pages and any other Fail;
        //  spreading lets concat flatten each page array into products
        const collected = products.concat(
          ...resultSet.filter(
            result=>!(isFail(result) && is404(result))
          )
        );
        if(needStop){
          return collected;
        }
        //get the next pagesInSet pages
        return recur(startAt+pagesInSet,collected);
      }
    );
  return recur(1,[]);
}
const sync = (req, res) => {
  // Get all products in sets of 10 parallel requests
  getProductsByPage(10)
  .then(
    products=> {
      //you may want to do something with failed request (non 404)
      //const failed = products.filter(isFail)
      res.status(200).send(products);
    }
  ).catch(
    err => 
      res.status(500).send(err)
  );
};

module.exports = {
  sync
}
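
The commented-out `products.filter(isFail)` line above hints at handling the requests that failed for reasons other than 404. One possible way to do that, shown here only as a sketch, is to split the mixed array before sending it. `partitionResults` and the sample data are made up for this example; `Fail` and `isFail` are repeated so the snippet runs on its own.

```javascript
const Fail = function (reason) { this.reason = reason; };
const isFail = x => (x && x.constructor) === Fail;

// Split a mixed result array into successful items and Fail entries.
const partitionResults = results => ({
  products: results.filter(x => !isFail(x)),
  failed: results.filter(isFail),
});

// Hypothetical sample: two products and one request that failed.
const sample = [{ id: 1 }, new Fail(new Error('timeout')), { id: 2 }];
const { products, failed } = partitionResults(sample);
console.log(products.length, failed.length);
```

The handler could then send `products` and log or retry whatever ended up in `failed`.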
HMR
  • Apparently it is working, I made a quick test now. I just couldn't see the results because my Postman crashed, there are so many items in the array haha. I'll test it again tomorrow. Some questions, why should I return it as a single statement? Is it a shortcut for return new Promise? And can't I encapsulate everything inside module.exports? – Deeh Feb 06 '18 at 01:22
  • Oh and I'm so dumb, I was trying to make it recursive and I totally forgot the return statement, I was just calling the function itself without return. – Deeh Feb 06 '18 at 01:24
  • @Deeh I like to call `getProductsByPage` instead of `arguments.callee`, and arrow functions do not have `arguments`. The single statement matters because the function needs to return a promise: `x=>y` returns y because it's the same as `x=>{return y;}` or `function(x){return y;}`. If you are making a lot of requests then you may want to think about [parallel throttling and catching failed requests](https://stackoverflow.com/a/48604612/1641941) – HMR Feb 06 '18 at 01:29
  • @Deeh Added parallel example – HMR Feb 06 '18 at 02:06
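
The point about `x=>y` returning `y` can be checked with a small standalone sketch. The names `withReturn` and `withoutReturn` are made up for this example. The second function mirrors the mistake in the question: the promise chain runs, but nothing is returned to the caller.

```javascript
// Implicit return: the promise chain is the arrow's body, so it is returned.
const withReturn = () => Promise.resolve(1).then(x => x + 1);

// Block body without `return`: the chain still runs, but the caller
// receives undefined instead of the promise.
const withoutReturn = () => {
  Promise.resolve(1).then(x => x + 1);
};

console.log(withReturn() instanceof Promise); // true
console.log(withoutReturn());                 // undefined
withReturn().then(value => console.log(value)); // 2
```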