So I'm currently using socket.io for a loading screen and it just isn't working, so I'm going to try a different solution. I need the following to happen: the user opens my app, which triggers a function on the server side (a CPU-intensive scraping function that takes about 8 seconds). As soon as the function is done, I need to notify the client so it can fetch the results (once the results are fetched, the loading screen closes and the user enters the app).

My current implementation uses socket.io emissions and processes the scraping function on a background thread via a Redis queue.

Is there a different approach to doing this instead of a socket connection and sending emissions from client -> server and vice versa?

I don't think I can just do a classic GET request, because isn't it bad practice to leave a long-running (8-second) request open for so long while waiting for a response? Also, I looked into Server-Sent Events, but I'm not sure I would be able to use them in this case.

Any insight is much appreciated!

nickcoding2
    FWIW, socket.io works by "leaving a long-running request open." When your client app connects with socket.io it uses an ordinary HTTPS connection that's then "upgraded" to a websocket connection and left open. So, if socket.io works for you it's fine. – O. Jones Nov 17 '21 at 13:24
  • @O.Jones Yeah that's the problem...socket.io causes me no small amount of grief in the form of transport close and transport error issues so I'd rather just use a different implementation if possible. – nickcoding2 Nov 17 '21 at 13:43
  • The tags on your question are confusing. What do [tag:redis] and [tag:swift] have to do with this problem? I guess you do this scraping from within your nodejs server app. What library do you use for that? Please [edit] your question or ask another. – O. Jones Nov 17 '21 at 13:48
  • I use SerpAPI for the scraping, and I also use some geocoding API function calls. Redis is currently used on the Node.js backend for queuing up the scraping requests, and Swift is what is used on the front end (client side). – nickcoding2 Nov 17 '21 at 15:31

2 Answers


8 seconds is longer than normal for a response from a "healthy" server, but it should still be fine. I'd just use a GET from the mobile client, and write it so that it doesn't block the UI while waiting.

Write your UI so that it informs the user that it is waiting for a response, and maybe even give the user an idea of how long the response is likely to take.

Duncan C
  • Any way to do it better? Also shouldn't I be offloading the scraping process to worker thread anyway for better scalability in the long term? – nickcoding2 Nov 17 '21 at 13:13
  • Are you asking about the setup of your server process? I'm an iOS engineer, and was addressing the mobile side of the question. I'm not qualified to answer your question about the Node.js implementation. – Duncan C Nov 17 '21 at 14:57
  • Oh okay, so from the iOS side, if I just do a classic HTTP GET request, and then in the callback of URLSession.shared.dataTask(), I use the server response inside a DispatchQueue, that is non-blocking, correct? – nickcoding2 Nov 17 '21 at 15:29
  • Mostly correct. The network transaction is ALWAYS async and non-blocking. By default, the delegate methods/closures are called on a background queue, so you can do time-consuming work without blocking the main thread. Thus you don't have to do anything special to avoid blocking the main thread while you process the response. (No need to create a DispatchQueue) – Duncan C Nov 17 '21 at 18:16

A few points to keep in mind.

  1. nodejs can usually do a lot of concurrent web-scraping: most web-scraping elapsed time is spent waiting for the scraped servers to respond. nodejs is asynchronous. So worker threads may not help much. They definitely will make your server app more complex. A good choice for scaling this out might be clustering your nodejs app.
  2. There's no harm in writing your server to stall the response to a GET request for a few seconds, except harm to the user experience. If your user has something reasonable to look at for those few seconds, you can probably get away with this (see the sketch after this list).
  3. If your user is looking at a web page during that stall, you can use xhr or fetch from Javascript code in the page to retrieve that data while you entertain them with a spinner or some such user interface stuff.
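
To make points 1 and 2 concrete, here is a minimal sketch. It assumes an Express server and a `scrapeResults()` helper standing in for the existing SerpAPI + geocoding pipeline; both are hypothetical names, not the actual implementation.

```typescript
// Minimal sketch: cluster the Node.js app (point 1) and simply hold the GET
// open until the scrape finishes (point 2). Express and scrapeResults() are
// assumptions, not the actual implementation.
import cluster from "node:cluster";
import { cpus } from "node:os";
import express from "express";

// Hypothetical stand-in for the existing SerpAPI + geocoding pipeline (~8 s).
async function scrapeResults(query: string): Promise<unknown> {
  return { query }; // placeholder
}

if (cluster.isPrimary) {
  // One worker per CPU, so several slow requests can be in flight at once.
  cpus().forEach(() => cluster.fork());
  cluster.on("exit", () => cluster.fork()); // replace a crashed worker
} else {
  const app = express();

  app.get("/results", async (req, res) => {
    try {
      // The response simply waits until the scrape is done; the client's
      // loading screen closes when this JSON arrives.
      const data = await scrapeResults(String(req.query.q ?? ""));
      res.json(data);
    } catch {
      res.status(502).json({ error: "scrape failed" });
    }
  });

  app.listen(3000);
}
```

From a web page (point 3), the client side is then just a `fetch('/results?q=...')` call behind a spinner; the Swift app's plain GET waits on the same route.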
O. Jones
  • Okay, so I'll do some testing with just writing it as a GET request, and I'll look into possibly using clustering. The user just looks at a loading screen while the request is happening. – nickcoding2 Nov 17 '21 at 15:33
  • Also, my scraping function is an async function with lots of awaits because I need the code to execute in a specific order because some API calls are dependent on the ones that come before. Does this mean it will block? – nickcoding2 Nov 17 '21 at 15:52
  • `async / await` is definitely friendly to nodejs asynchronous operation. The nasty kind of blocking crops up when you have CPU-intensive computations, not when nodejs is awaiting the completion of some network request. The only thing keeping you from running many API lookups concurrently is the concurrency limit of your API subscription plan (1000 requests per hour for the dev plan on [SerpAPI](https://serpapi.com/#features)). – O. Jones Nov 17 '21 at 16:16
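
To illustrate that last comment, here is a minimal sketch (the function names are hypothetical stand-ins for the SerpAPI and geocoding calls): dependent steps stay sequential, independent lookups can run concurrently, and neither blocks the event loop while a network request is in flight.

```typescript
// Hypothetical stand-ins for the real SerpAPI and geocoding calls.
async function fetchSerpResults(query: string): Promise<{ address: string }[]> {
  return [{ address: `result for ${query}` }]; // placeholder result
}

async function geocode(address: string): Promise<{ lat: number; lng: number }> {
  return { lat: 0, lng: 0 }; // placeholder result
}

async function scrapeForUser(query: string) {
  // Dependent step: geocoding needs the scrape output, so it is awaited first.
  // Each await yields to the event loop, so other requests keep being served.
  const results = await fetchSerpResults(query);

  // Independent lookups can run concurrently, bounded only by the API plan's limits.
  const coords = await Promise.all(results.map((r) => geocode(r.address)));

  return results.map((r, i) => ({ ...r, ...coords[i] }));
}
```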