
I'm working on a project where we sync data from a 3rd party's API, as we used their software as a base when initially building our platform. The call we make essentially returns all the tables for a login in their software, so we can build our own local caching system to provide faster responses and add custom manipulation on top.

This was fine for a long time, as we could download and boot up all the client caches easily. Now, however, each build either times out, fails, or takes 20-30 minutes per client (and we have a few clients to get through). Since we also spin up local instances of this cache manager for testing and development, this is becoming unmanageable.

What tells us the issue is tied to our office IP: if we build the software, containerize it in Docker and push it to AWS, it runs fine. If we build a standalone RestSharp version of the download and push it to Repl.it, it also downloads in about 8 seconds per client. If we tether our local machines to a hotspot, it takes about 1-2 minutes per client instead of 30.
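
As a rough illustration of the kind of standalone test described above, here is a minimal RestSharp timing harness (the base URL, endpoint path, query parameter, and auth header are placeholders, not the 3rd party's real API, and exact RestSharp syntax may differ slightly between versions):

```csharp
// Minimal download-timing test: run the same code from the office network,
// a hotspot, Repl.it, etc. and compare the elapsed times.
// The URL, endpoint and credentials below are placeholders.
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using RestSharp;

class DownloadTimingTest
{
    static async Task Main()
    {
        var client = new RestClient("https://thirdparty.example.com");

        var request = new RestRequest("api/sync/full");             // defaults to GET
        request.AddHeader("Authorization", "Bearer <token>");       // placeholder credential
        request.AddQueryParameter("lastBuildDate", "2021-01-01");   // placeholder parameter

        var sw = Stopwatch.StartNew();
        var response = await client.ExecuteAsync(request);
        sw.Stop();

        Console.WriteLine($"Status: {response.StatusCode}, " +
                          $"bytes: {response.RawBytes?.Length ?? 0}, " +
                          $"elapsed: {sw.Elapsed.TotalSeconds:F1}s");
    }
}
```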

My belief is that the Azure/AWS instance the 3rd party hosts their software on has decided we are requesting too much too often and is throttling us to protect its resources.

The owner of the 3rd party maintains that they aren't throttling us deliberately and won't investigate, as it's not worth their time.

Here are the ideas we've thought through:

  • Hotspot (not viable all the time; too much data)
  • VPN (costly, and if you get a bad server, too slow)
  • Remove the static IP from the router (could break remote access and other tools in the future)
  • Build a function on an AWS / Azure / GCP instance that handles the download and passes it on (this requires dev time, since the sync function we use is provided by the 3rd party); a rough sketch of this idea follows below.
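
To make the last idea concrete, here is a rough sketch of what the cloud relay could look like: a small ASP.NET Core (.NET 6+) service hosted on AWS/Azure/GCP that performs the download outside our office network and streams the payload back to us. The route, query parameter, and upstream URL are placeholders, not the 3rd party's actual API.

```csharp
// Hypothetical cloud-hosted relay: the office cache manager calls this service
// instead of the 3rd party directly, and the service forwards the download.
using System.Net.Http;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHttpClient();
var app = builder.Build();

app.MapGet("/relay/{clientId}", async (string clientId, string? lastBuildDate,
                                       IHttpClientFactory httpClientFactory) =>
{
    var http = httpClientFactory.CreateClient();

    // Placeholder upstream call; the real endpoint and auth are whatever the
    // 3rd party's sync function uses internally.
    var upstreamUrl =
        $"https://thirdparty.example.com/api/sync/{clientId}?lastBuildDate={lastBuildDate}";

    var upstream = await http.GetAsync(upstreamUrl,
                                       HttpCompletionOption.ResponseHeadersRead);
    var payload = await upstream.Content.ReadAsStreamAsync();

    // Stream the 3rd party's response straight back to the office caller.
    return Results.Stream(payload, "application/json");
});

app.Run();
```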

We're moving 100% away from them in the future, but in the meantime I need a reliable (and perhaps creative) way to get these instances up and running quickly. Any ideas would be much appreciated.

Thanks

  • Have you experimented with limiting the concurrency (degree of parallelism) or the rate (requests per time unit) of interacting with the 3rd party API? – Theodor Zoulias Dec 13 '21 at 22:36
  • Unfortunately the sync function they provide is pretty locked down, as it handles both inserting entries into the tables and the download (we simply pass a class object to the function). We can make the download request manually, but no matter how I tweak the RestSharp request it still downloads slowly, or not at all, from our network. – Jesse Hayward Dec 13 '21 at 22:51
  • I should add that the cache manager downloads the result as a stream in a single request; it doesn't chunk the download into blocks, nor does it support them. It simply asks for a lastbuilddate and returns the response in a single stream per client. We're limiting ourselves to building one client at a time, so we only send one request, but that still doesn't help. – Jesse Hayward Dec 13 '21 at 22:54
  • I am talking about simple throttling, by using a [`SemaphoreSlim`](https://learn.microsoft.com/en-us/dotnet/api/system.threading.semaphoreslim) for [limiting the concurrency](https://stackoverflow.com/questions/10806951/how-to-limit-the-amount-of-concurrent-async-i-o-operations) for example. How does the 3rd party API respond to that? – Theodor Zoulias Dec 14 '21 at 00:05
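
For reference, a minimal sketch of the throttling suggested in the last comment, with `buildClientCacheAsync` standing in as a placeholder for whatever the 3rd party's sync call actually is:

```csharp
// Minimal concurrency-throttling sketch: a SemaphoreSlim caps how many client
// syncs hit the 3rd party API at once. buildClientCacheAsync is a placeholder
// for the actual (3rd-party-provided) sync call.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class ThrottledSync
{
    // Allow at most 2 concurrent downloads; tune this number experimentally.
    private static readonly SemaphoreSlim Gate = new SemaphoreSlim(2, 2);

    public static async Task SyncAllAsync(IEnumerable<string> clientIds,
                                          Func<string, Task> buildClientCacheAsync)
    {
        var tasks = clientIds.Select(async id =>
        {
            await Gate.WaitAsync();
            try
            {
                await buildClientCacheAsync(id);   // the slow per-client download
            }
            finally
            {
                Gate.Release();
            }
        });

        await Task.WhenAll(tasks);
    }
}
```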

0 Answers