I am trying to read all AAD-related questions and answers from the Stack Exchange API by passing a set of tags: /2.2/search/advanced?pagesize=100&fromdate=2019-07-01&todate=2020-10-19&site=stackoverflow&filter=!BLIw93LDFyFBUjlepdSTkMo7r6Pkpx&q=listOfTags
We need the data from July 1st, 2019 onward.
Our ADF pipeline keeps getting throttled, even when we set the wait time to 1 minute, and the ETL is so slow that it seems to run forever.
Current Approach (very slow)
I am using ADF to pull all the questions that match the tags (iterating page by page with an Until activity) and load the data into SQL.
I then pass each question ID to this API https://api.stackexchange.com/docs/answers-on-questions#order=desc&sort=activity&ids=29433422&filter=!0U7YRMKgNJq(Exonzn(PdiZE5&site=stackoverflow&run=true to get all the answers for that question, and load the result into SQL.
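For reference, here is a rough Python sketch of what the two steps above do, written outside ADF so the logic is easier to follow. It is not our actual pipeline: the function names (build_url, fetch_all_pages, load_questions_and_answers) are made up for illustration, the answers path /questions/{id}/answers is my assumption for the method behind the docs page linked above, and I am assuming our custom filters still return the has_more and backoff wrapper fields.

```python
import gzip
import json
import time
import urllib.parse
import urllib.request

API_ROOT = "https://api.stackexchange.com/2.2"


def build_url(path, **params):
    """Build a full API URL with a properly encoded query string."""
    return f"{API_ROOT}{path}?{urllib.parse.urlencode(params)}"


def http_get_json(url):
    """Fetch one URL and decode the (normally gzip-compressed) JSON body."""
    req = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
        if resp.headers.get("Content-Encoding") == "gzip":
            body = gzip.decompress(body)
    return json.loads(body)


def fetch_all_pages(path, params, get_json=http_get_json, sleep=time.sleep):
    """Yield every item of one method, page by page (like the Until activity)."""
    page = 1
    while True:
        data = get_json(build_url(path, page=page, **params))
        yield from data.get("items", [])
        # If the API asks us to back off, wait before calling this method again.
        if data.get("backoff"):
            sleep(data["backoff"])
        if not data.get("has_more"):
            break
        page += 1


def load_questions_and_answers(tags_query, fromdate, todate, **kw):
    """Step 1: page through matching questions. Step 2: one answers call per question."""
    questions = list(fetch_all_pages("/search/advanced", {
        "pagesize": 100, "fromdate": fromdate, "todate": todate,
        "site": "stackoverflow", "q": tags_query,
        "filter": "!BLIw93LDFyFBUjlepdSTkMo7r6Pkpx",
    }, **kw))
    answers = []
    for question in questions:
        # Assumed path for the "answers on questions" method.
        path = f"/questions/{question['question_id']}/answers"
        answers.extend(fetch_all_pages(path, {
            "order": "desc", "sort": "activity", "site": "stackoverflow",
            "filter": "!0U7YRMKgNJq(Exonzn(PdiZE5",
        }, **kw))
    return questions, answers
```

The second step issues at least one request per question, which is where most of our calls (and, I suspect, most of the throttling) come from.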
Questions:
Is there a direct back end (Kusto, SQL, Cosmos DB, etc.) from which we can get the data, rather than calling the API for the questions and answers? If so, how do we get access to it?
What is an efficient approach to pull the historical data from Stack Overflow without being throttled?