I am trying to build a database with all the emails.

But I get the error:

ErrorServerBusy: The server cannot service this request right now. Try again later.

Is there any way to work with the throttling policy of EWS? Fetching one month of emails works, but when I exceed some unknown limit the run gets interrupted. Are there other ways to avoid being throttled? I thought about adding time.sleep(), but how could I find out how long I need to wait, and after how many emails, to make it work?

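from exchangelib import Account, Configuration, Credentials, FaultTolerance
from exchangelib.folders import Messages
from tqdm import tqdm

shared_postboxes = []  # some accounts here
credentials = Credentials(username="my username", password="my password")
config = Configuration(retry_policy=FaultTolerance(max_wait=600), credentials=credentials)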

for shared_postbox in tqdm(shared_postboxes):

    account = Account(shared_postbox, credentials=credentials, autodiscover=True)
    top_folder = account.root
    email_folders = [f for f in top_folder.walk() if isinstance(f, Messages)]

    for folder in tqdm(email_folders):
    
        # Fetch only the fields that are needed, newest messages first
        query = (
            folder.all()
            .only('text_body', 'datetime_received', 'sender')
            .filter(datetime_received__range=(start_of_month, end_of_month), sender__exists=True)
            .order_by('-datetime_received')
        )
        for m in query:

            try:
                senderdomain = ExtractingDomain(m.sender.email_address)

            except Exception:
                print("could not extract domain")
        
            else:
                if senderdomain in domains_of_interest: 

                    postboxname = account.identity.primary_smtp_address
                    body = m.text_body
                    emails.append(body)
                    senders.append(senderdomain)
                    postbox.append(postboxname)
                    received.append(m.datetime_received)
    account.protocol.close()
Lehas123

1 Answer

You created a Configuration object that defines a retry policy, which is what you want to solve your issue. But you never passed the configuration to your Account object. To do that, create your account as:

account = Account(shared_postbox, config=config, autodiscover=True)
Erik Cederstrand
  • Oh damn, thanks for that hint! Are there any other ways I could improve the speed? Right now I'm running this code on one mailbox for the mails from the last three months, and it has been taking longer than 30 minutes (still not finished), mostly because the server has already requested 5 times to back off for 296 seconds. Is it necessary to wait the 296 seconds, or could I reduce the wait time, and if so, how? – Lehas123 Nov 17 '22 at 18:58
  • After three hours of running, the code was interrupted with the following error: ErrorMailboxStoreUnavailable: The mailbox database is temporarily unavailable., The process failed to get the correct properties. How can I handle such errors in the future so that my code doesn't break but just waits or continues with the next mail? – Lehas123 Nov 17 '22 at 21:22
  • Some things you could try to improve performance: 1) skip autodiscover if all mailboxes are located on the same server, 2) use a `FolderCollection` to query multiple folders in a single query (see https://github.com/ecederstrand/exchangelib/issues/848#issuecomment-762196147) – Erik Cederstrand Nov 17 '22 at 22:50
  • It's the server requesting the 10 minute backoff. I'd not advise to reduce that on your own - it's bad behavior, and you risk getting even more rate-limiting policies thrown at you. – Erik Cederstrand Nov 17 '22 at 22:51
  • You cannot easily add `ErrorMailboxStoreUnavailable` to the list of exceptions to retry on, unfortunately. I would probably add a try/except in your own code and retry after sleeping a bit. – Erik Cederstrand Nov 18 '22 at 11:56
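
The suggestions in the comments above could be combined roughly as follows. This is only a sketch under assumptions, not tested against a real server: `retry_after_sleep`, `open_account` and `fetch_messages` are made-up helper names, `server` stands for whatever hostname your mailboxes actually live on, and the `attempts` and `delay` defaults are arbitrary values, not anything EWS recommends. The exchangelib imports sit inside the functions so the plain retry helper can be used on its own.

```python
import time


def retry_after_sleep(func, retry_on, attempts=5, delay=60, sleep=time.sleep):
    """Call func(); if it raises one of `retry_on`, sleep and try again.

    The last exception is re-raised once `attempts` calls have failed.
    `sleep` is injectable only to make the helper easy to test.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(delay)


def open_account(address, credentials, server):
    """Open a mailbox on a known server, skipping autodiscover.

    `server` is an assumption: the hostname shared by all your mailboxes.
    """
    from exchangelib import Account, Configuration, FaultTolerance

    config = Configuration(
        server=server,
        credentials=credentials,
        retry_policy=FaultTolerance(max_wait=600),
    )
    return Account(address, config=config, autodiscover=False)


def fetch_messages(account, start, end):
    """Query all Messages folders of the account in one FolderCollection."""
    from exchangelib.errors import ErrorMailboxStoreUnavailable
    from exchangelib.folders import FolderCollection, Messages

    folders = [f for f in account.root.walk() if isinstance(f, Messages)]

    def run_query():
        # Build a fresh query on every attempt; list() forces it to run
        # here, so errors raised while fetching are caught by the retry.
        qs = (
            FolderCollection(account=account, folders=folders)
            .filter(datetime_received__range=(start, end), sender__exists=True)
            .only("text_body", "datetime_received", "sender")
        )
        return list(qs)

    return retry_after_sleep(run_query, retry_on=ErrorMailboxStoreUnavailable)
```

Note that `list()` loads every matching message into memory and a retry starts the whole query over, so for large mailboxes it is probably better to call `fetch_messages` with smaller date ranges (one month at a time, for example). The server-requested backoff is still handled by the `FaultTolerance` policy; the extra sleep only covers errors that policy does not retry on.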