We have a program that keeps a local copy of a potentially large number of mailboxes by monitoring configurable journal streams through the Exchange Web Services (EWS) Managed API. These journal streams give us an efficient way to ingest the incoming and outgoing mail for thousands of users.

Periodically we need to check if specific mail messages are still present in Exchange.

We currently do this by using FindItems filtered by the InternetMessageID (which we can get from the envelope in the journal):

var internetMessageIds = new [] {
    "<8ADA5FF6B7C3AB4B8F9BC5AF78206B42010AD1A3@....>",
    "<04A2D0BEB32B69458296B3E48F75732D014F601E@...>",
    "<F04A6B3F2A8AFC42BF4AC18C596A9810749CC4@...>"
};
var filter = new SearchFilter.SearchFilterCollection(
    LogicalOperator.Or, 
    internetMessageIds.Select(imid => 
        new SearchFilter.IsEqualTo(EmailMessageSchema.InternetMessageId, imid)
    )
);

var exchange = new ExchangeService { PreAuthenticate = true, Credentials = ... };
exchange.ImpersonatedUserId = new ImpersonatedUserId(ConnectingIdType.SmtpAddress, mailbox);

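var folderView = new FolderView(1000) { 
    Traversal = FolderTraversal.Deep, 
    // DisplayName must be requested explicitly; with IdOnly alone, reading it below would throw
    PropertySet = new PropertySet(BasePropertySet.IdOnly, FolderSchema.DisplayName) 
};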
var folders = exchange.FindFolders(WellKnownFolderName.Root, folderView)
    .Where(f => f.DisplayName == "AllItems" || f.DisplayName == "Deletions" || f.DisplayName == "Purges");
if (folders.Count() != 3) {
    // fall back to searching all folders
    folders = exchange.FindFolders(WellKnownFolderName.Root, folderView);
}
foreach (var folder in folders) {
    var findItemsResult = exchange.FindItems(folder.Id, filter, new ItemView(10));
    foreach (var itemResult in findItemsResult) {
        Console.WriteLine(
            "InternetMessageID={0}, UniqueId={1}", 
            ((EmailMessage)itemResult).InternetMessageId, 
            itemResult.Id.UniqueId
        );
    }
}

As you can see we loop through multiple folders and call FindItems on each.

Initially we always looped through all folders, but we recently optimized the code to use the All Items search folder (if it exists).

That optimization made our "does this message still exist in Exchange?" check faster, but it is still too expensive to run against any non-trivial number of messages.

  • Checking 3 messages takes about 3 seconds (after establishing the connection and impersonation) when the All Items folder exists.
  • Compare that to calling BindToItems with the UniqueId, which takes less than half a second in my tests.
    • But the UniqueID of a message is not available from the envelope in the journal stream.
    • And even if it were, the UniqueID of a message changes when the user moves it to a different folder in their mailbox.
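
For reference, the faster lookup mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code used in the timing tests: the class and method names (ItemExistenceCheck, FindExisting) are hypothetical, and it assumes the UniqueId values are already known and that the ExchangeService is connected and impersonating the mailbox owner.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.Exchange.WebServices.Data;

static class ItemExistenceCheck
{
    // Hypothetical helper: returns the subset of uniqueIds that Exchange can still bind to.
    public static ISet<string> FindExisting(ExchangeService exchange, IEnumerable<string> uniqueIds)
    {
        var ids = uniqueIds.Select(id => new ItemId(id)).ToList();
        var existing = new HashSet<string>();
        if (ids.Count == 0)
        {
            return existing; // nothing to ask the server about
        }

        // IdOnly keeps each response small; only success or failure per item matters here.
        ServiceResponseCollection<GetItemResponse> responses =
            exchange.BindToItems(ids, new PropertySet(BasePropertySet.IdOnly));

        // Assumption: responses come back in the same order as the requested ids.
        for (int i = 0; i < responses.Count; i++)
        {
            if (responses[i].Result == ServiceResult.Success)
            {
                existing.Add(ids[i].UniqueId);
            }
        }
        return existing;
    }
}
```

Items that fail to bind (for example with ServiceError.ErrorItemNotFound) are simply left out of the returned set, so a moved item would look the same as a deleted one, which is exactly the limitation described in the bullets above.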

Is there a more efficient way to check if a message we've captured from the journal still exists in the recipient's mailbox in Exchange?

Frank van Puffelen

1 Answer

How often do you need to check for existence? Can you give the number of mailboxes and the journal size in number of items? What triggers an existence check (a property value, a time/date, etc.), and when does that trigger occur?

Here is what I might investigate:

  1. eDiscovery - if you have this, then I'd first try the SearchMailboxes operation. You can search many mailboxes in a single call; having to make one call per mailbox is what is really slowing you down. The query structure is based on AQS, or it might be KQL. Unfortunately, I don't think you can search on InternetMessageID (I'm not sure how or if that property is indexed, which may be why your current search performs the way it does). Perhaps there is some way you can use the values of other properties (or a combination of properties). Once you have a preliminary result set, you could use your filter above to confirm whether those items match on InternetMessageID.
  2. If you don't have eDiscovery (it requires Exchange 2013 or later), I'd try searching on some combination of properties extracted from the journal (particularly the text properties, since they are usually indexed), and then, if you have hits, perform the BindToItems.
  3. If you know ahead of time which items need to be tracked, you could use notifications to monitor mailbox items. That way you'd know when an item no longer exists in the mailbox.
  4. The Search-Mailbox cmdlet is also worth a look.
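
As a rough sketch of option 2, something like the following might work: run an indexed query-string search on a text property first, then confirm the candidates against the InternetMessageId. The class and method names (JournalExistenceCheck, ExistsInFolder) are hypothetical, the choice of subject as the search property is an assumption, and I have not measured whether this is actually faster than the filter in the question.

```csharp
using System;
using System.Linq;
using Microsoft.Exchange.WebServices.Data;

static class JournalExistenceCheck
{
    // Hypothetical helper: true if a message with the given InternetMessageId is among
    // the first page of items in the folder whose subject matches the query.
    public static bool ExistsInFolder(
        ExchangeService exchange, FolderId folderId, string subject, string internetMessageId)
    {
        var view = new ItemView(50)
        {
            // Request only what is needed to confirm the match.
            PropertySet = new PropertySet(BasePropertySet.IdOnly, EmailMessageSchema.InternetMessageId)
        };

        // Query-string (AQS) search on the subject; embedded quotes are replaced with spaces
        // as a crude way to keep the query well-formed.
        string query = "subject:\"" + subject.Replace("\"", " ") + "\"";
        FindItemsResults<Item> candidates = exchange.FindItems(folderId, query, view);

        // Confirm each candidate by comparing the InternetMessageId from the journal envelope.
        return candidates.OfType<EmailMessage>().Any(message =>
        {
            object value;
            return message.TryGetProperty(EmailMessageSchema.InternetMessageId, out value)
                && string.Equals(value as string, internetMessageId, StringComparison.Ordinal);
        });
    }
}
```

Only the first 50 hits are inspected here; a very common subject line would need paging through the results or a more selective query (for example, combining subject with sender or sent date).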

I hope this helps.

Michael Mainer
  • Hi Michael, thanks for your thorough response. We don't need to check very frequently per message, but given the number of mailboxes (thousands) we expect to be checking a multitude of messages every night. Would a search like you describe in #3 likely be faster than filtering on InternetMessageID across three folders or a search folder? – Frank van Puffelen Jul 30 '14 at 14:31
  • I'm not familiar with the Search-Mailbox cmdlet. I only know of its existence. I'm sorry that I can't elaborate on its use. – Michael Mainer Jul 30 '14 at 17:35