I am writing a C# method that queries Active Directory for all users and groups whose display name matches the pattern *{displayName}* (a wildcard search with both a leading and a trailing wildcard). The method will be used for an autocomplete field.

The problem is that the method performs very poorly: querying AD takes anywhere from 30 seconds to a full minute, depending on the query string.

My organization's AD is very large, but if it takes this long the autocomplete field will be pointless.

Here is the code I am using right now:

// Initialize the results list.
result.queryResults = new List<Classses.ADSearchObject>();

// Set up domain context.
PrincipalContext pc = new PrincipalContext(ContextType.Domain, Domain, Constants.adQueryUser, Constants.adQueryPassword);

// Set up a directory searcher.
DirectorySearcher dSearcher = new DirectorySearcher();
// Define a SearchCollection to store the results.
SearchResultCollection searchCol;
// Define returned result paging for performance.
dSearcher.PageSize = 1000;
// Define the properties to retrieve
dSearcher.PropertiesToLoad.Add("sAMAccountName");
dSearcher.PropertiesToLoad.Add("displayName");
// Define the filter for users.
dSearcher.Filter = $"(|(&(displayName={result.querystring}*)(objectCategory=person))(&(displayName=*{result.querystring})(objectCategory=person)))";

// Search based on the filter and save the results.
searchCol = dSearcher.FindAll();

// Add the results to the returned object 
foreach (SearchResult searchResult in searchCol)
{
   DirectoryEntry de = searchResult.GetDirectoryEntry();
   // Code to get data from the results...
}

// Define the filter for groups.
dSearcher.Filter = $"(|(&(displayName={result.querystring}*)(objectCategory=group))(&(displayName=*{result.querystring})(objectCategory=group)))";

// Search based on the filter and save the results.
searchCol = dSearcher.FindAll();

// Add the results to the returned object 
foreach (SearchResult searchResult in searchCol)
{
   DirectoryEntry de = searchResult.GetDirectoryEntry();
   // Code to get data from the results...
}

Currently the search is split into users and groups to make it easy to distinguish between them, but if it improves performance substantially I will merge them into a single search.

Edit: As the user rene suggested, I used a Stopwatch to check the time it takes for FindAll and I also checked how long my foreach loops take.

I found out that the FindAll calls take about 100ms (very fast), even when searching with a leading wildcard (which isn't indexed by AD).

Apparently the calls that take longest are my foreach loops which take about 40 seconds (40,000ms).

I am updating the question with the code block in my foreach loops as I haven't figured out how to improve its performance:

// --- I started a stopwatch here
foreach (SearchResult searchResult in searchCol)
{
   // --- I stopped the stopwatch here and noticed it takes about 30,000ms
   result.code = 0;

   DirectoryEntry de = searchResult.GetDirectoryEntry();

   ADSearchObject adObj = new ADSearchObject();

   adObj.code = 0;

   if (de.Properties.Contains("displayName"))
   {
      adObj.displayName = de.Properties["displayName"].Value.ToString();
   }

   adObj.type = "user";

   result.queryResults.Add(adObj);
}

Note where I started and stopped my `Stopwatch` in the updated code. I don't know why entering the loop takes so long.

Mor Paz
  • It's the way it works when you deal with Microsoft's AD. Maybe you can think of synchronizing the AD tree with some kind of DB that will allow you to perform queries in milliseconds, instead of querying AD every time. – Pablo Recalde Feb 05 '17 at 13:37
  • Wouldn't `(&(displayName=*{result.querystring}*)(objectCategory=person))` do the same as your filter? And can you StopWatch the `FindAll` calls? Are those taking the most time? – rene Feb 05 '17 at 13:54
  • @rene please look at the edit I made to the post, I added times taken from `Stopwatch` I opened at different parts of the code – Mor Paz Feb 05 '17 at 15:43
  • Retrieve exactly what you need with a single call to `FindAll()`. Don't use `GetDirectoryEntry()` to retrieve full entries one by one without reason. No wonder you get old staring at your screen. – marabu Feb 05 '17 at 17:01
  • @marabu please refer to my edit, the performance hit occurs when entering the `foreach` loop. After the initial entry to the loop, performance is as fast as you might expect, and the call to `GetDirectoryEntry()` doesn't take particularly long – Mor Paz Feb 05 '17 at 22:52
  • I'm pretty sure _displayName_ is not indexed in AD by default, so the substring search _(displayName=*{result.querystring}*)_ results in a "full table scan" (to borrow an expression from databases). You may use _(cn=*{result.querystring}*)_ as the filter instead (if possible), as this attribute is indexed by default. – Bernhard Thalmayr Feb 06 '17 at 07:57
  • DirectorySearcher internally uses pagination, so on result enumeration another page may be loaded. To increase performance you can try using LDAP classes from System.DirectoryServices.Protocols namespace as DirectorySearcher is based on ADSI API which uses LDAP under the hood. Also you can look at Ambiguous Name Resolution (ANR) Active Directory feature – oldovets Feb 09 '17 at 20:05

1 Answer

Of course, a substring match is more costly than an equality match for a unique value. It is also no surprise that the lion's share of the elapsed time falls inside your iterator block, which takes 40s overall according to your profiling.

You may be convinced that the huge drop in performance occurs just by setting up the iterator. I'm not, and that's because of your choice of timing points. To time only the enumeration itself, stop the clock while the loop body runs:

StartClock("foreach");
foreach (SearchResult searchResult in searchCol)
{
    // use an empty block to speed things up or
    StopClock("foreach");
    // whatever
    RestartClock("foreach");
}
StopClock("foreach");
LogClock("foreach");
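The `StartClock`/`StopClock` helpers above are placeholders. A minimal sketch of the same measurement with `System.Diagnostics.Stopwatch` might look like the following; `LoopTiming` and `MeasureEnumeration` are hypothetical names introduced here for illustration, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class LoopTiming
{
    // Enumerates 'source' and returns only the time spent in the iterator
    // (GetEnumerator, MoveNext, Dispose). The clock is paused while 'body' runs.
    public static TimeSpan MeasureEnumeration<T>(IEnumerable<T> source, Action<T> body)
    {
        Stopwatch clock = Stopwatch.StartNew();
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            while (e.MoveNext())
            {
                clock.Stop();      // exclude the loop body from the measurement
                body(e.Current);
                clock.Start();     // resume before the next MoveNext
            }
        }
        clock.Stop();
        return clock.Elapsed;
    }
}
```

Since `SearchResultCollection` only implements the non-generic `IEnumerable`, it could be passed as `searchCol.Cast<SearchResult>()` (with `using System.Linq;`). Comparing the result against a second stopwatch around the whole loop shows how the time splits between fetching results and processing them.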

I expect a huge performance gain (for large numbers of entries) if you follow a best practice I already mentioned in a comment: send a single request to the server that returns everything you need in the search result, and don't send another request for each item. While a single call to GetDirectoryEntry() may only take <1ms, the large number of entries will make your code useless for your application's autocomplete feature.
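As a rough sketch of that advice, the loop can read the attributes requested through `PropertiesToLoad` straight from `SearchResult.Properties`, so no `DirectoryEntry` is bound per item. The class name `AdAutocomplete`, the method `FindDisplayNames`, and the `maxResults` cap are assumptions made for this example; credentials and search root are left at their defaults.

```csharp
using System.Collections.Generic;
using System.DirectoryServices;

public static class AdAutocomplete
{
    // Returns display names matching 'filter' from a single search.
    // 'maxResults' is a hypothetical cap; an autocomplete list rarely needs many entries.
    public static List<string> FindDisplayNames(string filter, int maxResults = 20)
    {
        var names = new List<string>();

        using (var searcher = new DirectorySearcher())
        {
            searcher.Filter = filter;
            searcher.SizeLimit = maxResults;
            // Only these attributes are returned with each search result.
            searcher.PropertiesToLoad.Add("displayName");
            searcher.PropertiesToLoad.Add("sAMAccountName");

            using (SearchResultCollection results = searcher.FindAll())
            {
                foreach (SearchResult sr in results)
                {
                    // Read from the result itself instead of calling GetDirectoryEntry().
                    if (sr.Properties.Contains("displayName") &&
                        sr.Properties["displayName"].Count > 0)
                    {
                        names.Add(sr.Properties["displayName"][0].ToString());
                    }
                }
            }
        }

        return names;
    }
}
```

Disposing the `SearchResultCollection` is deliberate here, since it holds unmanaged resources. Whether `SizeLimit` is acceptable depends on how many suggestions the autocomplete field should show.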

Kudos to @rene for presenting a normal form for that filter expression. I don't know about filter optimization in Active Directory, so I would take the sure path with

(&(objectCategory=person)(displayName=*{result.querystring}*))
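Because the filter interpolates text typed by the user, it may be worth escaping the characters that are special in LDAP filters (RFC 4515) before building it. The sketch below assumes a hypothetical helper class `LdapFilter`; it is not part of `System.DirectoryServices`.

```csharp
using System.Text;

public static class LdapFilter
{
    // Escapes the characters that RFC 4515 reserves in filter values.
    public static string Escape(string value)
    {
        var sb = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            switch (c)
            {
                case '\\': sb.Append(@"\5c"); break;
                case '*':  sb.Append(@"\2a"); break;
                case '(':  sb.Append(@"\28"); break;
                case ')':  sb.Append(@"\29"); break;
                case '\0': sb.Append(@"\00"); break;
                default:   sb.Append(c); break;
            }
        }
        return sb.ToString();
    }

    // Builds the filter shown above, with the user's term escaped.
    public static string ForDisplayName(string term)
    {
        return $"(&(objectCategory=person)(displayName=*{Escape(term)}*))";
    }
}
```

With this in place, `dSearcher.Filter = LdapFilter.ForDisplayName(result.querystring);` would keep a stray `(` or `*` in the input from changing the meaning of the filter.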
marabu