Context

  • Active Directory instance with 200k Users in an OU

  • Potential to grow to up to 1M users in our domain

I'm not an expert in this area. I am trying to come up with a solution design for a scenario where I need to poll AD and

  • Check PwdLastSet and LastLogonDate property

  • Take 6 different decisions (strategies) based on their values per user (lock account, send email etc.)

Ideally, if this were a database, I would have the option to

  • Open a connection
  • Read a small page worth of data
  • Close connection
  • Process them in memory via the application
  • Repeat

(and thus leave it to the connection pool to juggle stuff and allow others to do their stuff).

I am really interested in knowing the best practice / approach in this case, one that is scalable. I only need to fetch those 2 properties for all users (of course we have filters, e.g. to exclude inactive users).

Personally, I was wondering if I should

  • Use our custom scheduler service to run PowerShell (or .NET), use DirectorySearcher, open a connection (SSL), read 100 / 1000 users at a time using paging and process them in memory. The connection to AD remains open throughout.

  • Open a connection to AD, get a dump of all users in a CSV (paged), close the connection, and write that to a database for other tasks to process. But then this will have to be a nightly job with potentially high-volume writes to the DB.

  • Replicate those two properties in a database table and keep them in sync via our application whenever they change in AD. Consume data from there.

and so on.

Suggestions?

Ansgar Wiechers
MSI

1 Answer

Filter at source wherever possible. Return minimal property sets wherever possible.

You could leverage DirectorySynchronization, but I would say there is only value in doing so where you have a significant amount of client-side calculation to do. I use this technique to manage photos in AD. I have an offline synchronized set which stores each photo as a hash, so I only update where required, and when I update I only pull down the changes made in the directory since the last execution.
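The hash comparison described above can be sketched roughly as follows. This is a minimal illustration, not the answerer's actual implementation: the `changed_keys` helper, the use of SHA-256, and the dictionary-based cache are all assumptions made for the example.

```python
import hashlib
from typing import Dict, List


def digest(value: bytes) -> str:
    """Hash an attribute value so the cache does not need to hold the raw bytes."""
    return hashlib.sha256(value).hexdigest()


def changed_keys(cache: Dict[str, str], incoming: Dict[str, bytes]) -> List[str]:
    """Return the keys whose value differs from the cached hash, updating the cache.

    `cache` maps a (hypothetical) object key, e.g. a distinguished name, to the
    hash recorded on the previous run; `incoming` holds the freshly read values.
    """
    changed = []
    for key, value in incoming.items():
        current = digest(value)
        if cache.get(key) != current:
            cache[key] = current
            changed.append(key)
    return changed
```

Only the keys returned by `changed_keys` would need further processing, which is where the client-side saving comes from.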

For actions based on pwdLastSet or lastLogonTimeStamp I will always generate (LDAP) filters that allow me to ask AD for the smallest result set. I will always request the smallest number of attributes I actually need to work with.
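As a rough sketch of what such a filter might look like: pwdLastSet and lastLogonTimestamp are stored as 100-nanosecond ticks since 1601-01-01 UTC, so a cutoff date has to be converted before it can go into the filter. The helper names (`to_filetime`, `stale_user_filter`) and the choice to exclude disabled accounts are assumptions for illustration, not something prescribed above.

```python
from datetime import datetime, timedelta, timezone

# Windows FILETIME epoch: 100-nanosecond ticks counted from 1601-01-01 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def to_filetime(moment: datetime) -> int:
    """Convert a timezone-aware datetime to the integer form AD stores."""
    delta = moment - FILETIME_EPOCH
    # Integer arithmetic avoids float rounding at this magnitude.
    return (delta.days * 86_400 + delta.seconds) * 10_000_000 + delta.microseconds * 10


def stale_user_filter(attribute: str, older_than_days: int, now: datetime) -> str:
    """Build a filter for enabled users whose `attribute` is at or before the cutoff."""
    cutoff = to_filetime(now - timedelta(days=older_than_days))
    return (
        "(&(objectCategory=person)(objectClass=user)"
        # Bitwise-AND matching rule: exclude accounts with the disabled flag (2).
        "(!(userAccountControl:1.2.840.113556.1.4.803:=2))"
        f"({attribute}<={cutoff}))"
    )


# Request only the attributes the job actually needs.
ATTRIBUTES = ["pwdLastSet", "lastLogonTimestamp"]
```

For example, `stale_user_filter("pwdLastSet", 90, datetime.now(timezone.utc))` yields a filter for enabled users whose password was last set 90 or more days ago. Note that a pwdLastSet of 0 (user must change password at next logon) would also match a `<=` comparison, so that case may need separate handling.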

Chris Dent
  • Thanks for the comment. My only concern is that even the smallest set, after the filter is applied, could end up being 100k / 200k+ records or more. Really keen to know what the best practice is for that. Keep the connection to AD open until they have all been read? Is that feasible? Is it OK to establish an SSL connection to AD and use the searcher to read those two properties for, say, 800k users in one go? – MSI Aug 10 '16 at 08:29
  • @MSI Yes, it's feasible. Assuming you only retrieve the minimum number of properties, 200k+ objects is not an insane amount – Mathias R. Jessen Aug 10 '16 at 08:31
  • AD should be perfectly capable of returning that for 800k users; however, I would very seriously consider dropping ADSI (used by the DirectorySearcher) and going straight for System.DirectoryServices.Protocols. It's faster, more efficient, newer, and gives you a much greater degree of control. – Chris Dent Aug 10 '16 at 08:38
  • A caveat needs to be attached to the "it's faster" claim. The improvements there are client-side; AD itself isn't aware of the difference, because S.DS.P and S.DS (ADSI) both work with LDAP at the bottom of the stack. There's a nice diagram of this here: https://msdn.microsoft.com/en-us/library/bb332056.aspx – Chris Dent Aug 10 '16 at 08:44
  • Thanks guys, so are we saying it's perfectly fine to do a paged read under a single open connection for such a volume? I am still mining the web to get a better understanding of how a PrincipalContext / DirectoryContext connection behaves as opposed to a DB connection :) – MSI Aug 10 '16 at 10:53
  • I don't see why not. You're not asking AD to do anything it's not designed to do. Once you've been as polite as you can to the domain controllers by making your request exactly as small as you need, the remaining considerations tend to move to the client-side implementation. – Chris Dent Aug 10 '16 at 11:10
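The paged read over a single connection discussed in these comments follows the LDAP simple paged results pattern: the client sends a page size plus the cookie from the previous response, and an empty cookie signals the last page. The sketch below shows only that loop. `make_fake_directory` is a hypothetical in-memory stand-in for a domain controller, not a real LDAP client, and the page size of 1000 is an assumption that happens to match AD's default MaxPageSize.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Entry = Dict[str, int]
# (cookie, page_size) -> (entries, next_cookie); an empty cookie means "no more pages".
PageFetcher = Callable[[bytes, int], Tuple[List[Entry], bytes]]


def iter_entries(fetch_page: PageFetcher, page_size: int = 1000) -> Iterator[Entry]:
    """Yield entries one page at a time, so only one page is held in memory."""
    cookie = b""
    while True:
        entries, cookie = fetch_page(cookie, page_size)
        yield from entries
        if not cookie:
            break


def make_fake_directory(total: int) -> PageFetcher:
    """Hypothetical stand-in for a server: the cookie simply encodes the next offset."""
    def fetch_page(cookie: bytes, page_size: int) -> Tuple[List[Entry], bytes]:
        start = int(cookie or b"0")
        end = min(start + page_size, total)
        entries = [{"pwdLastSet": i, "lastLogonTimestamp": i} for i in range(start, end)]
        next_cookie = str(end).encode() if end < total else b""
        return entries, next_cookie
    return fetch_page
```

Because `iter_entries` is a generator, a caller can apply its per-user decisions as entries arrive, for example `for entry in iter_entries(make_fake_directory(200_000)): ...`, rather than collecting the full result set first.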