Summary

I am looking for a way to show staff that if they put just a small percentage of the time and effort they spend e-mailing each other into writing documentation, they would end up with a great support tool. I want to do this by getting word-count statistics out of MS Exchange 2007.

Background

I am helping an organisation I work with get its teams better at documenting the networks and systems they manage. We have put in place a simple, wiki-style documentation system, with a lot of thought given to its design, templates and structure. So far it is working quite well, and now it is time to bring some other IT teams on board.

One of the main issues that staff in these new teams have with providing any sort of documentation is time. They are very busy and perceive that they don't have the time to work on this kind of documentation, even though it has been shown to reduce workload and time-to-restore for incidents in the teams that are already using it.

I figured a powerful way to illustrate that time probably isn't the issue is to show the teams how much time and effort they put into e-mail content each day.

Within our e-mail archives are probably countless nuggets of gold about how systems work and how problems were solved, with valuable information that would help support teams when those systems go bad. If only they had been put into a searchable wiki for everyone to see (using the structure and templates we have provided).

The problem

I need to be able to extract raw data about how many words are typed by individuals in each e-mail that they send, summarised as a total number per day. This is tricky, as each email thread will of course contain copies of the previous e-mails that we don't want to count.
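
To make the counting rule concrete, here is a minimal Python sketch of what I mean by "words typed". It assumes the message bodies have already been exported as plain text (how to get them out of Exchange is the open question), and the reply markers it looks for are my guesses at what mail clients insert, not a complete list. The names new_text and count_new_words are made up for illustration.

```python
import re

# Heuristic markers for where quoted history begins.
# These patterns are assumptions, not an exhaustive list.
REPLY_MARKERS = [
    re.compile(r"^-----\s*Original Message\s*-----", re.IGNORECASE),
    re.compile(r"^From:\s.+", re.IGNORECASE),
    re.compile(r"^On .+ wrote:$", re.IGNORECASE),
]


def new_text(body: str) -> str:
    """Return the text above the first reply marker, skipping '>' quoted lines."""
    kept = []
    for line in body.splitlines():
        stripped = line.strip()
        if any(pattern.match(stripped) for pattern in REPLY_MARKERS):
            break  # everything below is treated as copied history
        if stripped.startswith(">"):
            continue  # inline quoted line
        kept.append(line)
    return "\n".join(kept)


def count_new_words(body: str) -> int:
    """Count whitespace-separated words in the newly typed part of a message."""
    return len(new_text(body).split())
```

This is only a heuristic: a newly typed line that happens to start with "From:" would cut the count short, and HTML bodies would need converting to text first.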

Once we have the statistics, per user per day, we can then use Active Directory group memberships to build totals per day for various teams, which will also anonymise the data somewhat.
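
The roll-up itself is the easy part. A rough sketch, assuming the per-user, per-day counts and the group memberships have already been loaded into plain Python dictionaries (the function name, user names and team names are all hypothetical; in practice the memberships would come from Active Directory):

```python
from collections import defaultdict
from datetime import date


def team_totals(user_counts, team_members):
    """Sum per-user daily word counts into per-team daily totals.

    user_counts:  {(user, day): words}
    team_members: {team: set of users}
    A user who belongs to several teams is counted once in each of them.
    """
    totals = defaultdict(int)
    for (user, day), words in user_counts.items():
        for team, members in team_members.items():
            if user in members:
                totals[(team, day)] += words
    return dict(totals)


# Example with made-up data.
counts = {
    ("alice", date(2012, 5, 8)): 400,
    ("bob", date(2012, 5, 8)): 250,
    ("alice", date(2012, 5, 9)): 100,
}
teams = {"network": {"alice", "bob"}, "unix": {"alice"}}
print(team_totals(counts, teams))
```

Only the team totals would be published, so no individual's count appears in the output.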

What I've tried so far

I've done Google searches till my fingers bled, but I don't have much knowledge of Exchange 2007 (or Windows Server for that matter; I'm a UNIX/Cisco person). I'm not sure where in the stack is the best place to get this information, and I don't know much about the format of the mailbox stores/databases on the mailbox server.

I figure there might be something more useful at the next layer up: query tools, database browsers and the like. That is the guidance I'm looking for.

GarnerCX
  • I don't have enough time at the moment to look into this, but Exchange has a very good Web Service API that you can knock something up with. There's a native .NET library, along with a bunch of PowerShell objects. I'm confident you could do this in C#, but it would mean parsing every single mail message in the user's Sent Items. Code to figure out which part is new and which part is the message being replied to, well, that's far too smart for me :) – Mark Henderson May 09 '12 at 05:32
  • IMHO word counting the staff messages is not a good enough metric. Take, for example, an attachment. There are soooo many things that could bias your data. I would focus my efforts on other ways to convince the staff, for example rewards (or punishments). – drcelus May 09 '12 at 05:58
  • You can assume that this is not the only action. Also, counting messages is not what I want to do. It's counting the words typed. – GarnerCX May 09 '12 at 07:14
  • And how are you going to correlate the word count of all the emails with proving your case, as opposed to all of the emails that contain "Hey, girl! Did you see that crazy show last night!?" or "Sup dude, what do you want to do for lunch?"? – joeqwerty May 09 '12 at 16:00
  • It doesn't matter what the subject is, it's about the volume. – GarnerCX May 09 '12 at 23:58

1 Answer

I think this is what you are looking for.

It's for Exchange 2010 though; I'm not sure whether that kind of functionality exists in Exchange 2007.

Paul Basov