In my C# application I need to filter emails based on their content. If an email is a double opt-in, I need to send it to one specified email address; if it's a normal email, I should send it to a different address.
I looked at the incoming emails and made a list of common words that appear in the subject of double opt-in emails (10-20 words at most). For each incoming email I checked whether the subject contained some of those words, and if it contained more than 2-3 of them, depending on the subject length, I classified it as an opt-in. The problem is that this basic version didn't work well.
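For reference, here is a minimal sketch of the check described above, not my exact code. The keyword list, the 8-word cutoff between "short" and "long" subjects, and the names `OptInFilter` and `LooksLikeOptIn` are placeholders chosen for illustration, and I am reading "more than 2-3" as "at least 2 hits for short subjects, at least 3 for long ones".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OptInFilter
{
    // Placeholder keywords (assumption); the real list has 10-20 words
    // collected from actual double opt-in subjects.
    private static readonly HashSet<string> Keywords = new HashSet<string>(
        new[] { "confirm", "confirmation", "subscription", "subscribe", "verify", "activate" },
        StringComparer.OrdinalIgnoreCase);

    // Characters used to split the subject into words.
    private static readonly char[] Separators =
        { ' ', '\t', ',', '.', ':', ';', '!', '?', '-', '(', ')', '"', '\'' };

    // Returns true when the subject contains enough distinct keywords.
    public static bool LooksLikeOptIn(string subject)
    {
        if (string.IsNullOrWhiteSpace(subject))
        {
            return false;
        }

        string[] words = subject.Split(Separators, StringSplitOptions.RemoveEmptyEntries);

        // Count each keyword once, ignoring case.
        int hits = words
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .Count(w => Keywords.Contains(w));

        // Assumed rule: 2 hits for subjects of up to 8 words, 3 hits otherwise.
        int threshold = words.Length <= 8 ? 2 : 3;
        return hits >= threshold;
    }
}
```

With these placeholder keywords, `OptInFilter.LooksLikeOptIn("Please confirm your subscription")` matches two keywords and is treated as an opt-in, while "Your order confirmation" matches only one and is not. Exact word matching like this misses variants such as "confirming" or "subscribed", which may be part of why the approach performs poorly.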
I read about spam filters (what I want to do is basically similar), and after searching the web for examples I found some based on Bayesian networks. The problem with that approach is that I would need to feed in a lot of training material, which I don't have yet.
How can I filter these emails based on content plus subject, or just the subject, without needing a lot of training material?
EDIT: I want to do the filtering at the email server level.