Below is a sample email header as defined by RFC 822, RFC 2822 and MIME. I want to build full-text search over messages like this using Lucene. If I use the standard analyzer, it creates too many useless tokens from these headers, which will degrade performance. Is there a way to produce better tokens by writing a custom analyzer and tokenizer?
From webmaster@email.marketingmag.ca
Microsoft Mail Internet Headers Version 2.0
Received: from sdlasd02.medicis.com ([172.23.163.35]) by mpc-exchange.medicis.com with Microsoft SMTPSVC(6.0.3790.3959); Mon, 1 Jun 2009 04:30:59 -0700
Received: from mail pickup service by sdlasd02.medicis.com with Microsoft SMTPSVC; Mon, 1 Jun 2009 04:30:59 -0700
Received: from SDLMAIL01.medicis.com ([98.175.1.32]) by sdlasd02.medicis.com with Microsoft SMTPSVC(6.0.3790.1830); Mon, 1 Jun 2009 04:30:59 -0700
Return-Path: bo-buhbpmfbpgh9f6axbzpa2ae1achzvh@b.email.marketingmag.ca
X-CTCH-ID: CFBA793F-FB3C-4DEB-A504-C6165B493680
X-CTCH-RefID: str=0001.0A090202.4A23BBF3.009A,ss=1,fgs=0
X-CTCH-Action: Ignore
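
To make the goal concrete, here is a minimal sketch of the kind of pre-processing I have in mind. It is not a Lucene analyzer: it uses only the Java standard library to unfold continuation lines and drop header fields that are unlikely to be searched, before any text reaches an analyzer. The class name `HeaderFilter`, the skip list (`Received`, `Return-Path`) and the rule that every `X-` header is noise are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class HeaderFilter {
    // Hypothetical skip list: fields assumed to carry no search value.
    private static final Set<String> SKIPPED = Set.of("received", "return-path");

    static boolean isSkipped(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        return SKIPPED.contains(lower) || lower.startsWith("x-");
    }

    /** Unfolds continuation lines and returns the kept fields as lower-cased name -> values. */
    static Map<String, List<String>> indexableHeaders(String rawHeaders) {
        // Unfold: a line starting with a space or tab continues the previous field.
        List<String> fields = new ArrayList<>();
        for (String line : rawHeaders.split("\r?\n")) {
            if (line.isEmpty()) {
                break; // a blank line ends the header block
            }
            boolean continuation = line.charAt(0) == ' ' || line.charAt(0) == '\t';
            if (continuation && !fields.isEmpty()) {
                int last = fields.size() - 1;
                fields.set(last, fields.get(last) + " " + line.trim());
            } else {
                fields.add(line);
            }
        }

        Map<String, List<String>> kept = new LinkedHashMap<>();
        for (String field : fields) {
            int colon = field.indexOf(':');
            if (colon <= 0) {
                continue; // no "Name:" prefix, e.g. the leading "From " line
            }
            String name = field.substring(0, colon).trim();
            // A field name cannot contain whitespace; this rejects stray wrapped lines.
            if (name.isEmpty() || name.chars().anyMatch(Character::isWhitespace)) {
                continue;
            }
            if (isSkipped(name)) {
                continue;
            }
            kept.computeIfAbsent(name.toLowerCase(Locale.ROOT), k -> new ArrayList<>())
                .add(field.substring(colon + 1).trim());
        }
        return kept;
    }
}
```

The surviving values could then be added to the index as separate fields, possibly with a different analyzer per field (Lucene's `PerFieldAnalyzerWrapper` exists for that purpose), but I am not sure whether this is the right approach or whether the filtering belongs inside a custom tokenizer instead.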