I've scripted the tables, views, and stored procedures of an MS SQL Server database into a directory structure, which I am then indexing with Lucene.net. Most of my table, view, and procedure names contain underscores.
I use the StandardAnalyzer. If I query for a table named tIr_InvoiceBtnWtn01, for example, I receive hits for tIr and for InvoiceBtnWtn01, rather than only for tIr_InvoiceBtnWtn01.
I think the issue is that the tokenizer splits on _ (underscore) because it treats it as punctuation.
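To illustrate what I suspect is happening, here is a toy sketch in Python. It is not Lucene's actual tokenizer or grammar. The `tokenize` function and its `token_chars` parameter are made up for this example. It only shows the difference between treating the underscore as a separator and treating it as part of a token:

```python
import re

def tokenize(text, token_chars=r"A-Za-z0-9"):
    """Toy tokenizer (hypothetical, not Lucene): emit lowercased runs of
    characters that belong to the token_chars character class."""
    return [t.lower() for t in re.findall(f"[{token_chars}]+", text)]

name = "tIr_InvoiceBtnWtn01"

# Underscore is not a token character, so it acts as a separator.
split_tokens = tokenize(name)

# Underscore is added to the token characters, so the name stays whole.
whole_tokens = tokenize(name, r"A-Za-z0-9_")

print(split_tokens)  # ['tir', 'invoicebtnwtn01']
print(whole_tokens)  # ['tir_invoicebtnwtn01']
```

The second behavior is what I would like to get at both index time and query time.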
Is there a (simple) way to remove the underscore from the punctuation list, or is there another analyzer that I should be using for SQL and programming languages?