Naive Bayes in Python

Question

I'm trying to add Laplace smoothing to my Naive Bayes code. It gives me 72.5% accuracy on a 70% train / 30% test split, which is kind of low. Does anyone see anything wrong?

posTotal=len(pos)
negTotal=len(neg)

for w in larr:
  if (w not in pos) or (w not in neg):
    unk[w]+=1
    unkTotal=len(unk)
  else:
    if (w in pos):
      posP+=(math.log10(pos[w])-math.log10(posTotal))
    if (w in neg):
      negP+=(math.log10(neg[w])-math.log10(negTotal))

pos and neg are both defaultdicts.

It would probably help to fix your indentation; the code isn't very readable as posted. (comment, Nov 12 '13 at 16:31)

score 0 · Answer 1 · answered Nov 14 '13 at 00:31


My Python's a little rusty, but for the if, don't you want if (w not in pos) and (w not in neg)? As written, the else branch only adjusts your scores for words that are found in both pos and neg; a word seen in only one class is counted as unknown and ignored.
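A minimal sketch of what that change might look like once add-one (Laplace) smoothing is applied, in case it helps. This is not the asker's full program: the function name log_scores and the toy counts are made up for illustration, and it assumes pos and neg map each word to its training count. It also assumes the class totals should be the sum of the counts rather than len(pos), and it leaves out class priors.

```python
import math
from collections import defaultdict


def log_scores(words, pos, neg):
    """Return (pos_score, neg_score) as add-one smoothed log10 likelihoods.

    Hypothetical helper: assumes pos and neg map word -> training count.
    """
    vocab_size = len(set(pos) | set(neg))  # distinct words seen in training
    pos_total = sum(pos.values())          # total word occurrences per class
    neg_total = sum(neg.values())

    pos_score = neg_score = 0.0
    for w in words:
        # "and", not "or": skip only words unseen in BOTH classes
        if w not in pos and w not in neg:
            continue
        # .get avoids inserting new keys into a defaultdict as a side effect
        pos_score += math.log10((pos.get(w, 0) + 1) / (pos_total + vocab_size))
        neg_score += math.log10((neg.get(w, 0) + 1) / (neg_total + vocab_size))
    return pos_score, neg_score


# Toy counts, invented for this example only
pos = defaultdict(int, {"good": 3, "fine": 1})
neg = defaultdict(int, {"bad": 4, "fine": 1})

p, n = log_scores(["good", "fine", "zzz"], pos, neg)
print("pos" if p > n else "neg")
```

With the +1 in the numerator, a word seen in only one class still contributes to both scores instead of being dropped. Skipping words unseen in both classes is one possible choice; another is to let them fall through and receive the smoothed probability as well.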


Josh
