
I've been working through a problem from my machine learning class that I can't seem to figure out.

The gist of the algorithm, if I'm understanding it correctly, is:

Expectation:

• For each sentence s in S:
    ○ For each word/tag pair (w,t):
        § For every occurrence of w (at position i) in s:
            □ EmissionCounts(w,t) += (forward[t][i]*backward[t][i])/(sum of forward[tag][N] for all tags)
    ○ For every tag/tag pair:
        § For every adjacent pair of words (starting at position i):
            □ TransitionCounts(t1,t2) += forward[t1][i]*P(t2|t1)*P(w[i+1]|t2)*backward[t2][i+1] / (sum of forward[tag][N] for all tags)
    ○ For every tag:
        § For the first word in the sentence:
            □ InitialCounts(t) += pi(t)*P(w[1]|t)*backward[t][1] / (sum of forward[tag][N] for all tags)
• For each tag t:
    ○ For every word w:
        § TagCounts(t) += EmissionCounts(w,t)
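
To make the expectation step concrete, here is a minimal Python sketch of the per-sentence counts as described above. It is not the code from the gist: the function names, the dict-based layout (`trans[t1][t2]`, `emit[t][w]`), and the 0-based positions are assumptions made for illustration, and it assumes the sentence has nonzero probability under the current model.

```python
from collections import defaultdict


def forward_backward(words, tags, pi, trans, emit):
    """Unscaled forward/backward tables, indexed as table[tag][position]."""
    n = len(words)
    fwd = {t: [0.0] * n for t in tags}
    bwd = {t: [0.0] * n for t in tags}
    # Forward pass: fwd[t][i] = P(w[0..i], tag at i = t)
    for t in tags:
        fwd[t][0] = pi[t] * emit[t].get(words[0], 0.0)
    for i in range(1, n):
        for t2 in tags:
            incoming = sum(fwd[t1][i - 1] * trans[t1][t2] for t1 in tags)
            fwd[t2][i] = incoming * emit[t2].get(words[i], 0.0)
    # Backward pass: bwd[t][i] = P(w[i+1..] | tag at i = t)
    for t in tags:
        bwd[t][n - 1] = 1.0
    for i in range(n - 2, -1, -1):
        for t1 in tags:
            bwd[t1][i] = sum(
                trans[t1][t2] * emit[t2].get(words[i + 1], 0.0) * bwd[t2][i + 1]
                for t2 in tags
            )
    return fwd, bwd


def expected_counts(words, tags, pi, trans, emit):
    """Expected emission, transition and initial counts for one sentence."""
    n = len(words)
    fwd, bwd = forward_backward(words, tags, pi, trans, emit)
    # Sentence likelihood: sum of forward[tag][N] over all tags
    z = sum(fwd[t][n - 1] for t in tags)
    emission = defaultdict(float)
    transition = defaultdict(float)
    initial = defaultdict(float)
    for i, w in enumerate(words):
        for t in tags:
            emission[(w, t)] += fwd[t][i] * bwd[t][i] / z
    for i in range(n - 1):
        for t1 in tags:
            for t2 in tags:
                transition[(t1, t2)] += (
                    fwd[t1][i] * trans[t1][t2]
                    * emit[t2].get(words[i + 1], 0.0) * bwd[t2][i + 1] / z
                )
    for t in tags:
        initial[t] += pi[t] * emit[t].get(words[0], 0.0) * bwd[t][0] / z
    return emission, transition, initial
```

A quick sanity check on any implementation of this step: for a sentence of N words, the emission counts should sum to N, the transition counts to N-1, and the initial counts to 1. Note that this sketch uses raw probabilities with no scaling or log-space arithmetic, so it can underflow on long sentences.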

Maximization:

• PI(t) = InitialCounts(t)/(# sentences)
• P(t2|t1) = TransitionCounts(t1,t2)/TagCounts(t1)
• P(w|t) = EmissionCounts(w,t)/TagCounts(t)
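
The three update formulas above can be sketched as follows. Again this is only an illustration, not the gist's code: the `maximize` name and the tuple-keyed dicts `(w, t)` and `(t1, t2)` are assumptions, and tags with zero expected count are simply skipped.

```python
from collections import defaultdict


def maximize(emission, transition, initial, num_sentences):
    """Re-estimate PI(t), P(t2|t1) and P(w|t) from accumulated counts."""
    # TagCounts(t) = sum over words of EmissionCounts(w, t)
    tag_counts = defaultdict(float)
    for (w, t), count in emission.items():
        tag_counts[t] += count

    pi = {t: count / num_sentences for t, count in initial.items()}
    trans = {
        (t1, t2): count / tag_counts[t1]
        for (t1, t2), count in transition.items()
        if tag_counts[t1] > 0
    }
    emit = {
        (w, t): count / tag_counts[t]
        for (w, t), count in emission.items()
        if tag_counts[t] > 0
    }
    return pi, trans, emit
```

One thing worth checking with this normalization: TagCounts(t1) includes the last position of every sentence, which has no outgoing transition, so the values of P(t2|t1) for a fixed t1 can sum to less than 1. Dividing by the sum of TransitionCounts(t1, t2) over all t2 instead makes each row sum to 1.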

Check for convergence:
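
The stopping criterion isn't spelled out here. One common choice, assumed in this sketch, is to stop when the total log-likelihood of the corpus changes by less than a small tolerance between iterations. The function names and the default tolerance are hypothetical.

```python
import math


def total_log_likelihood(sentence_likelihoods):
    """Sum of log P(sentence), where each P is sum of forward[tag][N] over tags."""
    return sum(math.log(z) for z in sentence_likelihoods)


def has_converged(prev_ll, curr_ll, tol=1e-4):
    """True once the log-likelihood moved by less than tol since the last iteration."""
    if prev_ll is None:  # first iteration: nothing to compare against yet
        return False
    return abs(curr_ll - prev_ll) < tol
```

Since EM should never decrease the log-likelihood, a drop between iterations is also a useful signal that something in the E or M step is off.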

Here's a link to my Baum-Welch implementation. Does anyone have any ideas as to what I may be doing wrong?

https://gist.github.com/dmcquillan314/4058b9048799e3488a05

Here's a link to the entire repo it's from as well: https://github.com/dmcquillan314/HW6

dmcqu314
  • What makes you think you've got it wrong? Are you getting some sort of an error? Incorrect output? – seaotternerd May 03 '15 at 19:32
  • Per the assignment description, it's supposed to reach ~90% accuracy. It also seems strange to me that, for every sentence, it tags every term as ",". – dmcqu314 May 04 '15 at 16:23
