
Edit:
The accompanying code had a few bugs that were otherwise unrelated to the question as titled. The answer is simple enough, so I'm eliminating the irrelevant code so that the question, as asked in the title, and answer are more clearly intelligible to those searching for such things. Thanks to everyone who took the time to read the code and give me some feedback!

Original question, abridged: I seem to be having trouble when my Python script tries to access the dict entry '"__main__":', which is keyed by '__name__ =='. Is my problem related to use of these strings as variables, or is it more likely that my script is failing elsewhere? (SPOILER: My algorithm was wrong.)

eenblam
  • To answer your actual question: yes, it's a hashable (string) object, therefore it can be a key. Your 'error' is a `KeyboardInterrupt` - what exactly is the problem? – jonrsharpe Apr 28 '14 at 16:51
  • The problem is that it hangs (seemingly indefinitely) whenever the 2-gram "__name__ ==" occurs. I've replicated it numerous times. I set the default probability for every possible word that could occur to 1.0, so I would think that the algorithm should find an acceptable word very quickly. – eenblam Apr 28 '14 at 17:02
  • It just occurred to me that I haven't accounted for redundant quotes, as we would see in `__name__ == "__main__"`. Investigating now. – eenblam Apr 28 '14 at 17:05
  • When debugging a problem like this, start by confirming *exactly* what values are being used in `gibbs_sample_data`, where I assume you are getting stuck in an infinite loop because `probability` is something that is always less than the return value of `random.random()`. – chepner Apr 28 '14 at 17:08
  • Without analyzing specifically why it seems to consistently fail for a certain string, I can tell you that your random selection algorithm has an inherent flaw that can lead to an infinite loop. Instead of just trying again when an invalid selection is made, you should rewrite it so that invalid selections *aren't* among the possible choices. – nmclean Apr 28 '14 at 17:10
  • @chepner The only word that **should** follow `__name__ ==` is `"__main__"`, with a probability of 1.0. I set **all the other probabilities** (for every word in the vocabulary) to 1.0 as well, and I still saw no resolution. – eenblam Apr 28 '14 at 17:11
  • @nmclean My previous algorithm didn't implement Gibbs sampling; it just used random.sample() on the known words that followed. The problem I ran into was as follows: for a given n-gram, determining the next word is easy, but ensuring that it wouldn't produce a dead end, so to speak, was less practical. The Gibbs sampling implementation was meant to account for instances where nothing followed. E.g. suppose [-3:-1] and [-2:] are both unique and that [x,-3] -> [x,-3,-2]. [-3:-1] -> [-3:] is valid, so [x,-3,-2] -> [x,-3,-2,-1] is valid. Does this case not require looking several steps forward? – eenblam Apr 28 '14 at 17:25
  • My point is, in both cases where you call `random.sample` (`ngram_to_kwords` and `gibbs_sample_data`), the possibility exists of choosing an item that failed previously, allowing it to loop continuously without success. You need to somehow remove the failed attempts from the pool of choices until the loop is finished. – nmclean Apr 28 '14 at 17:51
  • You do have a good point. It doesn't address the posted case where any and every case should succeed, but it's something I'll definitely incorporate. Thanks! EDIT: It *is* getting stuck inside of `gibbs_sample_data`. The `if` statement even evaluates to `True`. The function, for some reason, just isn't returning `word` in this particular case, and it continues looping instead. – eenblam Apr 28 '14 at 17:57
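
The suggestion in the comments above (take failed attempts out of the pool instead of retrying them) can be sketched roughly as follows. This is only an illustration, not the asker's code: `pick_next_word` and `is_acceptable` are hypothetical names, and it uses a plain uniform shuffle rather than the weighted Gibbs-style sampling discussed in the thread.

```python
import random


def pick_next_word(candidates, is_acceptable, rng=random):
    """Return one acceptable candidate, or None if there is none.

    Hypothetical helper: each candidate is tried at most once, so the
    loop always terminates, even when every candidate is rejected.
    """
    pool = list(candidates)   # copy, so the caller's data is untouched
    rng.shuffle(pool)         # random order with no repeats
    for word in pool:
        if is_acceptable(word):
            return word
    return None               # dead end: the caller decides what to do


# Usage with made-up data: only one candidate passes the check.
words = ['foo', 'bar', '"__main__":']
chosen = pick_next_word(words, lambda w: w.startswith('"'))
```

Because a rejected word is never drawn again, the worst case is one pass over the pool, and a `None` result makes a dead end explicit instead of hanging.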

1 Answer

Both '__name__' and '==' can serve as dictionary keys:

>>> d = {'__name__':1, '==':2}
>>> d['__name__']
1
>>> d['==']
2
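
The same holds for the combined string from the question. As a minimal sketch (the `followers` and `by_tuple` tables are made-up illustrations, not the asker's data structure), any hashable object, including a string with spaces and operators or a tuple of tokens, works as a key, while a list does not:

```python
# Hypothetical bigram table: a two-token string maps to its followers.
followers = {'__name__ ==': ['"__main__":']}

key = '__name__ =='
found = key in followers          # ordinary string lookup
next_words = followers[key]

# A tuple of tokens is hashable too, so it is an equally valid key.
by_tuple = {('__name__', '=='): ['"__main__":']}

# A list is not hashable, so using one as a key raises TypeError.
try:
    {['__name__', '==']: 1}
    list_key_failed = False
except TypeError:
    list_key_failed = True
```
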
Mingyu
  • Thanks for the answer! I'll grant that you did, more or less, answer the question. That works, as does "__name__ ==" (the string in question). Unfortunately, you helped me see that I should rephrase the question. If it seems that it really begs a separate post, I'll come back and select this answer as best. – eenblam Apr 28 '14 at 16:58