
I'm running the following Python code in MapReduce:

from mrjob.job import MRJob
import collections

bigram = collections.defaultdict(float)
unigram = collections.defaultdict(float)


class MRWordFreqCount(MRJob):

    def mapper(self, _, line):
        # Now we loop over lines in the system input
        line = line.strip().split()
        # go through each word in sentence
        i = 0
        for word in line:
            if i > 0:
                hist = word
            else:
                hist = ''

            word = CleanWord(word)  # Get the new word

            # If CleanWord didn't return a string, move on
            if word == None: continue

            i += 1
            yield word.lower(), hist.lower(), 1.0

if __name__ == '__main__':
    MRWordFreqCount.run()

I get the error `ValueError: too many values to unpack (expected 2)`, but I can't figure out why. Any suggestions? The command line I'm running is: python myjob.py Test.txt --mapper

John Vandenberg
Reddspark
  • You are returning 3 values from `mapper`, whereas it seems you can only return 2. – Eli Sadoff Dec 03 '16 at 18:53
  • Thanks. Yes, you are right: mrjob's mapper function only yields a key, value pair as output. https://pythonhosted.org/mrjob/guides/concepts.html#mapreduce-and-apache-hadoop – Reddspark Dec 04 '16 at 16:32

1 Answer


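In a MapReduce job, each record a mapper emits must be a single (key, value) pair. Yielding three items is what triggers the unpacking error. To emit both the word and its history, pack them into a tuple and use that tuple as the key: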

yield (word.lower(), hist.lower()), 1.0
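To see why the tuple key works, here is a minimal sketch in plain Python that does not depend on mrjob. The names `bigram_mapper` and `run_mapper` are hypothetical, `run_mapper` only imitates how a framework unpacks each yielded item into a key and a value, and the sketch assumes `hist` is meant to hold the previous word on the line. The `CleanWord` step from the question is left out.

```python
from collections import Counter


def bigram_mapper(line):
    """Yield ((word, previous_word), 1.0) pairs for one line of text."""
    hist = ''  # assumption: the first word on a line has an empty history
    for word in line.strip().split():
        word = word.lower()
        # The key is one tuple, so every yielded item has exactly two parts.
        yield (word, hist), 1.0
        hist = word


def run_mapper(lines):
    """Hypothetical stand-in for the framework: unpack items as key, value."""
    counts = Counter()
    for line in lines:
        # Yielding word, hist, 1.0 instead would raise ValueError on this line.
        for key, value in bigram_mapper(line):
            counts[key] += value
    return counts


counts = run_mapper(["the cat sat", "the cat ran"])
print(counts[("cat", "the")])
```

A reducer then receives the same `(word, hist)` tuple as its key and can sum the values for each pair.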
ugur