I have the following dataframe:

userID  movieID rating  timestamp
1   1   9   12
1   2   10  13

I saved this data as mapper1.txt in the same directory as the following Python file:

from mrjob.job import MRJob

class MRRatingCounter(MRJob):
    def mapper(self, key, line):
        (userID, movieID, rating, timestamp) = line.split('\t')
        yield rating, 1

    def reducer(self, rating, occurrences):
        yield rating, sum(occurrences)

if __name__ == '__main__':
    MRRatingCounter.run()

Now I would like to run this job with the following command:

!python Rating-Counter.py mapreduce\mapper1.txt

This however throws the following error:

!python Rating-Counter.py mapreduce\mapper1.txt
no configs found; falling back on auto-configuration
no configs found; falling back on auto-configuration
Traceback (most recent call last):
  File "Rating-Counter.py", line 12, in <module>
    MRRatingCounter.run()
  File "C:\Users\Marc\AppData\Local\Enthought\Canopy\User\lib\site-packages\mrjob\job.py", line 461, in run
    mr_job.execute()
  File "C:\Users\Marc\AppData\Local\Enthought\Canopy\User\lib\site-packages\mrjob\job.py", line 479, in execute
    super(MRJob, self).execute()
  File "C:\Users\Marc\AppData\Local\Enthought\Canopy\User\lib\site-packages\mrjob\launch.py", line 153, in execute
    self.run_job()
  File "C:\Users\Marc\AppData\Local\Enthought\Canopy\User\lib\site-packages\mrjob\launch.py", line 216, in run_job
    runner.run()
  File "C:\Users\Marc\AppData\Local\Enthought\Canopy\User\lib\site-packages\mrjob\runner.py", line 470, in run
    self._run()
  File "C:\Users\Marc\AppData\Local\Enthought\Canopy\User\lib\site-packages\mrjob\sim.py", line 164, in _run
    _error_on_bad_paths(self.fs, self._input_paths)
  File "C:\Users\Marc\AppData\Local\Enthought\Canopy\User\lib\site-packages\mrjob\sim.py", line 549, in _error_on_bad_paths
    "None found in %s" % paths)
ValueError: At least one valid path is required. None found in ['mapreduce\\mapper1.txt']

However, I do not understand what is going wrong. Could anybody explain why this code is not working?

John Vandenberg
Frits Verstraten
  • Are you using or depending on anything like mrjob.conf? If so, try appending `--conf-path mrjob.conf` when you call your script. – Raja G Jun 09 '16 at 13:20
  • Are you sure the \ in the file path shouldn't be a / slash? The error is `At least one valid path is required. None found in ['mapreduce\\mapper1.txt']`. – Binary Nerd Jun 09 '16 at 13:53

1 Answer

The input file content should be tab-separated:

| userID | movieID | rating | timestamp |
|--------|---------|--------|-----------|
| 1      | 1       | 9      | 12        |
| 1      | 2       | 10     | 13        |
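
If it is unclear whether the file really contains tabs (a copy-paste often turns them into spaces), one option is to regenerate it. This is only a sketch, assuming Python 3 and the two sample rows from the question; `write_tsv` is a hypothetical helper name, not part of mrjob.

```python
import csv

# Sample rows taken from the question; the first tuple is the header.
rows = [
    ("userID", "movieID", "rating", "timestamp"),
    (1, 1, 9, 12),
    (1, 2, 10, 13),
]

def write_tsv(path, rows):
    """Write rows to path with a single tab between fields."""
    # newline="" lets the csv module control line endings (Python 3).
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)

write_tsv("mapper1.txt", rows)
```

With the file written this way, `line.split('\t')` in the mapper yields exactly four fields per line.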

And the command to run is:

!python Rating-Counter.py mapper1.txt

Make sure that both the .py file and the .txt file are in the same folder, and that this folder is the current directory.
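
The `ValueError` in the question means the path given on the command line did not resolve to a file from the current working directory. A quick way to check this before running the job is sketched below; `resolve_input` is a hypothetical helper, not part of mrjob.

```python
import os

def resolve_input(path):
    """Return the absolute form of path, or raise IOError if no such file exists."""
    # Relative paths are resolved against the current working directory,
    # not against the directory that contains the script.
    full = os.path.abspath(path)
    if not os.path.isfile(full):
        raise IOError("%s not found (current directory: %s)"
                      % (full, os.getcwd()))
    return full

print("Current directory:", os.getcwd())
```

If `resolve_input('mapreduce\\mapper1.txt')` raises while `resolve_input('mapper1.txt')` succeeds, the shell is probably already inside the `mapreduce` folder and the extra prefix should be dropped.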

My output for this:

C:\>python mrjobfirst.py
No configs found; falling
Creating temp directory c
0170424.081057.565000
Running step 1 of 1...
Streaming final output fr
mm.20170424.081057.565000
"10"    1
"9"     1
"rating"        1
Removing temp directory c
0170424.081057.565000...
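
Note that the output above also counts the header row (`"rating"  1`). A minimal sketch of how the mapper logic could skip it follows, written as plain functions so it runs without mrjob; `map_line` and `count_ratings` are hypothetical names, and the header check assumes the first column is literally `userID`.

```python
from collections import Counter

def map_line(line):
    """Yield (rating, 1) for a data row; yield nothing for the header or malformed rows."""
    fields = line.rstrip('\r\n').split('\t')
    if len(fields) != 4 or fields[0] == 'userID':
        return
    user_id, movie_id, rating, timestamp = fields
    yield rating, 1

def count_ratings(lines):
    """Stand-in for the reducer: sum the 1s emitted for each rating."""
    counts = Counter()
    for line in lines:
        for rating, one in map_line(line):
            counts[rating] += one
    return dict(counts)

sample = ["userID\tmovieID\trating\ttimestamp", "1\t1\t9\t12", "1\t2\t10\t13"]
print(count_ratings(sample))
```

In the `MRRatingCounter` class, the same check would go at the top of `mapper`, before the `yield`.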
rony36