MRJob mapper function raises "IndexError: list index out of range"

Question

I am new to MapReduce with MRJob (and, to be honest, to Python as well). I am trying to use MRJob to count how often each pair of letters from "A" to "E", taken from two different columns of a text file, occurs, e.g. "A", "A" = 10 occurrences, "A", "B" = 13 occurrences, "C", "E" = 6 occurrences, etc. When I run the job I get "IndexError: list index out of range", and for the life of me I can't figure out why.

Below is a sample of the text file that I feed to the Python MapReduce file containing the mapper and reducer functions. Each line holds a date, a time, the duration of a phone call, the customer ID of the person making the call, the customer ID of the person receiving the call, and key words from the conversation. A customer ID begins with a letter from "A" to "E", and that letter designates a country. I break each line down into a list and, in my mapper, pick out the indices I am interested in, but I am not sure whether this approach is correct:

Details
2020-03-05 # 19:28 # 5:10 # A-466 # C-563 # tendremos lindo ahi fuimos derecho carajo junto acabar
2020-03-10 # 05:08 # 5:14 # C-954 # D-353 # carajo calle película acaso voz creía irá san montón ambos hablas empieza estaremos parecía mitad estén vuelto música anoche tendremos tenían dormir habitación encuentra ésa
2020-01-15 # 09:47 # 4:46 # C-413 # B-881 # pudiera dejes querido maestro hacerle llamada paz estados estuviera hablo decirle bonito linda blanco negro querida hacerte dormir empieza mayoría
2020-01-10 # 20:54 # 4:58 # E-027 # A-549 # estuviera tuviste vieja volvió solía alrededor decía maestro estaremos línea sigues
2020-03-17 # 21:38 # 5:21 # C-917 # D-138 # encima música barco tuvimos dejes damas boca
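
As a quick check outside MRJob, the sketch below applies the same translate-and-split step to a shortened copy of the first sample line and prints the index each field ends up at. It assumes every line looks like the sample; the empty string at the end is only an illustration of what a blank line in the file would produce.

```python
# Same substitution table as in the mapper: '#', '-' and ':' all become ';'.
table = {35: 59, 45: 59, 58: 59}

# First sample line, with the key words shortened for readability.
sample = "2020-03-05 # 19:28 # 5:10 # A-466 # C-563 # tendremos lindo ahi"

fields = [column.strip() for column in sample.translate(table).split(";")]

# Show which index each field lands at.
for index, value in enumerate(fields):
    print(index, repr(value))

# Illustration only: a blank line yields a single empty field, so any
# index above 0 would be out of range for it.
blank_fields = [column.strip() for column in "".translate(table).split(";")]
print(len(blank_fields))
```

For a well-formed line the minutes and seconds land at indices 5 and 6, and the two country letters at indices 7 and 9, so those indices only fail on lines that do not follow the sample layout.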

Here is the entire code of the Python file:

from mrjob.job import MRJob

class MRduracion_llamadas(MRJob):
    def mapper(self, _, line):
        """
        Convert the line from the text file into a list. The separator
        characters "#", "-" and ":" are first replaced with ";" so that a
        single split(";") breaks the line into its fields.
        """
        # Translation table: ord("#") = 35, ord("-") = 45, ord(":") = 58,
        # each mapped to ord(";") = 59.
        table = {35: 59, 45: 59, 58: 59}
        llamadas2020_text_line = [column.strip() for column in
                                  line.translate(table).split(";")]
        # Now we can assign values to the key and the value
        print(line)
        pais_emisor = llamadas2020_text_line[7]
        pais_receptor = llamadas2020_text_line[9]
        # If a call is "x" minutes and "y" seconds long, where y > 0, round
        # the duration up to x + 1 minutes.
        minutos = int(llamadas2020_text_line[5])
        if int(llamadas2020_text_line[6]) > 0:
            minutos += 1

        yield (pais_emisor, pais_receptor), minutos

    def reducer(self, key, values):
        # Yield a (key, value) pair; print() returns None, so yielding its
        # result would not emit the total.
        yield key, sum(values)


if __name__ == "__main__":
    MRduracion_llamadas.run()
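
One possible cause, which I have not confirmed, is that the input file contains lines that do not follow the sample layout, such as a blank line at the end of the file or a header line. The sketch below shows a defensive way to parse a line under that assumption. `parse_call` and `TABLE` are hypothetical names introduced for illustration; they are not part of mrjob.

```python
# Hypothetical helper, not part of mrjob: parse one input line and return
# ((caller_country, receiver_country), minutes), or None if the line does
# not have the expected layout.
TABLE = {35: 59, 45: 59, 58: 59}  # '#', '-' and ':' all become ';'


def parse_call(line):
    fields = [column.strip() for column in line.translate(TABLE).split(";")]

    # A well-formed line has at least 10 fields, since indices 5, 6, 7
    # and 9 are read below. Shorter lines (blank, header) are skipped.
    if len(fields) < 10:
        return None

    try:
        minutes = int(fields[5])
        seconds = int(fields[6])
    except ValueError:
        # The duration fields are not numeric, so skip the line.
        return None

    # Round up to the next full minute when there are leftover seconds.
    if seconds > 0:
        minutes += 1

    return (fields[7], fields[9]), minutes


print(parse_call("2020-03-05 # 19:28 # 5:10 # A-466 # C-563 # tendremos"))
print(parse_call(""))
```

Inside the mapper, the result could be checked with `parsed = parse_call(line)` and yielded only when `parsed is not None`, so malformed lines are dropped instead of raising an exception.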
