Traceback (most recent call last): IndexError: list index out of range

from sys import argv, exit
import csv
import sys
    
def main ():
    if len(sys.argv) < 3:
        print("Usage: python dna.py data.csv sequence.txt")
        exit(1)
    # open csv_file contain the database 
    with open(sys.argv[1]) as csvfile:
        r = csv.reader(csvfile)
        # open txt_file contain the DNA sequence 
        with open(sys.argv[2]) as txtfile:
            x = txtfile.read()
            y = [v for v in [max_substring(x, seq) for seq in next(r)[1:]] if v >= 0]
        print_match(r, y)

# count the maximum substring           
def max_substring(x, sub_str):
    if not x and not sub_str:
        return -1
    sub_repeat = len(x)*[0]
    for s in range(len(x) - len(sub_str),  -1, -1): 
        a = s + len(sub_str)
        if x[s: a] == sub_str:
            b = len(x) - 1
            c = 1 + sub_repeat[a - 1]
            sub_repeat[s] = 1 if  a > b else c
    return max(sub_repeat)

# Make some assertions
assert max_substring('abcabbcab', 'ab') == 1
assert max_substring('', '') == -1
assert max_substring('abcabbcab', 'abcabbcab') == 1
    
# print the result
def print_match(r,y):
    for line in r:
        if [int(val) for val in line[1:]] == y:
            print(line[0]) 
            return
    print("No match")
# run the functions & class
if __name__ == "__main__" :
    main()

Traceback (most recent call last): IndexError: list index out of range

[new output error](https://i.stack.imgur.com/lKKD8.png)

  • (1) Put the code here, don't put the image. (2) Try printing the items in the loop at `max_substring` and go through them, you will get why the index is out of range. – Rahul Bharadwaj Aug 31 '20 at 04:56
  • @SachinDalvi please put the code *inline* for us to read. [Edit](https://stackoverflow.com/posts/63664960/edit) the OP – M Z Aug 31 '20 at 05:02
  • Just posting part of the error message and the code isn't good enough. What line does the error occur on? What did you expect? What have you tried? – Grismar Aug 31 '20 at 05:09

1 Answer

Your error can happen in two cases.

Empty Inputs

Just try calling: max_substring('', '') to reproduce the error (i.e. both parameters to the function are the empty string).

In that case:

  • len(x) and len(sub_str) are both 0; therefore, len(x) - len(sub_str) is 0, and sub_repeat (built as len(x)*[0]) ends up being an empty list.
  • list(range(0, -1, -1)) is just [0]
  • Therefore, in the expression a = s + len(sub_str), s is 0, and so is a.
  • x[s: a], being x[0:0], is the empty string. Therefore x[s: a] == sub_str is True, and the if statement is entered.
  • In the expression c = 1 + sub_repeat[a], since a is 0, it is trying to access the first element of sub_repeat. But because sub_repeat is an empty list, this results in the IndexError.
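
The walkthrough above can be reproduced with a short sketch. The function below is a reconstruction of the pre-fix version, assuming it used sub_repeat = len(x)*[0] together with c = 1 + sub_repeat[a] (the latter appears in the traceback quoted in the comments). The names max_substring_original and raises_index_error are made up for illustration.

```python
def max_substring_original(x, sub_str):
    # Assumed reconstruction of the version that raised the error.
    sub_repeat = len(x) * [0]
    for s in range(len(x) - len(sub_str), -1, -1):
        a = s + len(sub_str)
        if x[s:a] == sub_str:
            b = len(x) - 1
            c = 1 + sub_repeat[a]  # a can equal len(x), one past the end
            sub_repeat[s] = 1 if a > b else c
    return max(sub_repeat)


def raises_index_error(x, sub_str):
    # Hypothetical helper: report whether the call fails with IndexError.
    try:
        max_substring_original(x, sub_str)
    except IndexError:
        return True
    return False


print(raises_index_error('', ''))  # True
```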

Non-Empty Input

  • Repeat this for non-empty parameter values, and you will see that when the substring occurs at the very end, sub_repeat[a] doesn't exist: a equals len(x), but sub_repeat has only len(x) elements, so its last valid index is a - 1. I don't know if you want to change it to c = 1 + sub_repeat[a - 1] or sub_repeat = (1 + len(x))*[0], but both of those would get rid of the error.
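
To make the off-by-one concrete, here is the index arithmetic for a match at the very end, using a made-up input 'xab' / 'ab'. It only illustrates the bounds and is not the full function.

```python
x, sub_str = 'xab', 'ab'          # made-up input with the match at the end
sub_repeat = len(x) * [0]         # length 3, valid indices 0..2

s = len(x) - len(sub_str)         # 1, the first value the loop tries
a = s + len(sub_str)              # 3, which equals len(x)

print(x[s:a] == sub_str)          # True: the trailing match is found
print(a, len(sub_repeat))         # 3 3, so sub_repeat[a] is out of range

padded = (1 + len(x)) * [0]       # the padded alternative does have index 3
print(padded[a])                  # 0
```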

Possible Solution

To avoid the error, you can add a check for the case where x and sub_str are both empty and return the expected value for that edge case. You also need to avoid the second edge case above, by changing to either c = 1 + sub_repeat[a - 1] or sub_repeat = (1 + len(x))*[0]. I'm not sure which one is appropriate without knowing the purpose of this function.

    def max_substring(x, sub_str):
        # Edge case: both inputs empty, signalled with -1
        if not x and not sub_str:
            return -1
        sub_repeat = len(x)*[0]
        # Scan the start positions from right to left
        for s in range(len(x) - len(sub_str), -1, -1):
            a = s + len(sub_str)
            if x[s: a] == sub_str:
                b = len(x) - 1
                c = 1 + sub_repeat[a - 1]
                sub_repeat[s] = 1 if a > b else c
        return max(sub_repeat)

    # Make some assertions
    assert max_substring('abcabbcab', 'ab') == 1
    assert max_substring('', '') == -1
    assert max_substring('abcabbcab', 'abcabbcab') == 1
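
The two candidate fixes are not interchangeable, which a side-by-side sketch may help show. It assumes, without confirmation from the question, that the function is meant to return the longest run of back-to-back repeats of sub_str. The names max_substring_shifted and max_substring_padded are hypothetical.

```python
def max_substring_shifted(x, sub_str):
    # Variant 1: keep len(x) slots and read sub_repeat[a - 1].
    if not x and not sub_str:
        return -1
    sub_repeat = len(x) * [0]
    for s in range(len(x) - len(sub_str), -1, -1):
        a = s + len(sub_str)
        if x[s:a] == sub_str:
            sub_repeat[s] = 1 if a > len(x) - 1 else 1 + sub_repeat[a - 1]
    return max(sub_repeat)


def max_substring_padded(x, sub_str):
    # Variant 2: add one extra slot so sub_repeat[a] is always valid.
    if not x and not sub_str:
        return -1
    sub_repeat = (1 + len(x)) * [0]
    for s in range(len(x) - len(sub_str), -1, -1):
        a = s + len(sub_str)
        if x[s:a] == sub_str:
            # sub_repeat[a] is the run length starting right after this match.
            sub_repeat[s] = 1 + sub_repeat[a]
    return max(sub_repeat)


print(max_substring_shifted('abab', 'ab'))  # 1
print(max_substring_padded('abab', 'ab'))   # 2
```

Under that assumption, only the padded variant counts the two consecutive copies of 'ab'. Both variants keep the -1 return for the empty-input case.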

The calling code will need to handle this new return value and ignore those cases where both inputs were empty:

y = [v for v in [max_substring(x, seq) for seq in next(r)[1:]] if v >= 0]

Note that r is a one-shot iterator. The next(r) call in the list comprehension consumes only the header row, so print_match(r, y) still receives the data rows, but once they have been read, r cannot be iterated over a second time.
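
Because csv.reader returns a one-shot iterator, it can help to check how much of r is left at each step. A minimal sketch with made-up data (the column names and rows are assumptions, not the asker's files):

```python
import csv
import io

# Made-up CSV contents standing in for the database file.
sample = "name,AGAT,AATG\nAlice,2,8\nBob,4,1\n"
r = csv.reader(io.StringIO(sample))

header = next(r)        # consumes only the first row
first_pass = list(r)    # the remaining data rows
second_pass = list(r)   # nothing left: the reader is exhausted

print(header)       # ['name', 'AGAT', 'AATG']
print(first_pass)   # [['Alice', '2', '8'], ['Bob', '4', '1']]
print(second_pass)  # []
```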

ELinda
  • hello, I tried changing def max_substring(x, sub_str): to def max_substring('x', 'sub_str'): but still error SyntaxError: invalid syntax – Sachin Dalvi Aug 31 '20 at 07:25
  • That is a syntax error because parameters in a signature must be valid identifiers. Please see the code I added above, if that helps. – ELinda Aug 31 '20 at 19:57
  • By reproducing the error, I meant to *invoke* the function with empty strings, not to define it with hard-coded strings in the parameter list. You would need to add a separate line for the test mentioned, containing only `max_substring('', '')` – ELinda Aug 31 '20 at 19:59
  • AGAIN SAME ERROR OUTPUT. Traceback (most recent call last): File "dna.py", line 38, in main() File "dna.py", line 15, in main y = [v for v in [max_substring(x, seq) for seq in next(r)[1:]] if v >= 0] File "dna.py", line 15, in y = [v for v in [max_substring(x, seq) for seq in next(r)[1:]] if v >= 0] File "dna.py", line 26, in max_substring c = 1 + sub_repeat[a] IndexError: list index out of range – Sachin Dalvi Sep 01 '20 at 08:15
  • Your output is cut off in your comment (missing the full stack trace). Did you update *both* functions? If it still does not work, then update your original post to contain more details, such as updated code, and the input that caused the error. You can add print statements to your functions, or a try-catch, to easily capture the input causing the problem – ELinda Sep 01 '20 at 17:24
  • just check if I have done correctly I have updated the code – Sachin Dalvi Sep 02 '20 at 10:31
  • Please see update -- there is another edge case. However, I'm not sure how you want to handle it. You will need to explain the purpose of your function. – ELinda Sep 02 '20 at 21:01