
The problem is to count how many complete structures can be formed from the given DNA chains, where a complete structure uses every chain exactly once. The rule is that the first letter of each chain must be the same as the last letter of the chain before it.

The first line of input contains an integer n, the number of chains. Each of the next n lines contains one chain (a string).

Example input:

5
ACGA
ACGA
ACAC
CCCC
CTAC

Output: 4
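For small inputs, the expected answer can be cross-checked by brute force. The sketch below is not part of the original post: `count_structures_bruteforce` is a hypothetical helper name, and it assumes that chains are non-empty strings and that identical chains at different positions count as distinct, which is consistent with the expected output of 4 above.

```python
from itertools import permutations


def count_structures_bruteforce(chains):
    """Count orderings of all chains in which every chain starts with
    the last letter of the chain before it (hypothetical helper)."""
    return sum(
        # An ordering is valid when every adjacent pair links up.
        all(prev[-1] == nxt[0] for prev, nxt in zip(order, order[1:]))
        # permutations() treats equal strings at different indices as distinct.
        for order in permutations(chains)
    )


print(count_structures_bruteforce(["ACGA", "ACGA", "ACAC", "CCCC", "CTAC"]))  # prints 4
```

This tries all n! orderings, so it is only practical for a handful of chains, but it gives a reference value to compare a backtracking solution against.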

I tried a recursive backtracking solution, but I am sometimes getting wrong answers. I can't seem to figure out what is wrong.

ans = 0
used = {}

def place(howmanyplaced, allowedletter):
    global ans
    if howmanyplaced == num:
        ans += 1
        return ans

    for i in range(0, len(mylist)):
        if mylist[i][0] == allowedletter and used[i] == False:
            allowedletter = mylist[i][-1]
            used[i] = True
            place(howmanyplaced+1, allowedletter)
            used[i] = False


num = int(input())
mylist = []
for l in range(0, num):
    i = input()
    used[l] = False
    mylist.append(i)

for k in range(0, len(mylist)):
    used[k] = True
    place(1, mylist[k][-1])
    used[k] = False  
print(ans)
Quiti

2 Answers


My major concern with your code is how allowedletter is modified in this loop:

    if mylist[i][0] == allowedletter and used[i] == False:
        allowedletter = mylist[i][-1]
        used[i] = True
        place(howmanyplaced+1, allowedletter)

Since you're chaining sequences via recursion, not iteration, allowedletter should not be modified during this loop. Use a different variable. Below is my rework of your program fixing this issue and rethinking the code style:

def place(how_many_placed=0, allowed_letter=None):

    if how_many_placed == number:
        return 1

    answer = 0

    for i, sequence in enumerate(sequences):
        if not used[i] and (allowed_letter is None or sequence[0] == allowed_letter):
            used[i] = True
            answer += place(how_many_placed + 1, sequence[-1])
            used[i] = False

    return answer

number = int(input())

sequences = []

used = []

for _ in range(number):
    sequences.append(input())
    used.append(False)

print(place())

See if this works any better for you.

cdlane
  • Great answer, thanks. One last question: why is it wrong to edit allowedletter inside the recursion? – Quiti Jan 04 '19 at 10:47
  • @Quiti, you were using `allowedletter` in the `if` statement to determine which sequences were suitable, but then also used it in the body as a temporary to hold the last character of the chosen sequence. By the time the `if` comes around again, we're testing for a different letter! In an iterative approach we'd do this at some level as we chained sequences together. But in your recursive approach, we want to continue filtering on the *same* letter as before, as the recursion will deal with the next link in our chain. – cdlane Jan 04 '19 at 17:51
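
To make the effect of overwriting `allowedletter` concrete, here is a small side-by-side sketch. It is an illustration rather than code from either post: `count_buggy` and `count_fixed` are hypothetical names, the functions take the chains as a list instead of reading input, and the three-chain input was chosen by hand because the two versions disagree on it.

```python
def count_buggy(chains):
    """Mirrors the question's logic: the filter letter is overwritten in the loop."""
    used = [False] * len(chains)

    def place(placed, allowed_letter):
        if placed == len(chains):
            return 1
        total = 0
        for i, chain in enumerate(chains):
            if chain[0] == allowed_letter and not used[i]:
                # Bug: later iterations now filter on this chain's last letter.
                allowed_letter = chain[-1]
                used[i] = True
                total += place(placed + 1, allowed_letter)
                used[i] = False
        return total

    total = 0
    for k, chain in enumerate(chains):
        used[k] = True
        total += place(1, chain[-1])
        used[k] = False
    return total


def count_fixed(chains):
    """Same search, but the next letter is passed down without being stored."""
    used = [False] * len(chains)

    def place(placed, allowed_letter):
        if placed == len(chains):
            return 1
        total = 0
        for i, chain in enumerate(chains):
            if not used[i] and (allowed_letter is None or chain[0] == allowed_letter):
                used[i] = True
                total += place(placed + 1, chain[-1])
                used[i] = False
        return total

    return place(0, None)


chains = ["AA", "AB", "AA"]
print(count_buggy(chains))  # prints 1
print(count_fixed(chains))  # prints 2
```

With the first `AA` placed, the buggy loop tries `AB`, switches its filter to `B`, and then rejects the second `AA`, so the valid ordering `AA, AA, AB` starting from the first chain is never counted.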

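If you can use numpy, here is a vectorized way to count the matching links between neighbouring chains. np.roll shifts the array so that you can compare the first letter of each element to the last letter of the element before it. Note that this only checks the chains in the order given, and it assumes all chains have the same length.

import numpy as np

chains = ['ACGA', 'ACGA', 'ACAC', 'CCCC', 'CTAC']
a = np.array(chains)
width = len(chains[0])  # all chains are assumed to have the same length
# last letter of every chain except the final one
last_letter = a.view('<U1')[width - 1::width][:-1]
# first letter of every chain except the first one
first_letter = np.roll(a, -1)[:-1].view('<U1')[::width]
int(np.sum(last_letter == first_letter))
# returns 4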
dubbbdan