0

I am working on a problem in which I need to merge two string such that one string can be inside other one. The merged string should be of shortest length.

Example:

str1 = "AABAK"
str2 = "HYUAABA"
merged string = "HYUAABAK"

So far I was able to make it work for string which are ending in similar substrings but if they are other way round, my solution isn't working.

Failing for:

str1 = 'ctaagt'
str2 = 'gcta'
expected answer: gctaagt

Solution:

def overlap(str1, str2):
    l = min(len(str1), len(str2))
    for i in range(l, -1, -1):
        if str1.endswith(str2[-i:]):
            print('overlap ' + str2[-i:])
            return str2[-i:]


str1 = "AABAK"
str2 = "HYUAABA"

for i in range(len(str1), -1, -1):
    res = overlap(str1[0:i], str2)
    if(res):
        print('merge ' + str2+str1[i:])
        break

I also want to know if there is any better or cleaner approach to solve this.

Note: str1 is on purpose shorter for my testing purpose.

Aurora Wang
  • 1,822
  • 14
  • 22
wayfare
  • 1,802
  • 5
  • 20
  • 37

3 Answers3

1

You can modify overlap to return the final overlapped string. Then, you can consider both possible arrangements of str1 and str2 and choose the final result with min for which key is the length of the string:

def overlap(str1, str2): # best possible overlap where str1 is 1st and str2 is 2nd
    for i in range(len(str1)):
        if str2.startswith(str1[i:]):
            return str1[:i] + str2
    return str1 + str2

str1 = 'ctaagt'
str2 = 'gcta'

result = min(overlap(str1, str2), overlap(str2, str1), key=len)
print(result) # gctaagt
slider
  • 12,810
  • 1
  • 26
  • 42
0

Using tobias_k's solution from a previous post as reference you can do something like the following. And then just compare the output from a,b and b,a to see which resulting string is shorter.

from functools import reduce

a = "AABAK"
b = "HYUAABA"


def overlap(a, b):
    return max(i for i in range(len(b)) if b[i - 1] == a[-1] and a.endswith(b[:i]))


reduce(lambda a, b: a + b[overlap(a, b):], [b, a])
The Pineapple
  • 567
  • 4
  • 16
0

You can calculate your overlap and do your string merge in the same loops quite simply:

def merge(str1, str2):
    str2_len = len(str2)
    for i in range(str2_len):
        # edited to only match correctly
        if str1.startswith(str2[i:]):
            return str2 + str1[str2_len-i:]
    return str2 + str1

>>> str1 = "AABAK"

>>> str2 = "HYUAABA"

>>> merge(str1, str2)

'HYUAABAK'


>>> str1 = 'ctaagt'

>>> str2 = 'gcta'

>>> merge(str1, str2)

'gctaagt'

Community
  • 1
  • 1
Aurora Wang
  • 1,822
  • 14
  • 22
  • 1
    The problem here is that if `str1 = 'gcta'` and `str2 = 'ctaagt'` this will give the wrong result. – slider Nov 23 '18 at 23:23
  • Sorry, I was in a hurry yesterday! One thing that I got wrong was that instead of checking only if the `str2` substring is in `str1`, it should be checking if the substring is in the start of `str1`. I'll edit the answer to include this. – Aurora Wang Nov 24 '18 at 22:05
  • Also, I assume that you want to merge the two strings even if they don't overlap, what might not be wat you want to. You can change this by modifying the return. – Aurora Wang Nov 24 '18 at 22:08