I built a custom function to compare two URLs and get their longest common subsequence (LCS).

    def lcs_dynamic(url1, url2):
        # maths: compare url1 with url2
        return lcs

I have a series s1 and a series s2, each holding about 13,000 URLs. I want to compare every element of s1 with every element of s2 (169,000,000 comparisons).
I did it with two nested for-loops, but it's way too slow.

    for index1, url1 in s1.items():
        for index2, url2 in s2.items():
            if index1 != index2:
                lcs1 = lcs_dynamic(url1, url2)  # usage of my custom function
                overlap = lcs1 / len(url2)
                print(index1, index2, url1, url2, overlap)

Is there a better way to do it?
I thought about the apply() method, but I couldn't figure out how to get access to the second series and the second URL, since my custom function lcs_dynamic needs both URLs as arguments:

    s1.apply(lcs_dynamic(url1, url2))

In this case I would get url1 from s1, but I don't know how to get access to s2 and url2.
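For what it's worth, the apply() idea can be spelled out by letting the inner function close over the second series. The following is only a minimal sketch under assumptions: `lcs_length` is a hypothetical stand-in for lcs_dynamic (whose body isn't shown above) that returns the LCS length as an integer, and `overlap_matrix` is an illustrative name, not an existing pandas function. It still makes one Python-level call per pair, so on its own it tidies the code rather than removing the 169,000,000 calls.

```python
import pandas as pd


def lcs_length(a: str, b: str) -> int:
    """Hypothetical stand-in for lcs_dynamic: length of the longest
    common subsequence of a and b, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def overlap_matrix(s1: pd.Series, s2: pd.Series) -> pd.DataFrame:
    """Overlap of every url1 in s1 against every url2 in s2.

    Rows follow s1's index, columns follow s2's index. The inner lambda
    sees url1 from the outer apply, which is how both URLs reach the
    function. Assumes no URL in s2 is an empty string.
    """
    return s1.apply(
        lambda url1: s2.apply(lambda url2: lcs_length(url1, url2) / len(url2))
    )


if __name__ == "__main__":
    # Tiny made-up series, just to show the shape of the result.
    s1 = pd.Series(["abc", "abd"])
    s2 = pd.Series(["abc", "xbd"])
    print(overlap_matrix(s1, s2))
```

Unlike the nested loops above, this sketch does not skip pairs where index1 == index2; those entries sit on the diagonal of the result and can be ignored or masked afterwards if needed.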
Thanks in advance!