-1

I was solving a question today on Leetcode named Lexicographically Smallest Equivalent String (link) and I came up with a solution of DFS with some customization to solve the problem and here's is my code.

#!/usr/bin/python3

from collections import defaultdict, OrderedDict

class Solution:
    def DFSMap(self, node, map, adj, visited):
        if node not in visited:
            visited.add(node)
            for i in adj[node]:
                map[node]=min(map[node] or "z", node, self.DFSMap(i, map, adj, visited))
            return map[node]
        else:
            return map[node] or node

    def smallestEquivalentString(self, s1: str, s2: str, baseStr: str) -> str:
        referenceMap = defaultdict(set)
        d=OrderedDict()
        n=len(s1)
        for i in range(n):
            referenceMap[s2[i]].add(s1[i])
            referenceMap[s1[i]].add(s2[i])
        for j in sorted(referenceMap.items(),key=lambda x:x[0]):
            d[j[0]]=j[1]

        m=defaultdict(lambda:None)
        visited=set()
        for i in d:
            if i not in visited:
                self.DFSMap(i,m,d,visited)

        res=""
        print(m)
        for i in baseStr:
            res+=(m[i] or i)
        return res


if __name__ == "__main__":
    s1=input()
    s2=input()
    baseStr=input()
    s=Solution().smallestEquivalentString(s1, s2, baseStr)
    print(s)

The issue I'm facing is, for every run of the same code, I get a different output of hashmap without usage of random functionalities in my code. Here are few snippets of the outputs.

On other run I get,

Is it the issue with recursion or any logical error?

Could someone help me figure out the reason for this with an explanation?

The output for the input s1 = "leetcode", s2 = "programs", baseStr = "sourcecode" was supposed to be "aauaaaaada" but that isn't happening certainly and consistently.

Kiran Deep
  • 83
  • 7
  • 2
    Could you add an explanation of what this is supposed to do? We don't have LeetCode problems memorized. – Barmar Jan 14 '23 at 18:38
  • Where the code says `self.DFSMap(i,m,d,visited)`, what do you expect will happen to the return value? Does that match how you want it to work? – Karl Knechtel Jan 14 '23 at 18:38
  • Don't use `map` as a variable name, it's the name of a built-in function. But this isn't related to the problem. – Barmar Jan 14 '23 at 18:41
  • 2
    Just a guess, but I suspect it has something to do with using `set` instead of `list` to hold the values in `referenceMap`, since the order of values in a set is not specified. – Barmar Jan 14 '23 at 18:44
  • I think Python has hash randomziation for strings enabled by default, since it helps make some kinds of hash-collision based DoS attacks more difficult. Thus strings will hash differently each time a program is run. – Blckknght Jan 14 '23 at 18:52
  • @KarlKnechtel I have put `self.DFSMap(i,m,d,visited)` to store the equivalent characters of the strings in a hashmap and the purpose of storing is fulfilled because of reference type to the dictionary `m`, but the output is uncertain. – Kiran Deep Jan 14 '23 at 19:08
  • Okay, but why does it return a value, if that value will be ignored? – Karl Knechtel Jan 15 '23 at 01:28

1 Answers1

1

This is caused by the non-determined order in which a set is iterated. And this impacts in which order for i in adj[node] is iterating.

For the example input you provided, the following edges are defined that relate to the letter "a" (which is the first letter for which the DFS search is performed):

            r ─── c 
            │
a ─── o ─── e 
            │
            s

The adjacency set for "a" has only one entry ("o") so no surprises there, but "o" has two neighbors, and "e" three, ...etc. The order affects the result.

Let's assume this order (adding assignments to map keys as they occur):

 DFSMap("a")
   DFSMap("o")
     DFSMap("a")
   map["o"] = "a"
     DFSMap("e")
       DFSMap("r")
         DFSMap("e")
       map["r"] = "e"
         DFSMap("c")
           DFSMap("r")
         map["c"] = "c"
       map["r"] = "c"
     map["e"] = "c"
       DFSMap("s")
         DFSMap("e")
       map["s"] = "c"
     map["e"] = "c"
       DFSMap("o")
     map["e"] = "a"
   map["o"] = "a"
 map["a"] = "a"

Note how not all "nodes" in this connected component get associated with the value "a". In this particular traversal, the letters "c", "r" and "s" remain associated with "c", instead of "a". In another order of traversal the anomalies could be elsewhere.

The random results illustrate that the algorithm is not correct. Typically when the lexical minimum letter is visited first before lots of other connected nodes, the results will incorrect, as this least letter is not propagated to nodes that are further away.

Here is a correction of your dfs-code. It takes an extra least argument to allow for that propagation:

    def DFSMap(self, node, mp, adj, visited, least="z"):
        least = min(least, mp[node] or node)
        if node not in visited:
            visited.add(node)
            for i in adj[node]:
                least = self.DFSMap(i, mp, adj, visited, least)
        mp[node] = least
        return least

NB: renamed map to mp, as the first is already used by Python for a native function.

NB: This algorithm relies on the fact that the main loop in smallestEquivalentString visits the letters in lexical order (which is the case), so that the first visited letter in a connected component is also the least in that component.

trincot
  • 317,000
  • 35
  • 244
  • 286