
*** UPDATE *** The additional letter "A" was causing confusion, so I'm rephrasing the question here; hopefully it is clearer now. I have to replace values in a list using a dictionary whose values are lists of variable length. For example:

variants = {"C": ["cat", "can", "car"], "D": ["do", "die"], "Z": ["zen", "zoo"]}

And a list:

Letters = ["C", "D", "Z"]

I want a list output like this

PotentialWords = [["cat", "do", "zen"], ["can", "die", "zoo"], ["car", "do", "zen"], ["car", "die", "zoo"]]

where all the elements get updated at each step, but if the index runs past the end of a shorter list, the updates made so far are preserved and we get all the variants crossed with each other.

What I have so far is:

max_len = max([len(words) for words in variants.values()])

OutputWords = []
for i in range(max_len):
    var = []
    for let in Letters:
        if let not in variants:
            var.append(let)
        else:
            if i < len(variants[let]):
                var.append(variants[let][i])
            elif i > len(variants[let]):
                var.append(let)
    OutputWords.append(var)

Which gives the erroneous output:

OutputWords = [["cat", "do", "zen"], ["can", "die", "zoo"], ["car"]]

All your kind help will be deeply appreciated :)

*** UPDATE *** This question has been updated to make it clearer, thanks to the commenters. The previous input was

Letters = ["C", "D", "Z", "A"]

And output

[["cat", "do", "zen", "A"], ["can", "die", "zoo","A"],["car", "A"]]

** Please look only at the updated input/output at the top of the question.

  • Can you further explain the formula for getting the output given the inputs? Why is `"A"` in there at all? Why is the output length 4? Why isn't the 3rd one `["car", "D", "Z", "A"]`? – blueteeth Dec 08 '19 at 09:52
  • Thank you for your response! The "A" is required because some letters would have variants and some not. The ones that don't, will need to be retained as such. The 3rd one is not `["car", "D", "Z", "A"]`, because of the nested `elif` loop. If I could just do `else: var.append(let)` on the last two lines, then we would get that output. But that's still not the desired one. :( – user3116297 Dec 08 '19 at 09:56
  • The purpose of this code is unclear. Why would you add the letter to the dictionary? – egur Dec 08 '19 at 09:59
  • In addition to @blueteeth's comment: You want to select the i-th element of the lists given in the dict. With `i=2`, that selects `car` from the first list and, since neither of the other lists contains that many elements, 'overflows' on the other lists. I think I got that so far. However, I do not see how `A` gets selected from the letters list. From my point of view, with `i=2` you should select `Z` from this list. – albert Dec 08 '19 at 10:05
  • I have just added an edit to the code. Would it be please possible to look at it now? I apologize for the earlier confusion. I have deleted the A, will handle it separately. Thanks so much for all the clarification and understanding. Yes, it's the "overflow" that's the problem. – user3116297 Dec 08 '19 at 10:06
  • It seems like in your example for correct output, for the first list you go thru it till the end then give the final value from then on, while for others you loop. Is that intentional? – jeremy_rutman Dec 08 '19 at 10:13
  • Or is it that you want to loop all the lists until none of them have any elements remaining? – blueteeth Dec 08 '19 at 10:19
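
One way to handle letters that have no variants, such as the "A" from the earlier version of the question, is to fall back to a one-item list that holds the letter itself. This is only a sketch, assuming that "retained as such" means the letter should appear unchanged in every output row; the name `options` is made up for illustration.

```python
variants = {"C": ["cat", "can", "car"], "D": ["do", "die"], "Z": ["zen", "zoo"]}
letters = ["C", "D", "Z", "A"]  # "A" has no entry in variants

# Assumption: a letter without variants stands in for itself,
# so it is treated as a list with exactly one choice.
options = [variants.get(letter, [letter]) for letter in letters]
print(options)
# [['cat', 'can', 'car'], ['do', 'die'], ['zen', 'zoo'], ['A']]
```

With every letter mapped to a list, the approaches in the answers can treat all letters uniformly.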

3 Answers

0   1   2   3
a0  a1  a2
b0  b1  b2  b3
c0  c1

Say you have these lists, and the letters are [a, b, c]. I think what you're trying to do is pad everything out to the same length as the longest list by wrapping around, and then zip everything.

So it would become this, and then you could read off the columns

0   1   2   3
a0  a1  a2  a0
b0  b1  b2  b3
c0  c1  c0  c1

If that's what you want to do, does this solve it?

max_len = max([len(words) for words in variants.values()])
results = []
for i in range(max_len):
    var = []
    for let in Letters:
        # Wrap around when i runs past the end of a shorter list
        index = i % len(variants[let])
        var.append(variants[let][index])
    results.append(var)
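
The wrap-around indexing can be checked against the data from the question. This is a minimal sketch; the helper name `cycle_columns` is hypothetical and not part of the answer, and it assumes every letter has an entry in `variants`.

```python
def cycle_columns(variants, letters):
    """Pad each list by wrapping around (modulo), then read off the columns."""
    max_len = max(len(variants[let]) for let in letters)
    return [
        [variants[let][i % len(variants[let])] for let in letters]
        for i in range(max_len)
    ]


variants = {"C": ["cat", "can", "car"], "D": ["do", "die"], "Z": ["zen", "zoo"]}
letters = ["C", "D", "Z"]

rows = cycle_columns(variants, letters)
for row in rows:
    print(row)
# ['cat', 'do', 'zen']
# ['can', 'die', 'zoo']
# ['car', 'do', 'zen']
```

The number of rows equals the length of the longest list, so this yields three rows here.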
blueteeth
  • Thank you so much for this excellent way of representation. Please allow me to test it against my data and return to you. – user3116297 Dec 08 '19 at 11:20
0

This is where the power of functional programming comes into play. I've used three well-known functions: cycle and islice from itertools, and the built-in zip:

from itertools import cycle, islice
variants = {"C":["cat", "can", "car"], "D":["do","die"], "Z":["zen", "zoo"]}
letters = ["C", "D", "Z"]

def take_first_n_items(iterable, n):
    # Use the n argument rather than the global max_len
    return list(islice(iterable, n))

max_len = max([len(words) for words in variants.values()])
phrase_iterables = [cycle(variants[letter]) for letter in letters]
full_values_of_variants = [take_first_n_items(iterable, max_len) for iterable in phrase_iterables]
results = list(zip(*full_values_of_variants))
print(results)

Here is how it works:

#phrase_iterables = [('cat', 'can', 'car','cat',...), ('do', 'die', 'do', ...) , ('zen', 'zoo', 'zen', 'zoo',...)]
#full_values_of_variants = [['cat', 'can', 'car'], ['do', 'die', 'do'], ['zen', 'zoo', 'zen']]

Output: [('cat', 'do', 'zen'), ('can', 'die', 'zoo'), ('car', 'do', 'zen')]

Note: list(islice(iterable, max_len)) builds an intermediate list, which is not necessary here because zip accepts any iterable. You can use islice(iterable, max_len) instead.
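
The lazy variant mentioned in the note could look like the sketch below. It assumes every letter has an entry in `variants`; the name `lazy_columns` is made up for illustration.

```python
from itertools import cycle, islice

variants = {"C": ["cat", "can", "car"], "D": ["do", "die"], "Z": ["zen", "zoo"]}
letters = ["C", "D", "Z"]

max_len = max(len(variants[letter]) for letter in letters)

# Each islice is lazy: nothing is materialized until zip pulls items from it.
lazy_columns = [islice(cycle(variants[letter]), max_len) for letter in letters]
results = list(zip(*lazy_columns))
print(results)
# [('cat', 'do', 'zen'), ('can', 'die', 'zoo'), ('car', 'do', 'zen')]
```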

mathfux
  • Thank you so much for introducing this new method. The desired output would be `[('cat', 'do', 'zen'), ('can', 'die', 'zoo'), ('car', 'do', 'zen'), ('car', 'die', 'zoo')]`. Is there a reason why the last value is missed? I will just check against my data, work through your code and return to you. – user3116297 Dec 08 '19 at 11:21
  • I'm not sure how did you get extra term starting with 'car'? My looping finishes within the last term of value of variants that has the max length. Thus my array has three items only. – mathfux Dec 08 '19 at 11:30
  • Hi, thanks so much for responding. Letter "C" alone has 3 variants. The others have 2. I was hoping to get the 3rd variant of letter C(car), should be attached to both variants (do/die, zen/zoo) of the previous letters. Right now, it seems to attach to the first one. Does that make sense? – user3116297 Dec 08 '19 at 11:37
  • Yes, it does. But what other letters? Shouldn't they have both variants (do/die, zen/zoo) as well? – mathfux Dec 08 '19 at 11:40
  • I mean, why not to include cases ('cat', 'die', 'zoo'), ('can', 'do', 'zen') too within the same logics? – mathfux Dec 08 '19 at 11:42
  • I understand now, thanks so much! Yes, that would actually be better. Generate a cross matrix of all possible variants. Okay, will try that and return. – user3116297 Dec 08 '19 at 11:48
  • Ok, this requires major changes in my script. But fortunately, we can use some mathematics to calculate the length of all the cycles. In your case it would be lcm(3,2,2) = 6, where lcm is the least common multiple. In general you'd replace `max([len(words) for words in variants.values()])` with `np.lcm.reduce([len(words) for words in variants.values()])` – mathfux Dec 08 '19 at 11:58
  • don't forget to use `import numpy as np` in this case – mathfux Dec 08 '19 at 11:59
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/203841/discussion-between-user3116297-and-mathfux). – user3116297 Dec 08 '19 at 12:12
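
The two ideas raised in the comments above can be compared side by side. This is only a sketch: `lcm_all` is a hypothetical helper built on the standard library (the comment suggests `np.lcm.reduce` from NumPy instead), and `itertools.product` is swapped in here for the "cross matrix of all possible variants", which is not what the cycle-based code computes.

```python
from functools import reduce
from itertools import cycle, islice, product
from math import gcd


def lcm_all(numbers):
    """Least common multiple of an iterable of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), numbers, 1)


variants = {"C": ["cat", "can", "car"], "D": ["do", "die"], "Z": ["zen", "zoo"]}
letters = ["C", "D", "Z"]
lists = [variants[letter] for letter in letters]

# Cycle every list until all of them wrap around together: lcm(3, 2, 2) == 6.
n = lcm_all(len(words) for words in lists)
cycled = list(zip(*(islice(cycle(words), n) for words in lists)))

# Full cross product: 3 * 2 * 2 == 12 combinations.
crossed = list(product(*lists))

print(len(cycled), len(crossed))       # 6 12
print(("cat", "do", "zoo") in cycled)  # False
print(("cat", "do", "zoo") in crossed) # True
```

Because the "D" and "Z" lists have the same length, cycling always pairs "do" with "zen" and "die" with "zoo", so only the cross product covers every combination.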
0

I would approach this problem through graphs. Imagine each list to be at certain depth level in the graph.

     root
   /  |  \
 cat can  car
  |\  /|\ /|
 do   die
  |\  /|
 zen zoo

Here, we assume a root element "root", and the list Letters gives the order in which each list appears in the graph. So C: [cat, can, car] will be at depth 1, D: [do, die] will be at depth 2, and so on.

We can easily create this undirected graph using the following function:

variants = {"C":["cat", "can", "car"], "D":["do","die"], "Z":["zen", "zoo"]}

Letters = ["C", "D", "Z"]

def create_graph(list_dict, depth_order):
  graph = {"root": set(list_dict[depth_order[0]])}
  for i in range(1, len(depth_order)):
    parents = list_dict[depth_order[i-1]]
    children = list_dict[depth_order[i]]
    # Connect parent to children
    for elem in parents:
      graph[elem] = set(children)
    # Connect children to parent
    for elem in children:
      graph[elem] = set(parents)
  return graph

graph = create_graph(variants, Letters)
print(graph)

Please note that each level is fully connected to the next level. So if we find the paths from one element in level 1 to another element in the last level, we find all the variants of the combination. We can do this simply using Depth First Search:

def dfs_paths(graph, start, goal):
  stack = [(start, [start])]
  while stack:
    (vertex, path) = stack.pop()
    for next in graph[vertex] - set(path):
      if next == goal:
        yield path + [next]
      else:
        stack.append((next, path + [next]))

Now we can call this function with desired start and end nodes:

print(list(dfs_paths(graph, 'cat', 'zen')))

This yields:

[['cat', 'die', 'zen'], ['cat', 'die', 'zoo', 'do', 'zen'], ['cat', 'do', 'zen'], ['cat', 'do', 'zoo', 'die', 'zen']]

We can filter this list to have elements only with length = len(Letters), which will eliminate the indirect paths where DFS traces a previous level.

Output:

[['cat', 'die', 'zen'], ['cat', 'do', 'zen']]
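
The filtering step described above can be sketched as follows. To keep the example short it works on the list of paths printed earlier rather than calling dfs_paths again; the name `direct` is made up for illustration.

```python
letters = ["C", "D", "Z"]

# Paths from 'cat' to 'zen' as printed above
paths = [
    ["cat", "die", "zen"],
    ["cat", "die", "zoo", "do", "zen"],
    ["cat", "do", "zen"],
    ["cat", "do", "zoo", "die", "zen"],
]

# Keep only paths that visit each level exactly once
direct = [path for path in paths if len(path) == len(letters)]
print(direct)
# [['cat', 'die', 'zen'], ['cat', 'do', 'zen']]
```

Repeating this for every start word in the first level and every goal word in the last level would collect the remaining combinations.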

Let me know if it works for you. Cheers!