
I have a list of strings called `words`:

words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']

I have to find the most frequent letter at each position across the strings. For example, to find the most frequent first letter, I check the first letter of every string and get 'h', because it is the letter that occurs most often. (If two letters occur the same number of times, I take the first one in alphabetical order.) The second letter is 'a', because it occurs most often at the second position across all the words; then 'r', because it is the most common third letter; and so on. At the end I want the string maxOccurs = "hareennt", which contains the most frequent letter for each position. This is what I have coded so far:

maxOccurs = ""
listOfChars = []

for i in range(len(words)):
    for item in words:
        listOfChars.append(item[i])

    maxOccurs += max(set(listOfChars), key=listOfChars.count)
    listOfChars.clear()

It raises an IndexError (index out of range) when i == 4, obviously because not every word has the same length, but I can't work out how to fix it. I would appreciate any help. P.S. I can't use any imports.

MattMlgn

2 Answers


This works:

words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']


maxOccurs = ""
listOfChars = []

for i in range(len(max(words, key=len))):
    for item in words:
        try:
            listOfChars.append(item[i])
        except IndexError:
            pass

    maxOccurs += max(sorted(set(listOfChars)), key=listOfChars.count)
    listOfChars.clear()

I made 3 changes to your code:

  1. Iterate up to the length of the longest word in the outer for-loop, instead of the number of words
  2. Access the characters of each string in a try-block, to deal with words of different lengths
  3. Sort the set of characters before taking the max, so that ties in the number of appearances are resolved alphabetically (max returns the first of several equal maxima)
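
As an alternative without imports, the counting could also be done in a single pass with one dictionary per position, rather than calling `list.count` once per distinct letter. This is only a sketch: the helper name `most_common_per_position` is made up for illustration, and whether it is noticeably faster on large lists has not been measured here.

```python
def most_common_per_position(words):
    # Hypothetical helper name, not part of the code above.
    counts = []  # counts[i] maps a letter to how often it appears at index i
    for word in words:
        for i, ch in enumerate(word):
            if i == len(counts):
                # First time any word reaches this position
                counts.append({})
            counts[i][ch] = counts[i].get(ch, 0) + 1
    # Highest count wins; ties go to the alphabetically smallest letter
    return "".join(min(c, key=lambda ch: (-c[ch], ch)) for c in counts)


words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']
maxOccurs = most_common_per_position(words)
print(maxOccurs)  # hareennt
```

Shorter words simply stop contributing once they run out of characters, so no try-block is needed in this variant.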

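If imports were allowed, I would do this (note that on Python 3.8+ `statistics.mode` returns the first mode encountered, so ties are not broken alphabetically and this gives "horeennt"):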

from statistics import mode
from itertools import zip_longest


words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']
maxOccurs = "".join(mode("".join(chars)) for chars in zip_longest(*words, fillvalue=""))
MangoNrFive
  • This worked, thanks, but what if I needed a more efficient solution without imports? For very large lists this is taking forever. – MattMlgn Nov 11 '22 at 17:59
  • For a list of more than a million elements it takes about 2 seconds on my setup, which doesn't seem so bad to me. Even the `itertools` / `statistics` version (the one-liner) is only twice as fast, so I don't think there is much room for improvement, especially not without using any imports. – MangoNrFive Nov 12 '22 at 16:39

The standard library is full of nice utilities for counting. Here's a one-liner that does it:

>>> from collections import Counter
>>> from itertools import zip_longest
>>> words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']
>>> ''.join(Counter(filter(None, chars)).most_common(1)[0][0] for chars in zip_longest(*words))
'horeennt'

The only difference is that it returns 'horeennt' instead of 'hareennt', because o and a appear equally often in the second position, and Counter.most_common(1) returns the first item encountered when there is a tie.
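
If the alphabetical tie-break from the question matters, one possible adjustment is to pick the letter with a key that sorts by descending count and then by the letter itself. This is a sketch only; `top_letter` is a hypothetical helper name, not something from `collections`.

```python
from collections import Counter
from itertools import zip_longest

words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']


def top_letter(chars):
    # Hypothetical helper: most frequent letter, ties broken alphabetically.
    # zip_longest pads short words with None, so those entries are skipped.
    counts = Counter(ch for ch in chars if ch is not None)
    return min(counts, key=lambda ch: (-counts[ch], ch))


result = ''.join(top_letter(chars) for chars in zip_longest(*words))
print(result)  # hareennt
```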

Iguananaut