Finding the most occurring letter in every position of a string in a list of strings

Question

I have a list of strings called words such that

words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']

I have to find the most frequent letter at each position across the strings. For example, for the first position I check the first letter of every string and get 'h', because it is the letter that occurs most often there. (If two letters occur the same number of times, I take the one that comes first alphabetically.) The second letter is 'a', because it is the most frequent letter at the second position, then 'r' for the third position, and so on. At the end I want the string maxOccurs = "hareennt", which contains the most frequent letter for each position. This is what I have coded so far:

maxOccurs = ""
listOfChars = []

for i in range(len(words)):
    for item in words:
        listOfChars.append(item[i])

    maxOccurs += max(set(listOfChars), key=listOfChars.count)
    listOfChars.clear()

It raises an IndexError (index out of range) when i == 4, obviously because not every word has the same length, but I can't work out how to fix it. I would appreciate any help. P.S. I can't use any imports.

"No imports" doesn't mean the same thing as "python standard library" — Pranav Hosangadi, Nov 11 '22 at 15:01

MangoNrFive · Accepted Answer · 2022-11-12T16:37:25.010

This works:

words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']


maxOccurs = ""
listOfChars = []

for i in range(len(max(words, key=len))):
    for item in words:
        try:
            listOfChars.append(item[i])
        except IndexError:
            pass

    maxOccurs += max(sorted(set(listOfChars)), key=listOfChars.count)
    listOfChars.clear()

I made 3 changes to your code:

Iterate up to the length of the longest word in the outer for-loop
Access the characters of each string in a try-block, to deal with words of different lengths
Sort the set of characters found at each position, so that max picks the alphabetically first letter when several letters occur equally often
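To illustrate the third change, here is a minimal sketch using only the second letters of the words from the question (the names second_letters and winner are just for this example). It relies on max returning the first item that reaches the highest key value, so the order of the candidates decides ties.

```python
# Second letters of the words in the question; 'a' and 'o' both occur three times.
second_letters = list("oaiaoafieonu")

# max() returns the first item that reaches the highest key value,
# so iterating the candidates in sorted order resolves ties alphabetically.
winner = max(sorted(set(second_letters)), key=second_letters.count)
print(winner)  # a
```

Without the sorted call the candidates come out of the set in arbitrary order, so either 'a' or 'o' could be picked.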

If imports were allowed, I would do this:

from statistics import mode
from itertools import zip_longest


words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']
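# Note: on a tie, statistics.mode (Python 3.8+) returns the first value encountered,
# so for this list the result is "horeennt" rather than "hareennt".
maxOccurs = "".join(mode("".join(chars)) for chars in zip_longest(*words, fillvalue=""))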

This worked thanks, but what if I needed a more efficient solution? Without imports? For very large lists this is taking forever — MattMlgn, Nov 11 '22 at 17:59
For a list of more than a million elements it takes about 2 seconds on my setup, which doesn't seem so bad to me. Even the `itertools` / `statistics` version (the one-liner) is only twice as fast, so I don't think there is much room for improvement, especially not without using any imports. — MangoNrFive, Nov 12 '22 at 16:39
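On the efficiency question raised in these comments, one option without imports is to count letters in a dict per position, so each character is visited once instead of calling list.count for every candidate letter. This is only a sketch: most_common_per_position is a hypothetical helper name, and whether it is actually faster on a given list would need to be measured.

```python
def most_common_per_position(words):
    """Hypothetical helper: most frequent letter at each index, ties broken alphabetically."""
    counts = []  # counts[i] maps letter -> number of occurrences at position i
    for word in words:
        for i, letter in enumerate(word):
            if i == len(counts):
                # First time any word reaches this position.
                counts.append({})
            counts[i][letter] = counts[i].get(letter, 0) + 1
    # Highest count wins; among equal counts the alphabetically smaller letter wins.
    return "".join(
        min(position, key=lambda letter: (-position[letter], letter))
        for position in counts
    )


words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']
print(most_common_per_position(words))  # hareennt
```

The key (-count, letter) passed to min is what encodes the tie rule from the question, so no separate sorting step is needed.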

score 0 · Answer 2 · answered Nov 11 '22 at 16:04

The standard library is full of nice utilities for counting. Here's a one-liner that does it:

>>> from collections import Counter
>>> from itertools import zip_longest
>>> words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']
>>> ''.join(Counter(filter(None, chars)).most_common(1)[0][0] for chars in zip_longest(*words))
'horeennt'

The only difference is that it returns 'horeennt' instead of 'hareennt', because o and a occur equally often in the second position, and Counter.most_common(1) returns the first item encountered if there's a tie.
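If the alphabetical tie-break from the question matters, one possible variation (a sketch, not part of the one-liner above) is to pick the winner with min and a key that combines the negated count with the letter itself. The helper name pick is made up for this example.

```python
from collections import Counter
from itertools import zip_longest

words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']


def pick(chars):
    # filter(None, ...) drops the None padding that zip_longest adds for shorter words.
    counts = Counter(filter(None, chars))
    # Highest count first; among equal counts, the alphabetically smaller letter.
    return min(counts, key=lambda letter: (-counts[letter], letter))


result = "".join(pick(chars) for chars in zip_longest(*words))
print(result)  # hareennt
```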
