Your algorithm cannot work because it never tackles the "closest to a prime" issue. (Note also that checking for a prime like that is not efficient: the upper bound of the divisor loop can be int(n**0.5)+1.)
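For reference, here is a minimal sketch of a primality test using that tighter bound. The function name is_prime is my own choice, not something taken from the question's code:

```python
def is_prime(n):
    """Trial division, testing divisors only up to the integer square root of n."""
    if n < 2:
        return False
    # a composite n always has a divisor <= sqrt(n), so there is no need to go further
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return True

# primes in the neighbourhood of the uppercase ASCII range
print([n for n in range(60, 100) if is_prime(n)])
# [61, 67, 71, 73, 79, 83, 89, 97]
```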
That said, there's a flaw in this problem for lowercase letters: the primes skip from 113 to 127. For a high lowercase letter such as z (ASCII code 122), the closest prime is 127, which isn't printable.
So I'll stick to uppercase letters (unless we want to print the closest lower prime, in which case lowercase is OK).
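A quick way to see the lowercase problem, as a rough sketch (the helper name nearest_prime and the hardcoded prime list are mine, for illustration only):

```python
# primes around the lowercase range (ord('a') == 97, ord('z') == 122)
lower_primes = [97, 101, 103, 107, 109, 113, 127]

def nearest_prime(code):
    # min() returns the first minimum, so a tie goes to the lower prime
    return min(lower_primes, key=lambda p: abs(p - code))

for c in "wxyz":
    p = nearest_prime(ord(c))
    # chr(127) is the DEL control character, so isprintable() is False for it
    print(c, ord(c), p, chr(p).isprintable())
```

With this list, y and z both land on 127, while w and x still map to 113 (q).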
What I would do is:
- generate/copy the list of primes up to 97 (the ASCII code for Z is 90, and 97 is the first prime above it)
- loop on the characters, and use the bisect module to find the insertion position of the character's ASCII code in the prime list (which must be sorted for bisect to work properly)
- then check whether the upper bound is closer than the lower bound, and choose the index in the prime list accordingly
- fiddle (a lot) with the insertion indices to avoid errors (like in my first version of the post)
- append the resulting character to a list, and join at the end
code:
import bisect

# list sampled from https://primes.utm.edu/lists/small/1000.txt
# ASCII code for 'A' is 65; 61 is the closest prime below it, so no need to go lower
primes = [int(x) for x in """61 67 71 73 79 83 89 97""".split()]

word = "ABCDEFGHIJKLMNOPRSTUVWXYZ"
primeword = []
for w in word:
    ow = ord(w)
    # index of the first prime >= ow; primes[i-1] is the closest prime below it
    i = bisect.bisect_left(primes, ow)
    delta1 = abs(ow - primes[i])
    delta2 = abs(ow - primes[i-1])
    # select this index or the previous one; ties go to the lower prime
    # (no risk of going out of range here: i stays between 1 and len(primes)-1)
    primeword.append(chr(primes[i + int(delta2 > delta1) - 1]))
print("".join(primeword))
which gives me:
CCCCCGGGIIIIOOOOSSSSSYYYY
EDIT: since we generated the prime numbers, we could as well directly generate the lookup table for the characters and use str.translate:
primeword_dict = {65: 'C', 66: 'C', 67: 'C', 68: 'C', 69: 'C', 70: 'G', 71: 'G', 72: 'G', 73: 'I', 74: 'I', 75: 'I', 76: 'I', 77: 'O', 78: 'O', 79: 'O', 80: 'O', 82: 'S', 83: 'S', 84: 'S', 85: 'S', 86: 'S', 87: 'Y', 88: 'Y', 89: 'Y', 90: 'Y'}
print(word.translate(primeword_dict))
That would be even faster and shorter, and it allows passing strings like "HELLO WORLD." (with spaces and punctuation in them): only the letters are changed, and the other symbols are kept intact.
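The table does not have to be typed by hand. As a sketch, it could be built with the same bisect logic as the loop version. The helper name closest_prime_char is hypothetical, and here I assume the table should cover the whole uppercase alphabet:

```python
import bisect
import string

primes = [61, 67, 71, 73, 79, 83, 89, 97]

def closest_prime_char(c):
    ow = ord(c)
    i = bisect.bisect_left(primes, ow)
    # ties go to the lower prime, as in the loop version
    if abs(ow - primes[i - 1]) > abs(ow - primes[i]):
        return chr(primes[i])
    return chr(primes[i - 1])

# str.translate expects a mapping from code points to replacement strings;
# code points missing from the mapping are left untouched
table = {ord(c): closest_prime_char(c) for c in string.ascii_uppercase}

print("HELLO WORLD.".translate(table))
# GCIIO YOSIC.
```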