Why does my code remove 999 in my replacement code?

Question

I have the code below to replace all punctuation with 999 and each letter with its position in the alphabet. I have included a print statement that confirms the punctuation is being replaced. However, the rest of my code, which replaces the other characters, seems to override that replacement.

import string
def encode(text):
    punct = '''!()-[]{};:'"\,<>./?@#$%^&*_~'''
    for x in text.lower(): 
        if x in punct: 
            text = text.replace(x, ".999")
            print(text)
        nums = [str(ord(x) - 96) 
                for x in text.lower()
                    if x >= 'a' and x <= 'z'
                    ]
    return ".".join(nums)
print(encode(str(input("Enter Text: "))))

Input: 'Morning! \n'

Output: '13.15.18.14.9.14.7 \n'

Expected Output: 13.15.18.14.9.14.7.999

Could you give at least one actual example of your input, received output, and expected output? I have no idea what problem I should be looking for here. — jasonharper, Sep 04 '20 at 23:29
Try to come up with a *separate* function that, given a **single character** of the input, tells you the **integer value** that it should represent. Then try to use that separate function to create the text you want, all at once. — Karl Knechtel, Sep 04 '20 at 23:39
As an aside: You *do not need* to `import string` to use string methods - but you should probably be using it for the *constant strings* it provides, such as `string.punctuation` which is equal to `'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'`. Also, `input` returns a string already; it accomplishes nothing to use `str` on the result. — Karl Knechtel, Sep 04 '20 at 23:57

score 0 · Answer 1 · answered Sep 04 '20 at 23:35

No, you have two independent logical "stories" here. One replaces punctuation with 999. The other picks out only the letters and builds an independent list of their alphabetic positions.

    nums = [str(ord(x) - 96)
            for x in text.lower()
            if x >= 'a' and x <= 'z'
            ]
    return ".".join(nums)

Note that this does nothing to alter text, and it takes nothing but letters from text. If you want to keep the other characters as well (including the digits you inserted), pass them through unchanged:

    nums = [str(ord(x) - 96)
            if x >= 'a' and x <= 'z'
            else x
            for x in text.lower()
            ]
    return ".".join(nums)

Output of print(encode("[hello]")):

..9.9.9.8.5.12.12.15...9.9.9

Karl Knechtel · Answer 2 · 2020-09-04T23:52:24.107

    nums = [str(ord(x) - 96) 
            for x in text.lower()
                if x >= 'a' and x <= 'z'
                ]

This means: take every character from the lowercase version of the string, and only if it is between 'a' and 'z', convert the value and put the result in nums.

In the first step, you replace a bunch of punctuation with text that includes '.' and '9' characters. But neither '9' nor '.' is between 'a' and 'z', so of course neither is preserved in the second step.
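To make this concrete, here is a small sketch that runs the two steps separately on a sample string. The variable names `replaced`, `kept`, and `dropped` are only for illustration and do not appear in the question.

```python
text = "Morning!"

# Step 1: the punctuation replacement from the question.
replaced = text.replace("!", ".999")

# Step 2: the same filter as the comprehension in the question,
# split into what it keeps and what it throws away.
kept = [x for x in replaced.lower() if x >= 'a' and x <= 'z']
dropped = [x for x in replaced.lower() if not (x >= 'a' and x <= 'z')]

print(replaced)  # Morning.999
print(kept)      # ['m', 'o', 'r', 'n', 'i', 'n', 'g']
print(dropped)   # ['.', '9', '9', '9']
```

The replacement does happen, but everything it inserted ends up in `dropped`, so none of it reaches the final join.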

Now that I understand what you are going for: you are splitting up the problem in fundamentally the wrong way. You are trying to separate the two halves of the rule for "encoding" a given part of the input. What you should separate instead is the whole rule for encoding a single element from the process of applying that single-element rule to the whole input. After all, that is what list comprehensions do.

This is the concept of separation of concerns. The two business rules are part of the same concern - because implementing one rule doesn't help you implement the other. Being able to encode one input character, though, does help you encode the whole string, because there is a tool for that exact job.

We can have a complicated rule for single characters - no problem. Just put it in a separate function, so that we can give it a meaningful name and keep things simple to understand. Conceptually, our individual-character encoding is a numeric value, so we will consistently encode as a number, and then let the string-encoding process do the conversion.

def encode_char(c):
    if c in '''!()-[]{};:'"\\,<>./?@#$%^&*_~''':
        return 999
    if 'a' <= c.lower() <= 'z':
        # Lowercase before calling ord() so 'M' and 'm' both give 13.
        return ord(c.lower()) - 96
    # You should think about what to do in other cases!
    # In particular, you don't want digit symbols 1 through 9 to be
    # confused with letters A through I.
    # So I leave the rest up to you, depending on your requirements.

Now we can apply the overall encoding process: we want a string that puts '.' in between the string representations of the values. That's straightforward:

def encode(text):
    return '.'.join(str(encode_char(c)) for c in text)
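Putting the pieces together, one possible complete version might look like the sketch below. It makes two assumptions that are not in the question: it uses `string.punctuation` (as suggested in the comments), which covers a few more symbols than the hand-written list, and it silently skips anything that is neither punctuation nor an ASCII letter, such as whitespace and digits. Adjust both to your own requirements.

```python
import string

def encode_char(c):
    """Return the numeric code for one character, or None to skip it."""
    if c in string.punctuation:
        return 999
    if c.isascii() and c.isalpha():
        # 'a' -> 1, ..., 'z' -> 26, regardless of case.
        return ord(c.lower()) - 96
    # Assumption: whitespace, digits and anything else are dropped.
    return None

def encode(text):
    codes = (encode_char(c) for c in text)
    return '.'.join(str(code) for code in codes if code is not None)

print(encode("Morning! \n"))  # 13.15.18.14.9.14.7.999
```

Returning `None` for "no code" keeps the skipping decision in `encode`, so `encode_char` stays a pure one-character rule.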
