If I understand your question correctly, there are two parts to it:
- a concern about the presence of C-style Unicode escapes in your string, and
- how to handle the apostrophe-like character in "it’s".
Your question indicates that you are using Python 3.8.10 on Ubuntu, so your environment will be using Unicode (UTF-8). There shouldn't be any need for encode/decode pairs if your string is "The fox's color was \u201Cbrown\u201D and it’s speed was quick".
sample_text = "The fox's color was \u201Cbrown\u201D and it’s speed was quick"
print(sample_text)
# The fox's color was “brown” and it’s speed was quick
I'm using macOS rather than Ubuntu, but the behaviour should be the same, since Python 3 strings are Unicode on every platform.
For Python, the escaped character is the same as the actual character, so:
import unicodedata as ud
print('\u201C' == '“')
# True
print(ud.name("\u201C"))
# LEFT DOUBLE QUOTATION MARK
print(ud.name('“'))
# LEFT DOUBLE QUOTATION MARK
If you avoid the encode/decode pairs then it should resolve your second problem.
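I can't see exactly which encode/decode pair you used, so the following is only a sketch under the assumption that it was something like `.encode("utf-8").decode("unicode_escape")`, which is a common way of trying to "resolve" escapes. The `unicode_escape` codec treats any non-ASCII byte as Latin-1, so the three UTF-8 bytes of U+2019 come back as three separate characters:

```python
# Assumption: the encode/decode pair was UTF-8 out, unicode_escape back in.
text = "it\u2019s"  # same as "it’s"

# U+2019 is encoded as the bytes E2 80 99; unicode_escape reads each
# of those bytes as a Latin-1 character instead of one UTF-8 sequence.
mangled = text.encode("utf-8").decode("unicode_escape")

print(repr(mangled))
# 'itâ\x80\x99s'
print(mangled == text)
# False
```

If that matches what you were seeing, dropping the round trip and printing the string directly avoids the problem.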
Your string does have another issue, though. Looking at the words in it:
- fox's uses U+0027 (APOSTROPHE),
- “brown” uses U+201C (LEFT DOUBLE QUOTATION MARK) and U+201D (RIGHT DOUBLE QUOTATION MARK), and
- it’s uses U+2019 (RIGHT SINGLE QUOTATION MARK).
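If you want to check this inventory yourself, a minimal sketch along these lines should work. The helper name `describe_punctuation` is my own invention for illustration, and it simply reports every character that is neither alphanumeric nor whitespace:

```python
import unicodedata as ud

sample_text = "The fox's color was \u201Cbrown\u201D and it\u2019s speed was quick"

def describe_punctuation(text):
    """Return (character, code point, Unicode name) for each
    character that is neither alphanumeric nor whitespace."""
    return [
        (ch, f"U+{ord(ch):04X}", ud.name(ch))
        for ch in text
        if not ch.isalnum() and not ch.isspace()
    ]

for ch, code_point, name in describe_punctuation(sample_text):
    print(ch, code_point, name)
# ' U+0027 APOSTROPHE
# “ U+201C LEFT DOUBLE QUOTATION MARK
# ” U+201D RIGHT DOUBLE QUOTATION MARK
# ’ U+2019 RIGHT SINGLE QUOTATION MARK
```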
You are using U+0027 and U+2019 for the same purpose, so it would be useful to clean up the string. Since you are using smart quotes elsewhere, you could replace the apostrophe with a right single quotation mark:
sample_text = sample_text.replace('\u0027', '\u2019')
print(sample_text)
# The fox’s color was “brown” and it’s speed was quick
You discuss the need to get the original text representation of your string. Your string may already be the original, as it is. The fact that you are using smart double quotes implies that your apostrophes should probably be right single quotation marks, to match. What the original string was depends on which keystrokes and which editing controls were used to create it, but that takes you down a complex rabbit hole.
It would be a cleaner approach to think in terms of normalising your string, i.e. choosing a preferred Unicode character for apostrophe-like characters. That is the approach I took above, using str.replace() to make the string use smart quotes consistently. Obviously you could instead normalise away from smart quotes to the Basic Latin (ASCII) quotes:
sample_text = sample_text.replace('\u2019', '\u0027').replace('\u201C', '"').replace('\u201D', '"')
print(sample_text)
# The fox's color was "brown" and it's speed was quick
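If the list of characters to normalise grows, chaining str.replace() calls gets unwieldy. One alternative, sketched here, is a single translation table applied with str.translate(). The names `TO_ASCII` and `to_ascii_quotes` are hypothetical, and the choice of which characters to map (I have added U+2018 as an assumption) is yours to adjust:

```python
# Hypothetical mapping from smart quotes to Basic Latin (ASCII) quotes.
TO_ASCII = str.maketrans({
    "\u2018": "'",   # LEFT SINGLE QUOTATION MARK (assumed also unwanted)
    "\u2019": "'",   # RIGHT SINGLE QUOTATION MARK
    "\u201C": '"',   # LEFT DOUBLE QUOTATION MARK
    "\u201D": '"',   # RIGHT DOUBLE QUOTATION MARK
})

def to_ascii_quotes(text):
    """Replace every mapped smart quote in one pass over the string."""
    return text.translate(TO_ASCII)

sample_text = "The fox\u2019s color was \u201Cbrown\u201D and it\u2019s speed was quick"
print(to_ascii_quotes(sample_text))
# The fox's color was "brown" and it's speed was quick
```

The same pattern works in the other direction for apostrophes, but not for double quotes, because an ASCII " does not say whether it opens or closes.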