Python, output ° symbol in UTF-8 within HTTP wfile.write

Question

As stated, I am trying to output the degree symbol from an HTTP server. When I do this, the output is Â°, with an unwanted Â character.

The current code I am using to output this is:

self.wfile.write(bytes(("<br>Average Temperature in last 24 hours = "+str(round(totalTempPast/pastCount,2))+" \N{DEGREE SIGN}C"),"utf-8"))

Which gives the result Average Temperature in last 24 hours = 16.04 Â°C

I have also tried chr(223) and \xb0 (within a string), with the same result.

How would I go about fixing this?

Thanks

Rather than trying to force HTML to accept weird characters, you might be better off using the HTML escape, which is `&deg;` (five characters total). — Tom Karzes, Jul 17 '23 at 00:46
It seems you are writing the bytes correctly, so I assume the problem is that the web page is being read as Latin-1. (That is a problem in itself: everything on the web should use UTF-8 now, but browsers still fall back to legacy encodings for compatibility with old pages.) — Giacomo Catenazzi, Jul 17 '23 at 09:54

score 2 · Answer 1 · answered Jul 17 '23 at 03:47

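In UTF-8, the degree sign (Unicode U+00B0) is encoded as the two bytes C2 B0. The "Â" that got printed is the Unicode character U+00C2, which is what the byte C2 means when it is read as a single-byte encoding such as Latin-1; the byte B0 then shows up as "°". So Python is writing the UTF-8 bytes as intended, but the browser is not interpreting them correctly.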

When you don't specify which encoding an HTML document uses, the browser is free to guess or auto-detect one. In your case it guessed wrong (e.g. it picked Latin-1), so the stray character was displayed.
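The mismatch can be reproduced in Python without a browser. This is only a small sketch, and it assumes the browser fell back to Latin-1 (Windows-1252 maps these two bytes the same way); the variable names are made up for illustration.

```python
# The degree sign is U+00B0; UTF-8 encodes it as two bytes.
degree = "\N{DEGREE SIGN}"
utf8_bytes = degree.encode("utf-8")

# Latin-1 maps every byte to exactly one character, so decoding the
# same two bytes as Latin-1 yields two characters instead of one.
misread = utf8_bytes.decode("latin-1")

print(utf8_bytes)  # b'\xc2\xb0'
print(misread)     # Â°
```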

There are some alternative solutions:

1. As pointed out by @Tom Karzes in the comments, use the HTML escapes &deg; or &#176;.
2. Specify the encoding of your HTML document. It is best practice to always do this. There are several ways to do it; here are two:

a. The meta charset tag

Add the meta tag inside the head of your HTML document:

<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8" />
    ...
    </head>
    ...
</html>

b. Content-Type HTTP header

Add the following HTTP header to your HTTP response:

Content-Type: text/html; charset=utf-8
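With Python's http.server, the header can be sent from the request handler before the body is written. The following is a rough sketch rather than the asker's actual server: `build_page`, `TemperatureHandler`, `run`, the port, and the hard-coded 16.04 are all placeholders. It also includes the meta tag, so both declarations agree.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

DEGREE = "\N{DEGREE SIGN}"


def build_page(average_temp: float) -> bytes:
    """Return the page as UTF-8 bytes, with the encoding declared in the markup."""
    html = (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"></head><body>'
        f"<br>Average Temperature in last 24 hours = {round(average_temp, 2)} {DEGREE}C"
        "</body></html>"
    )
    return html.encode("utf-8")


class TemperatureHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = build_page(16.04)  # placeholder value for illustration
        self.send_response(200)
        # Tell the browser which encoding the body uses.
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port: int = 8000) -> None:
    """Serve on localhost until interrupted (hypothetical entry point)."""
    HTTPServer(("localhost", port), TemperatureHandler).serve_forever()
```

Calling `run()` starts the server; the headers have to be sent before anything is written to `self.wfile`.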

Once the encoding is specified, the browser should interpret the output of the same Python code correctly, and the unwanted character will no longer appear.
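If declaring the encoding is not possible for some reason, the HTML-escape route can be sketched in Python too. Encoding to ASCII with the built-in `xmlcharrefreplace` error handler turns every non-ASCII character into a numeric character reference, so the bytes on the wire no longer depend on the browser's guess. The sample string below is just an illustration.

```python
text = "16.04 \N{DEGREE SIGN}C"

# "xmlcharrefreplace" replaces each character that ASCII cannot
# represent with a numeric reference of the form &#NNN;.
escaped = text.encode("ascii", "xmlcharrefreplace")

print(escaped)  # b'16.04 &#176;C'
```

The result is already a bytes object, so it can be passed to `self.wfile.write` directly.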
