
I am trying to print a euro sign in the browser from a Python CGI script. It prints successfully in the terminal but not in the browser. The behavior is the same in Python 2.7 and 3; I would prefer a Python 3.4 solution.

Browsers tested: Firefox and Opera, at the URL localhost/cgi-bin/test2.py. The browser's page information shows the correct encoding, so the header must be working. Perhaps there is some incompatibility with the encode/decode calls in Python. I can produce Chinese characters by deliberately mixing encodings, but I cannot get the encodings to match. I am running the usual LAMP setup, with no issues using PHP, and the script seems to find the correct binaries. I need to accept input in any language.

How can I isolate the issue?

Could someone please post correct, minimal Python 3 code that sends the headers and prints, say, a euro sign without using HTML entities? My current code is below.

#!/usr//bin/env python3
import cgi
#cgi.test()

import locale
import sys
import os
import io

import codecs

import cgitb
cgitb.enable() #this does not work properly either!!!


lf = chr(10)
cr  = chr(13)

h = "Content-Type: text/html; charset=utf-8 "
#h.encode("ascii")
print(h)
print(' Cache-Control: "no-cache, no-store, must-revalidate"'.encode('utf-8'))
#print(' Pragma: no-cache')
#print(' Expires: 0')
print(cr)
print(lf)

print()
print(lf)
print(cr)
print('<DOCTYPE! html>')
print('<meta HTTP-EQUIV="content-type" CONTENT="text/html; charset=utf-8">')
print('<html><body>')
hw = "Hello World!"
hw.encode('utf-8')
#hw.encode('utf-16le')
print(hw)

euro = "&euro;"
euro.encode('utf-8')
#euro.encode('utf-16')
print(euro) #THIS PRINTS OKAY


u = chr(8364)
u=u'This string includes a \u20AC sign'
u.encode('utf-8')
#u.encode('utf-16le')
print(u) #THIS PRINTS IN TERMINAL, BUT NOT IN BROWSER AND GENERATES FATAL ERROR 

end = "end"
end.encode('utf-8')
#end.encode('utf-16')
print(end)



Terminal output:
Content-Type: text/html; charset=utf-8 
b' Cache-Control: "no-cache, no-store, must-revalidate"'

<DOCTYPE! html>
<meta HTTP-EQUIV="content-type" CONTENT="text/html; charset=utf-8">
<html><body>
Hello World!
&euro;
This string includes a € sign
end


Python 3.4.0 (default, Apr 11 2014, 13:05:18) 
[GCC 4.8.2] on linux
Fredo

2 Answers


Probably not the best solution but the following at least works:

u = chr(8364)
#u='This string includes a \u20AC sign'
u=u+'This string includes a \u673A sign'  
out = ''

for ch in u:
    out = out+'&#'+str(ord(ch))+';' 
print(out)
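
The loop above turns every character, including plain ASCII, into a numeric character reference. A shorter sketch of the same idea, assuming it is enough to replace only the characters ASCII cannot represent, uses the standard `xmlcharrefreplace` error handler of `str.encode`. The helper name `to_ascii_html` is hypothetical and not part of the original answer.

```python
def to_ascii_html(text):
    """Return text as pure ASCII, with other characters as &#NNNN; references."""
    # Only non-ASCII characters are replaced; <, > and & are NOT escaped here,
    # so untrusted text would still need html.escape() first.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Prints: This string includes a &#8364; sign
print(to_ascii_html("This string includes a \u20AC sign"))
```

Because the result is pure ASCII, it can be passed to print() regardless of the encoding Python chose for stdout.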
Fredo

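Python 3 strings are Unicode by default, but print() still has to encode them for the stream it writes to, so that stream has to support Unicode too. For example, print("€") works in a Linux terminal but not on the Windows command line. Apparently Apache has a similar problem, most likely because the CGI environment does not provide a UTF-8 locale, so sys.stdout falls back to an ASCII encoding. You can try sending the bytes directly: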

#!/usr/bin/python3

import sys
import cgitb
cgitb.enable()

print("Content-Type: text/html;charset=utf-8")
print()
sys.stdout.flush()
print(
    "<!DOCTYPE html>"
    "<html>"
    "<body>")
sys.stdout.flush()  # push the pending text out before writing raw bytes
sys.stdout.buffer.write(bytes("€", "utf-8"))
print(
    "</body>"
    "</html>")

Or you could just use print("&euro;"):

#!/usr/bin/python3

import cgitb
cgitb.enable()

print("Content-Type: text/html;charset=utf-8")
print()
print(
    "<!DOCTYPE html>"
    "<html>"
    "<body>"
    "&euro;"
    "</body>"
    "</html>")

This is much saner.
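
If the raw character is needed rather than an entity, another option is to wrap the binary stdout in a text layer that always encodes as UTF-8, so that ordinary text writes work whatever locale the web server provides. The following is only a sketch under that assumption; `utf8_writer` and `write_page` are hypothetical helper names.

```python
import io
import sys


def utf8_writer(binary_stream):
    """Wrap a binary stream so that text written to it is encoded as UTF-8."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")


def write_page(out):
    """Write the CGI header, the blank separator line and a small HTML body."""
    out.write("Content-Type: text/html; charset=utf-8\n")
    out.write("\n")  # empty line separating the header from the body
    out.write("<!DOCTYPE html>\n<html><body>\n")
    out.write("This string includes a \u20AC sign\n")
    out.write("</body></html>\n")
    out.flush()


if __name__ == "__main__":
    writer = utf8_writer(sys.stdout.buffer)
    write_page(writer)
    # Detach so that discarding the wrapper does not close the real stdout.
    writer.detach()
```

The design choice here is to fix the encoding once, at the stream, rather than encoding each string by hand.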

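You don't have to call the encode method as you did in your script: str.encode returns a new bytes object and leaves the string unchanged, so a call such as hw.encode('utf-8') whose result is discarded has no effect. Of course the entity won't look right in the terminal, but your browser will display it correctly.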

Keep in mind that you have to print an empty line to separate the header from the rest. After that you just print regular HTML.
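
The header/body layout can also be sketched as a single byte string: header lines, one empty line, then the UTF-8 encoded body. This is an illustration rather than a definitive implementation; `build_response` is a hypothetical helper, and the Cache-Control value is borrowed from the question, written without the surrounding quotes and leading space.

```python
import sys


def build_response(body_html):
    """Return the full CGI response as bytes: headers, empty line, body."""
    headers = [
        "Content-Type: text/html; charset=utf-8",
        "Cache-Control: no-cache, no-store, must-revalidate",
    ]
    # Header lines are plain ASCII; the empty line marks the end of the header.
    head = "\n".join(headers) + "\n\n"
    return head.encode("ascii") + body_html.encode("utf-8")


if __name__ == "__main__":
    page = "<!DOCTYPE html>\n<html><body>\u20AC</body></html>\n"
    # Write bytes directly so no locale-dependent encoding is involved.
    sys.stdout.buffer.write(build_response(page))
    sys.stdout.buffer.flush()
```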

Largon
  • The first option here is better, as it allows any Unicode character to be rendered correctly: use only ASCII with print(), then flush stdout before writing the bytes. – Fredo Nov 26 '14 at 00:53