0

I've been trying to create a system that turns a table of 1s and 0s into a Braille character, but it keeps giving me this error:

  File "brail.py", line 16
    stringToWrite=u"\u"+brail([1,1,1,0,0,0,1,1])
                  ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape

My current code is:

def brail(brailList):
    if len(brailList) == 8:
        brailList.reverse()
        brailHelperList=[0x80,0x40,0x20,0x10,0x8,0x4,0x2,0x1]
        brailNum=0x0
        for num in range(len(brailList)):
            if brailList[num] == 1:
                brailNum+=brailHelperList[num]
        stringToReturn="28"+str(hex(brailNum))[2:len(str(hex(brailNum)))]
        return stringToReturn
    else:
        return "String Needs To Be 8 In Length"

fileWrite=open('Write.txt','w',encoding="utf-8")
stringToWrite=u"\u"+brail([1,1,1,0,0,0,1,1])
fileWrite.write(stringToWrite)
fileWrite.close() 

It works when I do fileWrite.write(u"\u28c7"), but when I use a function that should return that exact same thing, it errors.

mhawke
  • Can you fix the indentation of your code? If the entire code block that you posted is part of your function, everything including and after `fileWrite=open()` will not run (it is after a return statement). – goalie1998 Feb 07 '21 at 07:29
  • Sorry I fixed the indentation for some reason stackoverflow messed it up – Joshua66252 Feb 07 '21 at 07:37

3 Answers

1

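\u is the Unicode escape sequence for Python string literals. Exactly four hex digits giving the code point must follow it; it is a syntax error if they are missing or there are too few. The escape is resolved when the literal itself is parsed, before any concatenation runs, so "\u" on its own is rejected and can never be completed by the string the function returns.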

>>> '\u28c7'
'⣇'

>>> '\u'
  File "<stdin>", line 1
    '\u'
        ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape

If you are using Python 3, the u string prefix is not required because all strings are Unicode. The prefix is still accepted for compatibility with Python 2 code.

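That's the cause of the exception. However, you don't need to build the code point from strings at all: add the computed number to the start of the Braille block and convert the result with chr() (ord() gives the numeric code point of the block's first character):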

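    from unicodedata import lookup

    # U+2800 is the first code point of the Braille Patterns block.
    braille_start = ord(lookup('BRAILLE PATTERN BLANK'))
    # Inside brail(), in place of building and returning stringToReturn:
    return chr(braille_start + brailNum)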
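Put together, the function might look like the sketch below. It assumes the same bit order as the question's code (list index 0 is the lowest bit), and the name braille_char is made up for illustration. It also raises an exception instead of returning an error string, which is a design choice rather than something the question requires.

```python
from unicodedata import lookup

# U+2800, the first code point of the Braille Patterns block
BRAILLE_START = ord(lookup('BRAILLE PATTERN BLANK'))


def braille_char(bits):
    """Return the Braille character for a list of eight 0/1 values.

    Assumes bits[0] is the lowest bit, matching the question's code.
    """
    if len(bits) != 8:
        raise ValueError("expected exactly 8 values")
    offset = 0
    for position, bit in enumerate(bits):
        if bit == 1:
            offset += 1 << position
    return chr(BRAILLE_START + offset)


# Print the code point rather than the glyph so the result is easy to compare.
print(hex(ord(braille_char([1, 1, 1, 0, 0, 0, 1, 1]))))  # 0x28c7
```

Unlike the original, this version does not reverse the caller's list in place, and it cannot produce a too-short hex string for small values.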
mhawke
  • Is There Any Way To Make It So That It Can Take The Four Characters From The Function, It Would Be Very Helpful As I Need It To Make The Unicode Character From The Set Of Numbers Provided. – Joshua66252 Feb 07 '21 at 07:43
  • @Joshua66252: see my update that requires no string manipulation. – mhawke Feb 07 '21 at 07:54
  • Do You Know How To Mark An Answer As Correct? Edit: Wait Figured It Out – Joshua66252 Feb 07 '21 at 07:57
  • Wait Sorry To Mark This As Incorrect Again But When I Feed 10299 Into The chr Function It Outputs та╗ When It Should Output ⠻ Do You Know How I Can Fix This? @mhawke – Joshua66252 Feb 07 '21 at 08:31
  • It's working for me so it's probably a display issue with your terminal. What is your terminal encoding? `print(sys.getdefaultencoding())`. You can check the value with: `unicodedata.name(chr(10299))` is BRAILLE PATTERN DOTS-12456 – mhawke Feb 07 '21 at 08:55
  • As I Am Writing To A File I Don't Think The Terminal Is The Issue. Whats Your Python Ver. Mines Python 3.8.5 Edit: I Just Looked And My Python Is Outdated I'm Going To Update And See If That Fixes It – Joshua66252 Feb 07 '21 at 09:03
  • I Updated Python To Python 3.9.1 And It Still Outputs та╗ Whats Your Python Ver – Joshua66252 Feb 07 '21 at 09:13
  • OK, good point. It's not your Python version. – mhawke Feb 07 '21 at 09:14
  • Just To Make Sure Though Can You Please Send Your Python Ver Just In Case It Is Something To Do With Version – Joshua66252 Feb 07 '21 at 09:16
  • 3.9. You don't have to capitalise all of your words :). – mhawke Feb 07 '21 at 09:17
  • This is my current code btw https://pastebin.com/raw/K7UtyMYE just in case its something I'm messing up on EDIT: Sorry for capping every word it's a bad habit that I have. – Joshua66252 Feb 07 '21 at 09:20
  • How are you viewing the output file? Because you are converting to UTF8 the unicode characters will be output as a UTF8 byte stream which is a sequence of 3 bytes for code point 10299. It will be the terminal encoding that is not decoding/displaying the file correctly. – mhawke Feb 07 '21 at 09:21
  • I am viewing the outputted file with Notepad++ – Joshua66252 Feb 07 '21 at 09:22
  • I Just Opened It In A Hex Editor And The Hex Is E2 A0 BB Its Probably Something To Do With The Python chr Function Not What I'm Viewing It With. – Joshua66252 Feb 07 '21 at 09:38
  • No, that is the correct byte sequence. `b'\xe2\xa0\xbb'.decode('utf8')` is `⠻`. Make sure that you open it as UTF8 in Notepad. – mhawke Feb 07 '21 at 09:41
  • I Thought 283B Was ⠻ Not E2 A0 BB Also Notepad++ Not Notepad – Joshua66252 Feb 07 '21 at 09:45
  • Also I Just Checked And The Encoding Is Already Set To UTF-8 – Joshua66252 Feb 07 '21 at 09:46
  • 0x283B is the Unicode code point for BRAILLE PATTERN DOTS-12456. E2 A0 BB is the UTF8 encoding of that code point. Try this to aid your understanding: `print(chr(0x283B))` and `print(chr(0x283B).encode('utf8'))` – mhawke Feb 07 '21 at 09:52
  • Is There Any Way I Can Make it Write To The TXT As A Unicode Code Point Instead Of A UTF8 Encoding? – Joshua66252 Feb 07 '21 at 09:55
  • Yes. Set the encoding to UTF16. You can read up on the basics of Unicode, here's one place to start https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/ – mhawke Feb 07 '21 at 09:58
  • Thank You I'll Respond With The Link To The Finished Code Once I'm Done So That Other People Can See This Post If They Need It. – Joshua66252 Feb 07 '21 at 10:00
  • Oh I Literally Just Changed The Encoding From The Code I Posted Earlier To UTF-16 And It Started Working Perfectly Without Any Of The Issues From Earlier I'm Going To Post It As An Answer. – Joshua66252 Feb 07 '21 at 11:00
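The code point versus encoding distinction discussed in these comments can be checked directly. The sketch below uses only built-ins and the standard unicodedata module; nothing in it is specific to the asker's script.

```python
import unicodedata

ch = chr(0x283B)                      # code point U+283B
print(unicodedata.name(ch))           # BRAILLE PATTERN DOTS-12456
print(ch.encode('utf-8').hex())       # e2a0bb (three bytes)
print(ch.encode('utf-16-be').hex())   # 283b (one 16-bit unit)

# Decoding the UTF-8 bytes gives back the same character.
assert b'\xe2\xa0\xbb'.decode('utf-8') == ch
```

Both encodings store the same character. A UTF-16 file simply contains bytes that resemble the code point in a hex editor; with the plain 'utf-16' codec Python also writes a byte order mark, and the byte order depends on the platform.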
0

You can rewrite

stringToWrite=u"\u"+brail([1,1,1,0,0,0,1,1])

as

stringToWrite="\\u{0}".format(brail([1, 1, 1, 0, 0, 0, 1, 1]))

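All strings are Unicode in Python 3, so you don't need the leading "u". Note that this produces the six-character text \u28c7 rather than the Braille character itself, because the doubled backslash is a literal backslash.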
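If the goal is to keep a function that returns hex digits, the digits can be converted to the character at run time with int(..., 16) and chr(). A minimal sketch, where hex_digits is a stand-in for whatever brail() returns:

```python
hex_digits = "28c7"  # stand-in for the string returned by brail()

escaped_text = "\\u{0}".format(hex_digits)  # six characters of text: \u28c7
character = chr(int(hex_digits, 16))        # one character: U+28C7

print(len(escaped_text))  # 6
print(len(character))     # 1
```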

goalie1998
-1
def braille(brailleString):
    # Keep the first 8 characters and pad with '0' so there are always 8 cells.
    brailleList = list(brailleString[:8].ljust(8, '0'))
    # The string lists the left column top to bottom (dots 1, 2, 3, 7),
    # then the right column (dots 4, 5, 6, 8). Reorder the indexes to dots 1..8.
    dotOrder = [0, 1, 2, 4, 5, 6, 3, 7]
    brailleNum = 0
    for bit, index in enumerate(dotOrder):
        if int(brailleList[index]) == 1:
            brailleNum += 1 << bit
    brailleStart = 10240  # 0x2800, BRAILLE PATTERN BLANK
    return chr(brailleStart + brailleNum)

# utf-16 is the encoding that displayed correctly in my editor.
with open('Write.txt', 'w', encoding="utf-16") as fileWrite:
    fileWrite.write(braille('11111111'))

# Think of the braille function's string as two columns split down the middle, with the 1s and 0s running top to bottom in each column.
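To illustrate that layout, here is a small sketch with a hypothetical helper, pattern_to_braille. It is not the code above, but it follows the same ordering assumption: the first four characters are the left column from top to bottom, and the last four are the right column.

```python
import unicodedata

# Bit position for each string index: the left column is dots 1, 2, 3, 7
# and the right column is dots 4, 5, 6, 8 (dot n is stored in bit n - 1).
BIT_FOR_INDEX = [0, 1, 2, 6, 3, 4, 5, 7]


def pattern_to_braille(pattern):
    """Hypothetical helper: '1'/'0' string (two columns) -> Braille character."""
    offset = 0
    for index, flag in enumerate(pattern[:8]):
        if flag == '1':
            offset |= 1 << BIT_FOR_INDEX[index]
    return chr(0x2800 + offset)


# Left column fully raised, right column empty.
print(unicodedata.name(pattern_to_braille('11110000')))  # BRAILLE PATTERN DOTS-1237
```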