
Why does {html, "доуч"++[1076,1086,1091,1095]} in a Yaws page give me the following error?

Yaws process died: {badarg,[{erlang,list_to_binary,
                                    [[[[208,180,208,190,209,131,209,135,1076,
                                        1086,1091,1095]],
                                        ...

"доуч" = [1076,1086,1091,1095] gives me an exact match. So how does Yaws turn the list with 2 bytes per element into a list twice as long with 1 byte per element for "доуч", yet not do the same for [1076,1086,1091,1095]? Is there some internal representation of Unicode data involved?

I want to output lists like [1076,1086,1091,1095] to web pages, but it crashes.

Yola

2 Answers


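Erlang source files support only the ISO-LATIN-1 charset (later releases changed this: from Erlang/OTP R16B a file can declare UTF-8 encoding, and from OTP 17 UTF-8 is the default). The Erlang console can accept Unicode characters, but to enter them inside a Latin-1 source file, you need to use this syntax: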

K = "A weird K: \x{a740}".

See http://www.erlang.org/doc/apps/stdlib/unicode_usage.html for more info.
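As a small sketch of that escape syntax applied to the word from the question, the module below spells "доуч" using only ASCII characters in the source. The module name `escape_demo` and the function `word/0` are made up for illustration.

```erlang
-module(escape_demo).
-export([word/0]).

%% "доуч" written with \x{...} escapes (hexadecimal code points),
%% so the source file itself contains only ASCII characters.
word() ->
    "\x{434}\x{43E}\x{443}\x{447}".
```

Calling `escape_demo:word()` returns `[1076,1086,1091,1095]`, the same list of code points used in the question.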

Kevin Albrecht

You have to do the following to make it work:

{html, "доуч"++ binary_to_list(unicode:characters_to_binary([1076,1086,1091,1095]))}

Why does it fail?

In a bit more detail, list_to_binary fails because it tries to convert each item in the list to a byte, which it cannot do because each value in [1076,1086,1091,1095] is larger than 255 and so does not fit in a single byte.
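The failure can be reproduced outside Yaws. The following is a minimal sketch, with a made-up module `byte_check` and helper `try_list_to_binary/1`, that wraps `list_to_binary/1` and compares the UTF-8 byte list with the code point list.

```erlang
-module(byte_check).
-export([demo/0]).

%% Hypothetical helper: returns {ok, Binary} when every element fits
%% in a byte, or {error, badarg} when list_to_binary/1 rejects the list.
try_list_to_binary(List) ->
    try
        {ok, list_to_binary(List)}
    catch
        error:badarg -> {error, badarg}
    end.

demo() ->
    %% UTF-8 bytes of "доуч": every element is =< 255, so this succeeds.
    {ok, <<208,180,208,190,209,131,209,135>>} =
        try_list_to_binary([208,180,208,190,209,131,209,135]),
    %% Code points of "доуч": every element is > 255, so this fails.
    {error, badarg} = try_list_to_binary([1076,1086,1091,1095]),
    ok.
```

`byte_check:demo()` returns `ok` only if both pattern matches hold, that is, if the byte list converts and the code point list raises `badarg`.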

What is going on?

[1076,1086,1091,1095] is the list of Unicode code points for "доуч". Yaws tries to convert the string (list) into a binary directly using list_to_binary and thus fails. Since a Unicode character can take more than one byte, we first need to encode the list as UTF-8 bytes. This can be done using:

unicode:characters_to_binary([1076,1086,1091,1095]). 
<<208,180,208,190,209,131,209,135>>

The result can now be safely converted back and forth between list and binary representations. See the unicode module documentation for more details.

You can convert back to unicode as follows:

unicode:characters_to_list(<<208,180,208,190,209,131,209,135>>).
[1076,1086,1091,1095]
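
Putting the conversion together, here is a hedged sketch of a helper that turns a list of code points into a value that can be returned from a Yaws page. The module `utf8_html` and the function `html/1` are hypothetical names, and the sketch assumes Yaws accepts a binary as the second element of the `{html, ...}` tuple.

```erlang
-module(utf8_html).
-export([html/1]).

%% Hypothetical helper: takes a list of Unicode code points and returns
%% a {html, Binary} tuple holding the UTF-8 encoding of those characters.
html(CodePoints) ->
    case unicode:characters_to_binary(CodePoints) of
        Bin when is_binary(Bin) ->
            {html, Bin};
        %% Invalid or truncated input: fail loudly rather than emit junk.
        {error, _Encoded, _Rest} ->
            erlang:error(badarg);
        {incomplete, _Encoded, _Rest} ->
            erlang:error(badarg)
    end.
```

For example, `utf8_html:html([1076,1086,1091,1095])` returns `{html, <<208,180,208,190,209,131,209,135>>}`, and every element of that binary is a single byte.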
mbsheikh