
I'm getting Unicode errors when adding data to Neo4J via Bulbs if the data contains non-ASCII characters.

The following code fails:

from bulbs.model import Node
from bulbs.property import String
from bulbs.neo4jserver import Graph

class User(Node):
    element_type="user"
    name = String(nullable=False)

g = Graph()
g.add_proxy("users", User)

user_data = {u'name': u'Aname M\xf6ller'}

g.users.create(**user_data)

with a UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 7: ordinal not in range(128)

The error is occurring in the bulbs.utils.u function, via codecs.unicode_escape_decode().
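
For context, the failure can be illustrated without Bulbs or a running Neo4j server. On Python 2, passing a `unicode` object to a function that expects a byte string most likely triggers an implicit ASCII encode, which is assumed here to be what happens inside `codecs.unicode_escape_decode()`. The sketch below performs that encode explicitly; `ascii_encode_error` is a hypothetical helper used only for illustration.

```python
# -*- coding: utf-8 -*-
name = u'Aname M\xf6ller'


def ascii_encode_error(text):
    """Return the UnicodeEncodeError raised by ASCII-encoding text, or None."""
    try:
        text.encode('ascii')
    except UnicodeEncodeError as exc:
        return exc
    return None


err = ascii_encode_error(name)
# The exception records the codec used and the offset of the offending character.
print(err)
```

The offset stored on the exception (`err.start`) is 7, the index of `ö` in the name, which matches the position reported in the traceback above.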

Some hopefully relevant info:

$>python -V
'2.7.3'

>>>type(user_data['name'])
<type 'unicode'>

>>>import bulbs
>>>bulbs.__version__
'0.3'

The Neo4J docs state that all String objects are saved as unicode, so why is my unicode data not being accepted? I hope I am just missing something silly.

jjaderberg
SunPowered

2 Answers


After refreshing myself on Python and Unicode, I got around the problem, though only by wrapping the failing line in a try/except and then encoding the offending data as UTF-8. It is not the most elegant solution, but the following code seems to work for me.

from bulbs.model import Node
from bulbs.property import String
from bulbs.neo4jserver import Graph

class User(Node):
    element_type="user"
    name = String(nullable=False)

g = Graph()
g.add_proxy("users", User)

user_data = {u'name': u'Aname M\xf6ller'}

try:
    g.users.create(**user_data)
except UnicodeEncodeError:
    # Retry with unicode values encoded as UTF-8 byte strings
    for k, v in user_data.iteritems():
        # Leave non-unicode values (numbers, byte strings) untouched
        if isinstance(v, unicode):
            user_data[k] = v.encode('utf-8')
    g.users.create(**user_data)

My only issue with this: if the Bulbs logger is active, the error message and traceback are logged on the first call to create(). Not a deal breaker, just a bit annoying.
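
One way to avoid the logged traceback might be to encode the values before the first call instead of after it fails. The following is a minimal sketch under the same assumption as the workaround above, namely that Bulbs on Python 2 accepts UTF-8 byte strings. `encode_text_values` is a hypothetical helper, not part of Bulbs, and the sketch has not been checked against a live Bulbs or Neo4j setup.

```python
import sys

# Text type: `unicode` on Python 2, `str` on Python 3.
text_type = unicode if sys.version_info[0] == 2 else str


def encode_text_values(data, encoding='utf-8'):
    """Return a copy of `data` with text values encoded to byte strings.

    Hypothetical helper, not part of Bulbs. Non-text values pass through.
    """
    encoded = {}
    for key, value in data.items():
        if isinstance(value, text_type):
            value = value.encode(encoding)
        encoded[key] = value
    return encoded


user_data = {u'name': u'Aname M\xf6ller', u'age': 30}
safe_data = encode_text_values(user_data)
# g.users.create(**safe_data)  # same call as in the workaround above
```

Because the helper returns a new dict, the original `user_data` keeps its unicode values for any later use.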

I haven't tried this on Python 3. Does anyone have something to chime in on the matter?

SunPowered
  • I suspect that the `u` function in `bulbs.utils` is handling encodings improperly. I filed an issue at their repo on GitHub. Let's see. This is not a natural way to use the API for such a common feature (i18n). – Paulo Bu Nov 09 '13 at 02:15
  • I agree that this may be a bug in `bulbs` caused by its use of the aforementioned `codecs.unicode_escape_decode`, which, as you point out in the GitHub issue, is not documented at all in the Python docs. Thanks for the help @PauloBu – SunPowered Nov 09 '13 at 14:31
  • Yes, this was a bug. It's fixed in Bulbs 0.3.23 https://github.com/espeed/bulbs/commit/7f104cdbc30f27ea76b036cfa0d0a694f074153e Thanks. – espeed Nov 11 '13 at 12:25

Yes, this was a bug. It's fixed in Bulbs 0.3.23:

https://github.com/espeed/bulbs/commit/7f104cdbc30f27ea76b036cfa0d0a694f074153e
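
To decide whether the workaround is still needed, the installed version can be compared against 0.3.23. The following is a minimal sketch, assuming purely numeric dotted version strings; `version_tuple` and `has_unicode_fix` are hypothetical helpers, not part of Bulbs. Note that the question shows `bulbs.__version__` reporting only `'0.3'`, so that attribute may not carry the patch level.

```python
def version_tuple(version):
    """Convert a numeric dotted version such as '0.3.23' into (0, 3, 23)."""
    return tuple(int(part) for part in version.split('.'))


# Release that contains the fix, according to the answer above.
FIXED_IN = version_tuple('0.3.23')


def has_unicode_fix(installed_version):
    """Return True if the given version string is at least 0.3.23."""
    return version_tuple(installed_version) >= FIXED_IN


print(has_unicode_fix('0.3'))
print(has_unicode_fix('0.3.23'))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as `'0.3.9'` sorting after `'0.3.23'`.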

espeed