0

I have a Python dictionary that I'm trying to insert into mysql. Problem is one of the Keys of the dictionary is longer that 64 characters (the max length of a column in mysql). So I need to truncate all dictionary keys to 64 characters.

The code below works in all respects except for the one key that is longer than 64 characters = location_of_building_on_the_lot_if_garage_change_type_code_to_bgd_

data = {'x_coordinate': '1158020.73068669',
    'any_people_using_property_homeless_childen_gangs_': True,
    'police_district': '8',
    'location_of_building_on_the_lot_if_garage_change_type_code_to_bgd_': 'Front',
    'service_request_number': '14-00630589',
    'address_street_suffix': 'AVE',
    'y_coordinate': '1866585.99638448',
    'date_service_request_was_received': '2014-05-01T00:00:00',
    'address_street_number': '5719',
    'longitude': '-87.69612590561026',
    'latitude': '41.78965826126179',
    'address_street_name': 'FRANCISCO',
    'address_street_direction': 'S',
    'location': {'latitude': '41.78965826126179', 'needs_recoding': False, 'longitude': '-87.69612590561026'},
    'service_request_type': 'Vacant/Abandoned Building',
    'community_area': '63',
    'is_the_building_currently_vacant_or_occupied_': 'Vacant',
    'ward': '16',
    'is_building_open_or_boarded_': 'Open',
    'is_the_building_vacant_due_to_fire_': True,
    'zip_code': '60629'}


placeholders = ', '.join(['%s'] * len(data))
columns = ', '.join(data.keys())
sql = "INSERT INTO vacant_buildings (%s) VALUES (%s)" % (columns, placeholders)

I tried to change:

columns = ', '.join(data.keys())

to

columns = ', '.join(data[:64].keys())

but get the following error: TypeError: unhashable type

Thoughts?

dfriestedt
  • 483
  • 1
  • 3
  • 18

3 Answers3

2

you want to truncate keys (=strings), not the data (which is a dictionary and has no "length" in the sense of "characters"):

columns = ', '.join(d[:64] for d in data.keys())
Pavel
  • 7,436
  • 2
  • 29
  • 42
1

Pavel's answer works fine, but in case you're worried about namespace collisions due to truncation

For example, location_of_building_on_the_lot_if_garage_change_type_code_to_bgd_ and location_of_building_on_the_lot_if_garage_change_type_code_to_bgd_hahaha would be different keys until you truncate them, at which point they are the same key.

keys = []
for k in data.keys():
    newKey = k[:64]
    count = 1
    while newKey in keys:
        alteration = str(count)
        newKey = newKey[:-len(alteration)] + alteration
        count += 1
    keys.append(newKey)

columns = ', '.join(keys)
Brionius
  • 13,858
  • 3
  • 38
  • 49
0

al tho .join() would solve the problem, it's a biter slower than doing:

columns = ''
for key in data.keys():
    columns += key[:64] +', '
sql = "INSERT INTO vacant_buildings (%s) VALUES (%s)" % (columns[:-2], placeholders)

This is because '.join()` will perform a iterating operation on a list you've already iterated over, doing the job manually will become a LOT quicker if you're working with huge sets of data.

Also note that x[:-2] is fine on small inserts, but if you bundle VALUES together to form a one-execution string like so:

INSERT INTO table VALUES (1, 2, 3), (2,2,3), (3,2,3) ...

doing a data[:-2] operator will become extremely slow in which a counter to check if you're on the last item in your list would be great and thus skipping +', ' at the end.

If you're going to strip out values as well, do this in one for loop instead of two:

for key, value in data.items():
    columns += key[:64] +', '

To be compliant witih future Python versions also switch to.format() intead of doing 'something (%s) something else' because this is obsolete.

>>> a = [1, 2, 'test']
>>> '{} is {} with {}'.format(*a)
'1 is 2 with test'

TL;DR:

Build your string manually instead of using multiple iterating functions resulting in the same outcome. And use .format()!!

Torxed
  • 22,866
  • 14
  • 82
  • 131