In order to parameterize your query using MySQLdb's cursor.execute
method, the second argument to execute
has to be a sequence of values; in your for
loop, you're joining the values together into one string with the following line:
row_data = ', '.join(str(value) for value in row.values())
Since you generated a number of placeholders for your values equal to len(row)
, you need to supply that many values to cursor.execute
. If you gave it only a single string, it will put that entire string into the first placeholder, leaving the others without any arguments. This will throw a TypeError
- the message in this case would read, "not enough arguments for format string," but I'm going to assume you simply mixed up when copy/pasting because the opposite case (supplying too many arguments/too few placeholders) reads as you indicate, "not all arguments converted during string formatting."
In order to run an INSERT
statement through MySQLdb with a variable set of columns, you could do just as you've done for the columns and placeholders, but I prefer to use mapping types with the extended formatting syntax supported by MySQLdb (e.g., %(name)s
instead of %s
) to make sure that I've constructed my query correctly and not put the values into any wrong order. I also like using advanced string formatting where possible in my own code.
You could prepare your inputs like this:
max_key_length = 64
columns = ','.join(k[:max_key_length] for k in row.keys())
placeholders = ','.join('%({})s'.format(k[:max_key_length]) for k in row.keys())
row_data = [str(v) for v in row.values()]
Noting that the order of the dict comprehensions is guaranteed, so long as you don't alter the dict in the meanwhile.
Generally speaking, this should work okay with the sort of code in your insert_mysql
function. However, looking at the JSON data you're actually pulling from that URL, you should be aware that you may run into nesting issues; for example:
>>> pprint.pprint(data[0])
{u'address_street_direction': u'W',
u'address_street_name': u'61ST',
u'address_street_number': u'424',
u'address_street_suffix': u'ST',
u'any_people_using_property_homeless_childen_gangs_': True,
u'community_area': u'68',
u'date_service_request_was_received': u'2014-06-02T00:00:00',
u'if_the_building_is_open_where_is_the_entry_point_': u'FRONT',
u'is_building_open_or_boarded_': u'Open',
u'is_the_building_currently_vacant_or_occupied_': u'Vacant',
u'is_the_building_vacant_due_to_fire_': False,
u'latitude': u'41.78353874626324',
u'location': {u'latitude': u'41.78353874626324',
u'longitude': u'-87.63573355602661',
u'needs_recoding': False},
u'location_of_building_on_the_lot_if_garage_change_type_code_to_bgd_': u'Front',
u'longitude': u'-87.63573355602661',
u'police_district': u'7',
u'service_request_number': u'14-00827306',
u'service_request_type': u'Vacant/Abandoned Building',
u'ward': u'20',
u'x_coordinate': u'1174508.30988836',
u'y_coordinate': u'1864483.93566661',
u'zip_code': u'60621'}
The string representation of the u'location'
column is:
"{u'latitude': u'41.78353874626324', u'needs_recoding': False, u'longitude': u'-87.63573355602661'}"
You may not want to put that into a database field, especially considering that there are atomic lat/lon fields already in the JSON object.