Edits:
- Following user's comments, added a failing address to this test. This address loads to the database without issue.
- Added code to store CSV addresses into MySQL.
Answer:
The code below performs the following actions:
- MySQL database
engine
(connection) created.
- Address data (number, address) read from CSV file.
- Non-field separating commas replaced from source data, and extra whitespace removed.
- Edited data fed into a
DataFrame
DataFrame
used to store data into MySQL.
import csv
import pandas as pd
from sqlalchemy import create_engine
# Set database credentials.
creds = {'usr': 'admin',
'pwd': '1tsaSecr3t',
'hst': '127.0.0.1',
'prt': 3306,
'dbn': 'playground'}
# MySQL conection string.
connstr = 'mysql+mysqlconnector://{usr}:{pwd}@{hst}:{prt}/{dbn}'
# Create sqlalchemy engine for MySQL connection.
engine = create_engine(connstr.format(**creds))
# Read addresses from mCSV file.
text = list(csv.reader(open('comma_test.csv'), skipinitialspace=True))
# Replace all commas which are not used as field separators.
# Remove additional whitespace.
for idx, row in enumerate(text):
text[idx] = [i.strip().replace(',', '') for i in row]
# Store data into a DataFrame.
df = pd.DataFrame(data=text, columns=['number', 'address'])
# Write DataFrame to MySQL using the engine (connection) created above.
df.to_sql(name='commatest', con=engine, if_exists='append', index=False)
Source File (comma_test.csv
):
"12345" , "123 abc street, Unit 345"
"10101" , "111 abc street, Unit 111"
"20202" , "222 abc street, Unit 222"
"30303" , "333 abc street, Unit 333"
"40404" , "444 abc street, Unit 444"
"50505" , "abc DR, UNIT# 123 UNIT 123"
Unedited Data:
['12345 ', '123 abc street, Unit 345']
['10101 ', '111 abc street, Unit 111']
['20202 ', '222 abc street, Unit 222']
['30303 ', '333 abc street, Unit 333']
['40404 ', '444 abc street, Unit 444']
['50505 ', 'abc DR, UNIT# 123 UNIT 123']
Edited Data:
['12345', '123 abc street Unit 345']
['10101', '111 abc street Unit 111']
['20202', '222 abc street Unit 222']
['30303', '333 abc street Unit 333']
['40404', '444 abc street Unit 444']
['50505', 'abc DR UNIT# 123 UNIT 123']
Queried from MySQL:
number address
12345 123 abc street Unit 345
10101 111 abc street Unit 111
20202 222 abc street Unit 222
30303 333 abc street Unit 333
40404 444 abc street Unit 444
50505 abc DR UNIT# 123 UNIT 123
Acknowledgement:
This is a long-winded approach. However, each step has been broken down intentionally to clearly show the steps involved.