I have a large SAS dataset in sas7bdat format, and I am converting it into a .db file using pandas and sqlite3 with the following code:
import sqlite3
import pandas as pd

df = pd.read_sas('file.sas7bdat')
con = sqlite3.connect('file.db')
df.to_sql(name='file', con=con, if_exists='replace', index=False)
The conversion works, but every string value comes out wrapped in extra characters before and after the string: B010 is stored as b'B010'. I am currently using pandas to strip those characters afterwards, like this:
df['column'] = df['column'].map(lambda x: str(x)[2:-1])
But too many columns have this problem to fix them one by one. Is there a way to fix this in the conversion process itself?
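For reference, the b'...' wrapper suggests the values are Python bytes objects rather than text, which is what pd.read_sas returns when no encoding is given. Passing encoding= to pd.read_sas may avoid the problem at the source, but the correct encoding for this file is not known here. Below is a minimal sketch of the alternative, decoding every bytes column in one pass before calling to_sql. It assumes the text is UTF-8; the helper name decode_bytes_columns and the sample frame standing in for the SAS data are hypothetical.

```python
import sqlite3
import pandas as pd


def decode_bytes_columns(df, encoding="utf-8"):
    """Return a copy of df with bytes values in object columns decoded to str.

    Hypothetical helper; the encoding is an assumption about the source data.
    """
    out = df.copy()
    for col in out.select_dtypes(include=["object"]).columns:
        # Decode only bytes values; leave NaN/None and real strings untouched.
        out[col] = out[col].map(
            lambda v: v.decode(encoding) if isinstance(v, bytes) else v
        )
    return out


# Stand-in for the frame returned by pd.read_sas('file.sas7bdat').
sample = pd.DataFrame({"code": [b"B010", b"C200"], "value": [1.0, 2.5]})
clean = decode_bytes_columns(sample)

# In-memory database so the sketch leaves no file behind.
con = sqlite3.connect(":memory:")
clean.to_sql(name="file", con=con, if_exists="replace", index=False)
rows = con.execute("SELECT code, value FROM file").fetchall()
con.close()
```

Unlike str(x)[2:-1], this decodes the bytes instead of slicing their printed form, so non-ASCII characters and missing values should not be mangled.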