I'm querying some data from SQL Server into Python using pyodbc and storing the results in a pandas dataframe. One of the resulting columns should be a date datatype but it is coming over as a object. I am trying to figure out how I can work with the dates in this column.
I was able to convert all columns to string and clean data, was able to convert quantity_picked to a float successfully, but was not able to successfully convert date_expected column to datetime64 (as suggested on other sites online) before OR after converting it to a string.
Error received:
NameError: name 'datetime64' is not defined
I have also tried simply converting the results using pd.to_datetime() and I receive the error below:
TypeError: <class 'pyodbc.Row'> is not convertible to datetime
I have had trouble finding any documentation on how to specify the data type while querying using pyodbc.
If I have missed something obvious, please excuse me, I am a new python user.
I am unsure if I should be worrying about getting this in a proper date datatype while querying, after querying and before cleaning, or after cleaning, or if it is even possible. If it is not possible, any guidance on how to deal with dates during analysis in python is appreciated.
This is how I am pulling the data from SQL Server to Python and putting it into a dataframe:
cursor = sql_conn.cursor()
SQL_query1 = cursor.execute('SELECT item_number FROM [Data].[MCF].[FULL ORDER DETAIL]')
pratice_df = pd.DataFrame()
practice_df1 = pd.DataFrame(SQL_query1, columns=['item_number']) #imports all values from query as one column
SQL_query2 = cursor.execute('SELECT quantity_picked FROM [Data].[MCF].[FULL ORDER DETAIL]')
practice_df2 = pd.DataFrame(SQL_query2, columns=['quantity_picked'])
SQL_query3 = cursor.execute('SELECT date_expec FROM [Data].[MCF].[FULL ORDER DETAIL]')
practice_df3 = pd.DataFrame(SQL_query3, columns=['date_expected'])
#combine separate dataframes into one dataframe with corresponding values
practice_df = pd.concat([practice_df1, practice_df2, practice_df3], axis=1)
Had trouble getting pyodbc to bring over data without putting all 3 columns as one column in a dataframe, so I brought columns over and entered into individual dataframes, and then merged them.
This is how the data is displayed (after cleaning):
0 0001111041660 1.0 2019-05-27
1 0001111046070 2.0 2019-05-27
If I enter practice_df.info() currently, I receive:
RangeIndex: 55676 entries, 0 to 55675
Data columns (total 3 columns):
item_number 55676 non-null object
quantity_picked 55676 non-null float64
date_expected 55676 non-null object
dtypes: float64(1), object(2)
Any guidance is much appreciated. Thank you.