I am trying to read data from a SQL Server database into a Polars DataFrame using Python. I have successfully used the pandas read_sql() method with a connection string in the past, but I am having trouble finding documentation on how to do this with Polars.
I have installed the latest versions of Polars, pandas, ConnectorX, and PyArrow. I created a connection string for my SQL Server database and successfully ran a SQL query with the pandas read_sql() method to get a pandas DataFrame. I then tried to convert this pandas DataFrame to a Polars DataFrame with the from_pandas() method, but I get the following error message:
"ModuleNotFoundError: pd.Series requires 'pandas' module to be installed"
I am confused because I have confirmed that all the necessary dependencies, including pandas, are installed (version info is at the end of this post). I am not sure what is causing this error or how to fix it. This is the code I am running:
import pyodbc
import polars as pl
from sqlalchemy import create_engine
from sqlalchemy.engine.url import URL
import pandas as pd
# define connection string
conn_str = (
    r"DRIVER={SQL Server};"
    r"SERVER=PLA1SQL01\AAMGRID1PRD;"
    r"DATABASE=NIER;"
    r"Trusted_Connection=yes;"
)
# create pyodbc connection
conn = pyodbc.connect(conn_str)
query = '''
SELECT DISTINCT TOP 4
rrs.Name as 'Risk Run Name',
rds.id as RiskDataSetID
FROM nier..RiskDataSet rds
LEFT JOIN nier..RiskResultSet rrs ON rds.id = rrs.RiskDataSetID
WHERE ismonthly = 0
AND ParentID IS NULL
AND NoLossForCMLCorp = 1
ORDER BY 2 DESC
'''
df = pd.read_sql(query, conn)
pl_df = pl.from_pandas(df)
I expected the code to convert the pandas DataFrame into a Polars DataFrame, but I received the ModuleNotFoundError instead. How can I read the result of a SQL query from MS SQL Server directly into a Polars DataFrame?
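One route that would avoid pandas altogether is to let Polars read through ConnectorX, which takes a connection URI rather than a pyodbc connection. The following is only a minimal sketch of that idea, not a confirmed fix: `build_mssql_uri` and `read_query` are hypothetical helper names, the `mssql://` URI layout and the `trusted_connection=true` parameter are assumed from ConnectorX's documented format, and a named instance such as the one in my connection string may need an explicit port instead. It also assumes `pl.read_database(query, uri)` is available in this Polars version; newer releases appear to expose the URI-based reader as `pl.read_database_uri`.

```python
from urllib.parse import quote_plus


def build_mssql_uri(server, database, username=None, password=None,
                    port=None, trusted_connection=False):
    """Build a ConnectorX-style SQL Server URI (hypothetical helper)."""
    auth = ""
    if username is not None:
        # URL-encode credentials so characters like '@' do not break the URI.
        auth = quote_plus(username)
        if password is not None:
            auth += ":" + quote_plus(password)
        auth += "@"
    host = server if port is None else f"{server}:{port}"
    uri = f"mssql://{auth}{host}/{quote_plus(database)}"
    if trusted_connection:
        # Assumed parameter name for Windows authentication.
        uri += "?trusted_connection=true"
    return uri


def read_query(query, uri):
    """Run the query through Polars' ConnectorX-backed reader."""
    # Imported here so the URI helper can be used without Polars installed.
    import polars as pl
    return pl.read_database(query, uri)


# Placeholder host name; no connection is made by building the URI.
example_uri = build_mssql_uri("myhost", "NIER", trusted_connection=True)
print(example_uri)
```

With a reachable server, `read_query(query, example_uri)` would then be called with the same query string as above; that call is untested here.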
import polars as pl
import pyarrow
import pandas
import connectorx
import pyodbc
pl.show_versions()
---Version info---
Polars: 0.16.16
Index type: UInt32
Platform: Windows-10-10.0.19044-SP0
Python: 3.9.9 (tags/v3.9.9:ccb0e6a, Nov 15 2021, 18:08:50) [MSC v.1929 64 bit (AMD64)]
---Optional dependencies---
numpy: 1.24.2
pandas: 1.5.3
pyarrow: 11.0.0
connectorx: 0.3.1
deltalake: <not installed>
fsspec: <not installed>
matplotlib: <not installed>
xlsx2csv: <not installed>
xlsxwriter: <not installed>
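Since the error claims pandas is missing even though it is listed above, one thing worth ruling out is that the script runs under a different interpreter or environment than the one where the packages were installed. The following is a small diagnostic sketch using only the standard library; `locate_modules` is a hypothetical helper name, and this check only shows where each module would be loaded from, not why the conversion fails.

```python
import importlib.util
import sys


def locate_modules(names):
    """Map each top-level module name to its file path, or None if not found."""
    found = {}
    for name in names:
        spec = importlib.util.find_spec(name)
        found[name] = spec.origin if spec is not None else None
    return found


if __name__ == "__main__":
    # The interpreter actually executing this script.
    print("interpreter:", sys.executable)
    modules = ["polars", "pandas", "pyarrow", "connectorx"]
    for name, path in locate_modules(modules).items():
        print(f"{name}: {path or '<not found>'}")
```

If all paths point into the same site-packages directory as the interpreter, an environment mix-up is unlikely to be the cause.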