I am using the script below to read data from MSSQL Server into PySpark dataframes.
DFFSA = (
    spark.read.format("jdbc")
    .option("url", jdbcURLDev)
    .option("driver", MSSQLDriver)
    .option("dbtable", "FSA.dbo.FSA")
    .option("user", "DevUser")
    .option("password", "password")
    .load()
)
This gives me a PySpark dataframe. How can I read the data directly into a pandas dataframe instead? I know I can convert the resulting dataframe with the toPandas() function, but that is taking a lot of time since I am reading millions of rows.
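For reference, this is the conversion step I am currently doing (a minimal sketch; DFFSA is the dataframe produced by the JDBC read above):

# Current (slow) approach: collect the entire Spark dataframe onto the
# driver and materialize it as a pandas dataframe. With millions of rows
# this collect is the expensive part.
pandasDF = DFFSA.toPandas()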