SL No: Customer Month Amount
1 A1 12-Jan-04 495414.75
2 A1 3-Jan-04 245899.02
3 A1 15-Jan-04 259490.06
My DataFrame is shown above.
Code:
import findspark
findspark.init('/home/mak/spark-3.0.0-preview2-bin-hadoop2.7')
import pyspark
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('mak').getOrCreate()
import numpy as np
import pandas as pd
# Enable Arrow-based columnar data transfers
spark.conf.set("spark.sql.execution.arrow.enabled", "true")
pdf3 = pd.read_csv('Repayment.csv')
df_repay = spark.createDataFrame(pdf3)
Only loading df_repay fails; the other DataFrames load successfully. When I replaced the code above with the code below, it worked:
df4 = (spark.read.format("csv").options(header="true")
.load("Repayment.csv"))
Why does df_repay fail to load with
spark.createDataFrame(pdf3)
while similar DataFrames load successfully?
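One way to narrow this down, as a rough sketch only: spark.createDataFrame(pdf3) derives the Spark schema from the pandas column dtypes, whereas spark.read.format("csv") without schema inference reads every column as a string, so a pandas column holding mixed value types is a plausible cause (the actual error message is not shown here, so this is an assumption). The snippet below checks for such columns on the pandas side. The inline CSV is a stand-in for Repayment.csv built from the sample rows above, and mixed_type_columns is a hypothetical helper, not part of pandas or PySpark.

```python
import io

import pandas as pd

# Stand-in for Repayment.csv (assumption: the real file has these four columns).
sample_csv = io.StringIO(
    "SL No:,Customer,Month,Amount\n"
    "1,A1,12-Jan-04,495414.75\n"
    "2,A1,3-Jan-04,245899.02\n"
    "3,A1,15-Jan-04,259490.06\n"
)
pdf3 = pd.read_csv(sample_csv)


def mixed_type_columns(pdf):
    """Hypothetical helper: names of object columns holding more than one Python type."""
    mixed = []
    for name in pdf.columns:
        if pdf[name].dtype == object:
            kinds = {type(value).__name__ for value in pdf[name]}
            if len(kinds) > 1:
                mixed.append(name)
    return mixed


# Show what pandas inferred for each column, then any suspicious columns.
print(pdf3.dtypes)
print(mixed_type_columns(pdf3))
```

If a column is reported for the real file, casting it before the conversion, for example pdf3[col] = pdf3[col].astype(str), or passing an explicit schema to spark.createDataFrame, may be worth trying.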