I have a small question about an issue that I hope the Spark gurus can help me with.
I have a Parquet file, person.parquet, that has multiple columns and one row. One of the columns, "Middle Name",
has a space in its name, which causes an issue in Spark when writing the data to Parquet format.
What I have done is rename the column to remove the space, as below:
SourceData = SourceData.withColumnRenamed("Middle Name","MiddleName")
If I try to write SourceData to a Parquet file, it still returns an error:
Caused by: org.apache.spark.sql.AnalysisException: Attribute name "Middle Name" contains invalid character(s) among " ,;{}()\n\t=". Please use alias to rename it.
So I used the line below, which makes the error go away:
SourceData = spark.read.schema(SourceData.schema).parquet(TestingPath)
Unfortunately, the generated file has a null value for the column MiddleName.
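My guess, which may be wrong, is that when a schema is supplied the reader looks up each requested column by name in the file, and a name that is not present in the file comes back as null. Below is a toy sketch of that lookup in plain Python. It is not Spark code; the function name read_with_schema, the FirstName column and the values are made up for illustration.

```python
# Toy model only: plain Python, not Spark or Parquet code.
# It imitates a reader that resolves requested columns by name.

def read_with_schema(file_rows, requested_columns):
    """Project each stored row onto the requested names; missing names become None."""
    return [
        {name: row.get(name) for name in requested_columns}
        for row in file_rows
    ]

# Hypothetical one-row file: the stored column name still contains the space.
file_rows = [{"FirstName": "Ann", "Middle Name": "Lee"}]

# The schema taken from the renamed DataFrame asks for "MiddleName" (no space).
result = read_with_schema(file_rows, ["FirstName", "MiddleName"])
print(result)
```

In this sketch the file still stores "Middle Name", so asking for MiddleName finds nothing, which would match the null I am seeing.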
Any suggestions on how to solve this issue?