I'm trying to read a Parquet file in Databricks (PySpark), but the following line keeps raising an error:
df = spark.read.parquet(inputFilePath)
AnalysisException: Column name "('my data (beta)', "Meas'd Qty")" contains invalid character(s). Please use alias to rename it.
I tried .withColumnRenamed, as suggested in this post and in this post, and I also tried alias, for example:
from pyspark.sql.functions import col
spark.read.parquet(inputFilePath).select(col("""('my data (beta)', "Meas'd Qty")""").alias("col")).show()
but I always get the same error. How can I go through every column and replace any invalid characters with an underscore (_), or even just delete them?
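
To make the goal concrete, here is a minimal sketch in plain Python of the renaming I have in mind. The helper names sanitize_column_name and sanitize_all are hypothetical, and the character set (space, comma, semicolon, braces, parentheses, newline, tab, equals sign) is my assumption about what Spark rejects in Parquet column names. What I can't work out is how to apply something like this, since the error is raised by spark.read.parquet itself, before I have a DataFrame to rename.

```python
import re

# Assumed set of characters Spark rejects in Parquet column names:
# space , ; { } ( ) newline tab =
INVALID_CHARS = re.compile(r"[ ,;{}()\n\t=]")


def sanitize_column_name(name, replacement="_"):
    """Replace each assumed-invalid character with `replacement`.

    Pass replacement="" to delete the characters instead.
    """
    return INVALID_CHARS.sub(replacement, name)


def sanitize_all(names, replacement="_"):
    """Apply sanitize_column_name to every name in a list of column names."""
    return [sanitize_column_name(n, replacement) for n in names]
```

If I already had a DataFrame, I assume something like df.toDF(*sanitize_all(df.columns)) would apply the new names in one go.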