1

I have a table stored in a file with a .snappy.parquet extension.

data = 'part-001-36b4-7ea3-4165-8742-2f32d8643d-c000.snappy.parquet'

I would like to read this and I tried the following:

table = spark.read.load(data, format='delta')

When I try the syntax above, I get the following error: AnalysisException: A partition path fragment should be the form like `part1=foo/part2=bar`. The partition path: part-001-36b4-7ea3-4165-8742-2f32d8643d-c000.snappy.parquet.

I also tried:

table = spark.read.parquet(data)

When I try the above, I get this error: AnalysisException: Incompatible format detected.

Hiwot

2 Answers

1
df = spark.read.parquet('/path/where/file/is/')

Your Parquet data was probably written as several part files, so you need to read the whole directory where the parts were generated rather than a single part file.
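
A possible way to put this together, as a rough sketch rather than a definitive fix: as far as I know, "Incompatible format detected" usually means Spark found a `_delta_log` folder next to the part files, i.e. the directory is a Delta table and should be loaded as a whole with format='delta'. The helper names below (`table_dir_for`, `table_format`, `read_table`) are made up for illustration, and the `os.path` checks assume the path is visible on the local filesystem (on DBFS or cloud storage you would have to check for `_delta_log` some other way). Reading Delta also assumes the Spark session has Delta Lake available.

```python
import os


def table_dir_for(path: str) -> str:
    """Return the directory to load: the path itself if it is a directory,
    otherwise the directory that contains the given part file."""
    if os.path.isdir(path):
        return path
    # A bare file name has no directory component, so fall back to ".".
    return os.path.dirname(path) or "."


def table_format(table_dir: str) -> str:
    """Guess the format: 'delta' if a _delta_log folder is present, else 'parquet'."""
    has_log = os.path.isdir(os.path.join(table_dir, "_delta_log"))
    return "delta" if has_log else "parquet"


def read_table(spark, path: str):
    """Load the whole table directory with the guessed format (hypothetical helper)."""
    table_dir = table_dir_for(path)
    return spark.read.format(table_format(table_dir)).load(table_dir)
```

With the variable from the question this would be called as table = read_table(spark, data).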

0

If you don't mind using pandas for this specific task, I've had success in the past reading snappy parquet files like this (pandas needs either pyarrow or fastparquet installed for it to work):

import pandas as pd
df = pd.read_parquet(data)