Example code to create the table in Apache Spark's PySpark shell:
from pyspark.sql.types import StructType, StructField, StringType, IntegerType, MapType

data2 = [
    ({"firstname2": "John", "lastname2": "Smith"}, "36636", "M", 3000),
    ({"firstname2": "John2", "lastname2": "Smith2"}, "40288", "M", 4000)
]

schema = StructType([
    StructField("name", MapType(StringType(), StringType()), True),
    StructField("id", StringType(), True),
    StructField("gender", StringType(), True),
    StructField("salary", IntegerType(), True)
])

df = spark.createDataFrame(data=data2, schema=schema)

df.write.format("delta") \
    .option("path", "hdfs://[an_ip]:9000/data-warehouse/test2") \
    .mode("overwrite") \
    .saveAsTable("test2")
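For reference, printing the schema in the same shell confirms that name is stored as a map rather than a struct (the commented lines are the output Spark prints for this schema):

df.printSchema()
# root
#  |-- name: map (nullable = true)
#  |    |-- key: string
#  |    |-- value: string (valueContainsNull = true)
#  |-- id: string (nullable = true)
#  |-- gender: string (nullable = true)
#  |-- salary: integer (nullable = true)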
In Power BI Desktop, under "Get Data", we fetch the table through this connector:
let
    Source = fn_ReadDeltaTable(Hdfs.Files("[an_ip]:50070/data-warehouse/test2"), [UseFileBuffer=true]),
    test2 = Table.ExpandTableColumn(Source, "name", {"firstname2", "lastname2"}, {"name.firstname2", "name.lastname2"})
in
    test2
But I got this issue:
If I try to use Table.ExpandRecordColumn() instead, here is another issue:
Although the issue can be worked around by using (nested) StructType/StructField instead of MapType, I wonder whether there is a solution if we insist on using MapType.
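For illustration, here is a minimal sketch of that nested StructType/StructField workaround, reusing the identifiers from the example above (only the shape of the schema changes; the field names are fixed at write time instead of being data-dependent map keys):

from pyspark.sql.types import StructType, StructField, StringType, IntegerType

# Same rows, but "name" is now a struct with two fixed fields instead of a map.
data2 = [
    (("John", "Smith"), "36636", "M", 3000),
    (("John2", "Smith2"), "40288", "M", 4000)
]

schema = StructType([
    StructField("name", StructType([
        StructField("firstname2", StringType(), True),
        StructField("lastname2", StringType(), True)
    ]), True),
    StructField("id", StringType(), True),
    StructField("gender", StringType(), True),
    StructField("salary", IntegerType(), True)
])

df = spark.createDataFrame(data=data2, schema=schema)

With this schema, the name column arrives in Power BI as a record with known fields, which presumably is why the expansion succeeds, but it gives up the open-ended keys that MapType provides.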