I am very new to Spark and was trying to examine DAG creation in the Spark UI using :
When I read a simple CSV file with the command
val df = spark.read.format("csv").option("header", "true").load("/home/user/test.csv")
Spark creates only 1 stage, with the DAG as :
I do not understand what the steps "MAP" > "MAPPARTITIONSINTERNAL" > "WHOLESTAGECODEGEN" mean or why they appear.
When I run the command with the "inferSchema" option set to true, 2 stages are created:
spark.read.format("csv").option("header", "true").option("inferSchema", true).load("/home/user/test.csv")
Each stage has its own DAG.
STAGE 1 DAG :
Can anybody please help me understand why two stages are created when inferSchema is true, and where I can find an explanation of the terms shown in the stages, such as "DESERIALIZETOOBJECT" > "MAP" etc.?
I would appreciate any input that helps me understand the DAG in detail, in particular why JOB 7 runs multiple "MAP PARTITIONS" steps, then "DESERIALIZETOOBJECT", then "WHOLESTAGECODEGEN", while JOB 8 again runs "MAP" > "MAPPARTITIONSINTERNAL" > "WHOLESTAGECODEGEN".
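
For reference, here is a minimal sketch of how both reads can be reproduced with labels, so that the jobs each one triggers are easier to tell apart in the Spark UI. The object name, app name, local master and job description strings are my own assumptions for illustration; only the two read calls and the file path come from the commands in this question.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical wrapper object; in spark-shell the `spark` session already exists.
object CsvReadPlans {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough to reproduce the behaviour.
    val spark = SparkSession.builder()
      .appName("csv-read-plans")
      .master("local[*]")
      .getOrCreate()

    val path = "/home/user/test.csv"

    // Label the jobs submitted from this thread so they can be identified in the UI.
    spark.sparkContext.setJobDescription("csv read: header only")
    val plain = spark.read.format("csv")
      .option("header", "true")
      .load(path)

    spark.sparkContext.setJobDescription("csv read: header + inferSchema")
    val inferred = spark.read.format("csv")
      .option("header", "true")
      .option("inferSchema", true)
      .load(path)

    // Compare the resulting schemas of the two reads.
    plain.printSchema()
    inferred.printSchema()

    // Print the logical and physical plans of each DataFrame.
    plain.explain(true)
    inferred.explain(true)

    spark.stop()
  }
}
```

In spark-shell only the setJobDescription, read, printSchema and explain lines are needed. Because the application UI is only served while the session is running, the jobs are easiest to inspect from the shell rather than from a standalone run that calls spark.stop().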