I'm using Spark 2.2 and I'm trying to read JSON messages from Kafka, transform them to a DataFrame and have them as a Row:
spark
.readStream()
.format("kafka")
.option("kafka.bootstrap.servers", "localhost:9092")
.option("subscribe", "topic")
.load()
.select(col("value").cast(DataTypes.StringType).as("col"))
.writeStream()
.format("console")
.start();
With this I get:
+--------------------+
| col|
+--------------------+
|{"myField":"somet...|
+--------------------+
I wanted something more like this:
+--------------------+
| myField|
+--------------------+
|"something" |
+--------------------+
I tried to use the from_json function with a struct schema:
DataTypes.createStructType(
    new StructField[] {
        DataTypes.createStructField("myField", DataTypes.StringType, true)
    }
)
but I only got:
+--------------------+
| jsontostructs(col)|
+--------------------+
|[something] |
+--------------------+
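For completeness, the from_json call looked roughly like this (slightly simplified; the schema variable and stringDs are just my shorthand for the schema above and the streaming Dataset with the single string column "col" from the first snippet):

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.from_json;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

StructType schema = DataTypes.createStructType(
    new StructField[] {
        DataTypes.createStructField("myField", DataTypes.StringType, true)
    });

// stringDs is the streaming Dataset from the first snippet,
// i.e. the one with the single string column named "col"
Dataset<Row> parsed = stringDs.select(from_json(col("col"), schema));
// without an alias the result column keeps the generated name
// jsontostructs(col), and the struct prints as [something] on the console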
Then I tried to use explode, but I only got an exception saying:
cannot resolve 'explode(`col`)' due to data type mismatch:
input to function explode should be array or map type, not
StructType(StructField(...
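For context, the explode attempt was essentially this (simplified, reusing schema and stringDs from the sketch above, with the parsed struct aliased as "col"); as far as I understand, explode expects an ArrayType or MapType column, while from_json with a StructType schema yields a struct:

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.explode;
import static org.apache.spark.sql.functions.from_json;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// parse the JSON string into a struct column named "col"
Dataset<Row> parsed = stringDs.select(from_json(col("col"), schema).as("col"));

// fails with the AnalysisException above: "col" is a StructType column,
// but explode only accepts ArrayType or MapType input
parsed.select(explode(col("col")));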
Any idea how to make this work?