I am trying to write a DataFrame with a column of type "dense vector" to MongoDB using the Mongo Spark connector.

But I am getting this error:

cannot cast [2.0,2.0,115.0,0.0,0.0,0.0,0.0,0.0] into a BsonValue. org.apache.spark.ml.linalg.VectorUDT@3bfc3ba7 has no matching BsonValue.

Why isn't the vector cast to an Array[Double]? For reference, the BSON types are listed at https://docs.mongodb.com/manual/reference/bson-types/

My DataFrame schema:

root
 |-- label: double (nullable = false)
 |-- date: timestamp (nullable = true)
 |-- features: vector (nullable = true)
Saurabh Srivastava
harksin
  • The Mongo Spark connector doesn't know what to do with the vector type. It's an internal `UserDefinedType` from the mllib library, which the connector doesn't have a dependency on. You will have to convert it into an `Array[Double]` manually for it to save. – Ross Oct 10 '16 at 17:13
  • Thanks Ross :) Maybe this could become a new feature? – harksin Oct 11 '16 at 08:08
  • It would mean we'd have to take a dependency on the mllib library, so I'd prefer not to until it's clear what they decide to do with UDTs. I would like to use them for the unsupported BSON types, but currently they are private to Spark. – Ross Oct 11 '16 at 10:03
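
The manual conversion suggested in the comments could look roughly like the sketch below. It assumes Spark 2.x with `spark-mllib` on the classpath and a column named `features` as in the schema above. The names `VectorToArray`, `vectorToArray` and `withArrayColumn` are made up for illustration and are not part of Spark or the connector. The final write is left as a comment because it depends on how the connector is configured.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical helper object (not part of Spark or the Mongo connector).
object VectorToArray {
  // UDF that unwraps an ml Vector (dense or sparse) into a plain Array[Double].
  // Null vectors are passed through as null, since the column is nullable.
  val vectorToArray = udf((v: Vector) => if (v == null) null else v.toArray)

  // Replaces the given vector column with an array<double> column of the same name.
  def withArrayColumn(df: DataFrame, column: String): DataFrame =
    df.withColumn(column, vectorToArray(col(column)))
}

object Demo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("vector-to-array-sketch")
      .getOrCreate()

    // Small stand-in for the DataFrame in the question (the date column is omitted).
    val df = spark.createDataFrame(Seq(
      (1.0, Vectors.dense(2.0, 2.0, 115.0, 0.0, 0.0, 0.0, 0.0, 0.0))
    )).toDF("label", "features")

    val converted = VectorToArray.withArrayColumn(df, "features")
    converted.printSchema() // features is now an array of doubles, not a vector

    // Assumption: with the connector configured, the converted DataFrame can be
    // written the same way as before, for example:
    // converted.write.format("com.mongodb.spark.sql.DefaultSource").mode("append").save()

    spark.stop()
  }
}
```

Since the column is then an ordinary array of doubles, the connector no longer sees the `VectorUDT`. On Spark 3.0 or later, the built-in `org.apache.spark.ml.functions.vector_to_array` should do the same job without a custom UDF.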
