I have an RDD of DenseVector objects. rdd.take(5) gives me:
[DenseVector([9.2463, 1.0, 0.392, 0.3381, 162.6437, 7.9432, 8.3397, 11.7699]),
DenseVector([9.2463, 1.0, 0.392, 0.3381, 162.6437, 7.9432, 8.3397, 11.7699]),
DenseVector([5.0, 20.0, 0.3444, 0.3295, 54.3122, 4.0, 4.0, 9.0]),
DenseVector([9.2463, 1.0, 0.392, 0.3381, 162.6437, 7.9432, 8.3397, 11.7699]),
DenseVector([9.2463, 2.0, 0.392, 0.3381, 162.6437, 7.9432, 8.3397, 11.7699])]
I would like to turn it into a DataFrame with a single column, which should look like:
-------------------------------------------------------------------
| features                                                        |
-------------------------------------------------------------------
| [9.2463, 1.0, 0.392, 0.3381, 162.6437, 7.9432, 8.3397, 11.7699] |
|-----------------------------------------------------------------|
| [9.2463, 1.0, 0.392, 0.3381, 162.6437, 7.9432, 8.3397, 11.7699] |
|-----------------------------------------------------------------|
| [5.0, 20.0, 0.3444, 0.3295, 54.3122, 4.0, 4.0, 9.0]             |
|-----------------------------------------------------------------|
| [9.2463, 1.0, 0.392, 0.3381, 162.6437, 7.9432, 8.3397, 11.7699] |
|-----------------------------------------------------------------|
| [9.2463, 2.0, 0.392, 0.3381, 162.6437, 7.9432, 8.3397, 11.7699] |
|-----------------------------------------------------------------|
Is this possible? I tried df_new = sqlContext.createDataFrame(rdd, ['features']), but it didn't work. Does anyone have any suggestions? Thanks!
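
One likely cause, offered as a guess rather than a confirmed diagnosis: createDataFrame expects each RDD element to be a row (a tuple, list, or Row), while here each element is a bare DenseVector, so Spark probably cannot infer a one-column schema from it. A minimal sketch of the usual workaround is to wrap each vector in a one-element tuple first. The helper name as_row is hypothetical, and the commented-out lines assume the same rdd and sqlContext as in the question.

```python
def as_row(vector):
    """Wrap a single vector in a 1-tuple so it reads as a row with one field.

    `as_row` is a hypothetical helper name used only for this sketch.
    """
    return (vector,)


# Assumed usage with the rdd and sqlContext from the question
# (left commented out because it needs a running Spark context):
#
# df_new = sqlContext.createDataFrame(rdd.map(as_row), ['features'])
# df_new.show(truncate=False)
```

The trailing comma in (vector,) matters: without it, the parentheses are just grouping and the element stays a bare vector.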