I have a text file, say list.txt, that contains a list of source and destination URLs, one pair per line, in the following format:
google.de/2011/10/Extract-host link.de/2011/10/extact-host
facebook.de/2014/11/photos facebook.de/2014/11/name.jpg
community.cloudera.com/t5/ community.cloudera.com/t10/
facebook.de/2014/11/photos link.de/2011/10/extact-host
With the help of the post "How to create a VertexId in Apache Spark GraphX using a Long data type?", I tried to create the nodes and edges like this:
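import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

val test = sc.textFile("list.txt") // runs
val arrayForm = test.map(_.split("\t")) // runs
val nodes: RDD[(VertexId, Option[String])] = arrayForm.flatMap(array => array).
  map(s => (s.toLong, None)) // fails: the URLs are not numeric, so toLong throws
val edges: RDD[Edge[String]] = arrayForm.
  map(line => Edge(line(0), line(1), "")) // does not compile: Edge expects VertexId (Long), not String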
The problem is that I don't know how to create a VertexId, and likewise an Edge, from String data. Please let me know how to resolve this.
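
For reference, one possible way to get Long ids from the URL strings is to number the distinct URLs with zipWithUniqueId and then join those ids back onto the source/destination pairs. The following is only a minimal sketch under some assumptions: the object name UrlGraphSketch, the helper buildGraph, and the local[*] master are made up for illustration, and each line of list.txt is assumed to hold exactly two whitespace-separated URLs.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical object and helper names, used only for this illustration.
object UrlGraphSketch {

  // Builds a graph whose vertex attribute is the URL and whose edge attribute is an empty string.
  def buildGraph(pairs: RDD[(String, String)]): Graph[String, String] = {
    // Assign a unique Long to every distinct URL. The result is cached so that
    // the same ids are reused by both joins below instead of being recomputed.
    val ids: RDD[(String, VertexId)] = pairs
      .flatMap { case (src, dst) => Seq(src, dst) }
      .distinct()
      .zipWithUniqueId()
      .cache()

    // Vertices are (id, url) pairs.
    val nodes: RDD[(VertexId, String)] = ids.map(_.swap)

    // Replace the source URL by its id, then the destination URL by its id.
    val edges: RDD[Edge[String]] = pairs
      .join(ids)                                       // (src, (dst, srcId))
      .map { case (_, (dst, srcId)) => (dst, srcId) }
      .join(ids)                                       // (dst, (srcId, dstId))
      .map { case (_, (srcId, dstId)) => Edge(srcId, dstId, "") }

    Graph(nodes, edges)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, for trying the sketch outside a cluster.
    val conf = new SparkConf().setAppName("UrlGraphSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Split on any whitespace so that both tab- and space-separated files work.
    val pairs: RDD[(String, String)] = sc.textFile("list.txt")
      .map(_.trim.split("\\s+"))
      .filter(_.length == 2)                           // skip malformed lines
      .map(fields => (fields(0), fields(1)))

    val graph = buildGraph(pairs)
    println(s"vertices=${graph.vertices.count()}, edges=${graph.edges.count()}")

    sc.stop()
  }
}
```

Hashing each URL to a Long would avoid the two joins, but two different URLs could then collide on the same id; numbering the distinct URLs avoids that at the cost of the extra shuffles. In spark-shell, where sc already exists, the body of buildGraph can be pasted directly.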