Questions tagged [spark-graphx]

GraphX is a component in Apache Spark for graphs and graph-parallel computation.

At a high level, GraphX extends the Spark RDD by introducing a new Graph abstraction: a directed multigraph with properties attached to each vertex and edge.

To support graph computation, GraphX exposes a set of fundamental operators (e.g., subgraph, joinVertices, and aggregateMessages) as well as an optimized variant of the Pregel API.

In addition, GraphX includes a growing collection of graph algorithms and builders to simplify graph analytics tasks.
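As a rough illustration of the operators named above, here is a minimal sketch that builds a small property graph and applies `subgraph`, `aggregateMessages`, and `joinVertices` in turn. The vertex names, edge labels, the `labelWithInDegree` helper, and the local master setting are all made up for the example; it assumes `spark-core` and `spark-graphx` are on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object GraphXOperatorsSketch {

  // Hypothetical helper: returns each vertex name, annotated with its
  // in-degree when the vertex received at least one "follows" edge.
  def labelWithInDegree(sc: SparkContext): Map[VertexId, String] = {
    // Illustrative data: vertex property = name, edge property = relation.
    val vertices: RDD[(VertexId, String)] =
      sc.parallelize(Seq((1L, "alice"), (2L, "bob"), (3L, "carol")))
    val edges: RDD[Edge[String]] =
      sc.parallelize(Seq(
        Edge(1L, 2L, "follows"),
        Edge(2L, 3L, "follows"),
        Edge(3L, 1L, "blocks")))

    val graph: Graph[String, String] = Graph(vertices, edges)

    // subgraph: keep only the "follows" edges (all vertices are retained).
    val follows = graph.subgraph(epred = triplet => triplet.attr == "follows")

    // aggregateMessages: send 1 to each edge destination and sum per vertex.
    val inDegrees = follows.aggregateMessages[Int](ctx => ctx.sendToDst(1), _ + _)

    // joinVertices: update the property of vertices that received a message;
    // vertices without a message keep their original property.
    val labelled = follows.joinVertices(inDegrees) { (_, name, deg) =>
      s"$name (in-degree $deg)"
    }

    labelled.vertices.collect().toMap
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master is used purely for experimentation.
    val conf = new SparkConf().setAppName("graphx-operators-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try labelWithInDegree(sc).toSeq.sortBy(_._1).foreach(println)
    finally sc.stop()
  }
}
```

Note that `joinVertices` must return the same vertex property type it started with (here `String`); `outerJoinVertices` is the variant to reach for when the property type needs to change.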

487 questions
-1
votes
1 answer

Error while parallelizing a List[String] taken from Java and used in Scala to create an RDD

var g = Graphx.graph() // return an Object with 2 Lists : List edgeArray // List vertexArray
val vertexRDD: RDD[(String)] = sc.parallelize(g.vertexArray)
type mismatch; [error]…
testing
  • 183
  • 1
  • 2
  • 6
-1
votes
1 answer

Scala, get sum of multidimensional array

Using triangleCount from GraphX, I retrieve the following array: Array[(org.apache.spark.graphx.VertexId, Int)] = Array((1,1), (3,1), (2,1)). I'm trying to find a way to sum the second value of each element in the array. Thus the 1's in this…
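One possible way to get that sum, sketched without Spark so it runs on its own: project the second element of each pair and add the results. The `TriangleSum` object and `totalCount` helper are hypothetical names, and `VertexId` is redeclared locally (GraphX defines it as an alias for `Long`).

```scala
object TriangleSum {
  // GraphX defines VertexId as an alias for Long; it is redeclared here
  // only so the sketch compiles without Spark on the classpath.
  type VertexId = Long

  // Hypothetical helper: sums the per-vertex counts (the second tuple element).
  def totalCount(counts: Array[(VertexId, Int)]): Int =
    counts.map(_._2).sum

  def main(args: Array[String]): Unit = {
    val counts: Array[(VertexId, Int)] = Array((1L, 1), (3L, 1), (2L, 1))
    println(totalCount(counts)) // prints 3
  }
}
```

Because each triangle is counted once at each of its three vertices, dividing this sum by 3 gives the number of distinct triangles (1 for the array above).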
-2
votes
1 answer

Fast file writing in Scala?

So I have a Scala program that iterates through a graph and writes out data line by line to a text file. It is essentially an edge list file for use with GraphX. The biggest slowdown is actually creating this text file; we're talking maybe million…
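The excerpt does not show how the file is currently written, so the following is only a sketch of one common approach: stream the edges through a single `BufferedWriter` rather than opening or flushing the file for each line. The `EdgeListWriter` object, the `writeEdges` helper, and the "source destination" line layout are assumptions made for illustration.

```scala
import java.io.BufferedWriter
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

object EdgeListWriter {

  // Hypothetical helper: writes one "src dst" pair per line through a
  // single buffered writer, closing it even if iteration fails.
  def writeEdges(path: Path, edges: Iterator[(Long, Long)]): Unit = {
    val writer: BufferedWriter = Files.newBufferedWriter(path, StandardCharsets.UTF_8)
    try {
      edges.foreach { case (src, dst) =>
        writer.write(s"$src $dst")
        writer.newLine()
      }
    } finally {
      writer.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // Illustrative data: a chain of one million edges written to a temp file.
    val out = Files.createTempFile("edges", ".txt")
    writeEdges(out, Iterator.tabulate(1000000)(i => (i.toLong, (i + 1).toLong)))
    println(s"wrote ${Files.size(out)} bytes to $out")
  }
}
```

Taking an `Iterator` keeps memory use flat, since edges are written as they are produced instead of being collected first.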
-2
votes
1 answer

Need help filtering records according to a set of rules with Apache Spark

I need help with one of the use cases I have encountered: filtering records against a set of rules with Apache Spark. As the actual data has too many fields, for example, you can think of data like below (for simplicity giving data in JSON…
aks
  • 1,019
  • 1
  • 9
  • 17
-2
votes
1 answer

How to test the new Spark GraphX example SSSPExample.scala

A new GraphX example is posted on Apache's GitHub repository at https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/graphx/SSSPExample.scala. The code includes a line: bin/run-example…
welli
  • 24
  • 3
-2
votes
1 answer

GraphX cannot graph multiple edges and vertices?

My vertex type is: org.apache.spark.rdd.RDD[((Long, String), (Long, String), (Long, String))]. My edge type is: org.apache.spark.rdd.RDD[(org.apache.spark.graphx.Edge[String],org.apache.spark.graphx.Edge[String])]. When I tried to Graph(vertices,…
JY078
  • 393
  • 9
  • 21
-3
votes
1 answer

Implement a custom algorithm in GraphFrames

I want to run the biconnected graph algorithm on a graph using GraphFrames running with PySpark 2.3. I realized that all the built-in algorithms run under the hood with GraphX in Scala. Is there a way that I can implement the biconnected…