Questions tagged [spark-graphx]

GraphX is a component in Apache Spark for graphs and graph-parallel computation.

At a high level, GraphX extends the Spark RDD by introducing a new Graph abstraction: a directed multigraph with properties attached to each vertex and edge.

To support graph computation, GraphX exposes a set of fundamental operators (e.g., subgraph, joinVertices, and aggregateMessages) as well as an optimized variant of the Pregel API.

In addition, GraphX includes a growing collection of graph algorithms and builders to simplify graph analytics tasks.
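As a rough illustration of the abstraction and operators described above, the sketch below builds a small property graph and applies subgraph and aggregateMessages. It assumes the spark-graphx and spark-sql artifacts are on the classpath and a local master; the object name GraphXSketch, the vertex names, and the edge weights are made up for the example.

```scala
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical example object; names and data are illustrative only.
object GraphXSketch {
  def run(): Map[VertexId, Int] = {
    // Assumption: a local master is acceptable for this toy example.
    val spark = SparkSession.builder()
      .appName("graphx-sketch")
      .master("local[*]")
      .getOrCreate()
    try {
      val sc = spark.sparkContext

      // Vertex property: a name. Edge property: an integer weight.
      val vertices: RDD[(VertexId, String)] =
        sc.parallelize(Seq((1L, "alice"), (2L, "bob"), (3L, "carol")))
      val edges: RDD[Edge[Int]] =
        sc.parallelize(Seq(Edge(1L, 2L, 5), Edge(2L, 3L, 1), Edge(1L, 3L, 7)))

      val graph: Graph[String, Int] = Graph(vertices, edges)

      // subgraph: keep only the edges whose weight exceeds 2.
      val heavy = graph.subgraph(epred = triplet => triplet.attr > 2)
      println(s"heavy edges: ${heavy.edges.count()}")

      // aggregateMessages: each edge sends 1 to its destination,
      // and the messages are summed, which yields the in-degree.
      val inDegrees = graph.aggregateMessages[Int](ctx => ctx.sendToDst(1), _ + _)
      inDegrees.collect().toMap
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit =
    run().foreach { case (id, deg) => println(s"vertex $id has in-degree $deg") }
}
```

Vertices that receive no message (here, vertex 1) do not appear in the result of aggregateMessages, which is why the built-in degree helpers are often joined back onto the vertex set.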

487 questions
0
votes
1 answer

What is CSR indexing as a join optimization technique?

The Spark GraphX paper mentions CSR indexing in the following context: GraphX recasts system optimizations developed in the context of graph processing systems as join optimizations (e.g., CSR indexing, join elimination, and join-site…
WestCoastProjects
  • 58,982
  • 91
  • 316
  • 560
0
votes
1 answer

In GraphX - is there a way to pattern match on an EdgeTriplet?

It extends Edge, which is a case class, but EdgeTriplet isn't, and it doesn't implement unapply. I wonder if there is a way to do pattern matching, e.g. t:EdgeTriplet[Foo,Bar] match { case EdgeTriplet(src, dst, edgeAttr) => ... }
Eran Medan
  • 44,555
  • 61
  • 184
  • 276
-1
votes
1 answer

How to convert a DataFrame to an RDD in Zeppelin to use GraphX

I want to use GraphX in Zeppelin with my DataFrame. First, my DataFrame is as below.

+---+-----+---+
| id| name|age|
+---+-----+---+
|  a|   AA| 34|
|  b|   BB| 36|
|  c|   CC| 30|
|  d|   DD| 29|
|  e|   EE| 32|
|  f|   FF| 36|
|  g|   GG|…
-1
votes
1 answer

How to create a graph from an RDD/DF? Scala Spark

My RDD contains biological data: protein names and the degree of similarity between them. I would like to create a graph where vertices are proteins and edges represent the similarity values. Here's actually my…
amelie
  • 25
  • 7
-1
votes
1 answer

Using a broadcast variable or an RDD filter to compute the intersection of two nodes' neighbors?

I have used GraphLoader to load my graph into RDDs. Each node in the graph has some neighbors. The main goal is to find their intersection and do some parallel and distributed operations on them. Each node at first has attribute 1 and I have changed…
user12989234
-1
votes
1 answer

Write data to Neo4j using the native Spark API in Java

Is there any way to write data to a Neo4j database using the native Spark API in Java? Is something like GraphFrame (org.graphframes.GraphFrame) available in the spark-connector that corresponds to a Neo4j graph, and can we dump it into the database? Though we tried the native neo4j API in spark…
User_qwerty
  • 375
  • 1
  • 2
  • 10
-1
votes
1 answer

How to add the count of followers in the data set (most active user)

I have a dataset of a social network containing information about who follows whom. I need to find the most active user (for example, the user that does the most followings). My data set lines are like below …
AliSafari186
  • 113
  • 9
-1
votes
1 answer

Type mismatch when the type is already specified in Scala

I am trying to use the code below in Scala, using GraphX: val vertexRDD: RDD[(VertexId, String)] = graph.vertices.filter({ case (id, (str)) => { val c: Boolean = scala.util.Try(str.toInt) match { case Success(_) =>…
Litchy
  • 355
  • 1
  • 4
  • 18
-1
votes
1 answer

Explain the connection between Spark libraries, such as Spark SQL, MLlib, GraphX and Spark Streaming

Explain the connection between libraries, such as Spark SQL, MLlib, GraphX and Spark Streaming, and the core Spark platform
-1
votes
1 answer

scala.MatchError on a tuple

After processing some input data, I got an RDD[(String, String, Long)], say input, in hand. input: org.apache.spark.rdd.RDD[(String, String, Long)] = MapPartitionsRDD[9] at flatMap at :54 The string fields here represent vertices of graph…
S.K
  • 347
  • 1
  • 6
  • 20
-1
votes
1 answer

Spark `LiveListenerBus` Exception is freaking me out

I'm using AWS EMR clusters. The Spark version, from spark-submit --version, is 2.0.1: Branch HEAD Compiled…
avocado
  • 2,615
  • 3
  • 24
  • 43
-1
votes
1 answer

Spark Scala - Joining two arrays by VertexID

I have 2 arrays in the following format: scala> cPV.take(5) res18: Array[(org.apache.spark.graphx.VertexId, String)] = Array((-496366541,7804412), (183389035,11517829), (1300761459,36164965), (978932066,32135154), (370291237,40355685)) scala>…
SoakingHummer
  • 562
  • 1
  • 7
  • 25
-1
votes
1 answer

Converting an array of edges and vertices to a graph-friendly format

I have extracted the links between the Wikipedia pages in an RDD which has the following format: Array[(String, String)] = Array((AccessibleComputing,[Computer accessibility]), (Anarchism,[political philosophy, stateless…
ulrich
  • 3,547
  • 5
  • 35
  • 49
-1
votes
2 answers

Create Edges from Vertices with Spark

Let's say I have an array of vertices and I want to create edges from them in a way that each vertex connects to the next x vertices. x could have any integer value. Is there a way to do that with Spark? This is what I have with Scala so far: //array…
Al Jenssen
  • 655
  • 3
  • 9
  • 25
-1
votes
1 answer

IllegalAccessError in GraphX

I'm using Spark and GraphX on IntelliJ IDEA for the first time. I'm trying to create a graph and run queries on it, but I'm getting the following error: java.lang.IllegalAccessError: tried to access class org.apache.spark.util.collection.Sorter…
CMWasiq
  • 79
  • 10