
I have a large dataset of one million lines that describes a bipartite network. One side of the network represents the APPs, the other side represents the IPs.

The data format is:

1     1.1.1.1  

1     1.2.1.1

2     1.1.1.1

1     1.3.1.1

2     1.2.1.1 

What I would like to do is use igraph (Python interface) to project the data onto one side. For instance:

1.1.1.1  1.2.1.1 the weight = 2

1.1.1.1  1.3.1.1 the weight = 1

The weight is the number of APPs that two IPs have in common: 1.1.1.1 shares APP 1 and APP 2 with 1.2.1.1, so the weight of that pair is 2.

I also want to save the weights to a txt file.

I am a bit confused about how to handle this with igraph.

Can igraph handle this problem?

Thanks

GsM

1 Answer


It can be done with something like:

from igraph import Graph

def looks_like_ip_address(label):
    return "." in label

g = Graph.Read_Ncol("your-input-file.txt")
g.vs["type"] = [looks_like_ip_address(name) for name in g.vs["name"]]
one, other = g.bipartite_projection()
the_projection_you_need = other

Bipartite graphs are assumed to have a boolean type attribute in igraph and the functions simply assume that vertices with type = False belong to one side of the graph and vertices with type = True belong to the other. Therefore, we first load your graph and then set up the types manually with a simple rule of thumb: when the vertex label contains a dot, it is assumed to belong to the type = True side. Then we simply make both projections and discard one of them. You can get the weights from the projection with the following expression:

the_projection_you_need.es["weight"]

Update: depending on your graph, it might happen that one of the projections (the one that you don't need) is too large and it doesn't fit into memory while the other one would fit. g.bipartite_projection() has a which keyword argument that lets you specify which projection you need, so you can do this:

the_projection_you_need = g.bipartite_projection(which=True)
Tamás
  • I find that the weight is not the value I need. Maybe my example was ambiguous. I have modified it so that the weight is the number of common APPs that two IPs share: sharing app1 and app2 gives weight = 2, and sharing only app1 gives weight = 1. – GsM May 19 '14 at 07:08
  • It seems my dataset is too large (**67000 lines**). When I run the projection, it runs slowly, and 12 minutes later the Python code fails with this error: `Assertion failed: v->stor_begin != NULL, file d:\build\igraph\igraph-0.7.0-msvc9\src\vector.pmt, line 479 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.` How can I handle it? Thanks – GsM May 19 '14 at 08:56
  • To be honest, I don't understand why the weight of the edge between `1.1.1.1` and `1.2.1.1` should be `2` in your example. Clearly, `1.1.1.1` and `1.2.1.1` have only one app in common (`1`), so the weight of the edge should be 1. If this is not what you want, then this is not a bipartite projection and you'll have to do things "manually". – Tamás May 20 '14 at 13:50
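
For readers who want to try the "manual" route mentioned in the last comment, here is a minimal sketch in plain Python that does not use igraph at all. It assumes each input line holds an APP id and an IP separated by whitespace, as in the question, and it counts how many APPs every pair of IPs shares. The function names `project_onto_ips` and `write_weights` and the output layout (`ip_a ip_b weight` per line) are made up for illustration; they are not part of igraph or of the original answer.

```python
from collections import Counter, defaultdict
from itertools import combinations


def project_onto_ips(lines):
    """Return a Counter mapping (ip_a, ip_b) to the number of shared APPs."""
    # Group the IPs by APP; a set ignores repeated (APP, IP) lines.
    ips_by_app = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        app, ip = parts
        ips_by_app[app].add(ip)

    # Every pair of IPs attached to the same APP gains one unit of weight.
    weights = Counter()
    for ips in ips_by_app.values():
        for pair in combinations(sorted(ips), 2):
            weights[pair] += 1
    return weights


def write_weights(weights, path):
    """Write one 'ip_a ip_b weight' line per pair (hypothetical format)."""
    with open(path, "w") as out:
        for (ip_a, ip_b), weight in sorted(weights.items()):
            out.write(f"{ip_a} {ip_b} {weight}\n")


# The sample data from the question.
sample = ["1 1.1.1.1", "1 1.2.1.1", "2 1.1.1.1", "1 1.3.1.1", "2 1.2.1.1"]
weights = project_onto_ips(sample)
```

For a real file, something like `write_weights(project_onto_ips(open("your-input-file.txt")), "weights.txt")` would produce the txt output (the output file name is a placeholder). Note that an APP connected to k IPs contributes k(k-1)/2 pairs, so a few very popular APPs can still make the result large, whichever tool computes it.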