I have a networkx graph built from events spanning several months, and I want to see how a node's centrality score changes over time.
I plan to use several different centrality measures, so I wrote a function that selects a specific sender (I don't have many unique senders) and a specific date, builds a networkx graph, calculates the degree centrality, and adds the results to a dataframe.
But my code seems a bit convoluted and I'm not sure it works correctly, since my output:
  feature  degree        date
0       A     1.0  2017-01-02
1      35     1.0  2017-01-02
0       A     1.0  2017-01-20
1      18     1.0  2017-01-20
contains nodes 35 and 18, but I only want A. Is there a better way of doing this?
import numpy as np
import pandas as pd
from datetime import datetime
import networkx as nx
df = pd.DataFrame({'feature': ['A','B','A','B','A','B','A','B','A','B'],
                   'feature2': ['18','78','35','14','57','68','57','17','18','78'],
                   'timestamp': ['2017-01-20T11','2017-01-01T13',
                                 '2017-01-02T12','2017-02-01T13',
                                 '2017-03-01T14','2017-05-01T15',
                                 '2017-04-01T16','2017-04-01T17',
                                 '2017-12-01T17','2017-12-01T19']})
df['timestamp'] = pd.to_datetime(pd.Series(df['timestamp']))
df['date'], df['time']= df.timestamp.dt.date, df.timestamp.dt.time
def test(feature, date, name, col_name, nx_measure):
    # Keep only the rows for the chosen sender; copy to avoid SettingWithCopyWarning
    feature = df[df['feature'] == feature].copy()
    feature['date_str'] = feature['date'].astype(str)
    # Keep only the rows for the chosen date
    one_day = feature[feature['date_str'] == date]
    oneDay_graph = nx.from_pandas_edgelist(one_day, source='feature', target='feature2', create_using=nx.DiGraph)
    # One row per node in the graph, so the target nodes are included too
    scores = nx_measure(oneDay_graph)
    name = pd.DataFrame()
    name['feature'] = list(scores.keys())
    name[col_name] = list(scores.values())
    name['date'] = date
    return name
a = test('A', '2017-01-02', 'degree', 'degree', nx.degree_centrality)
b = test('A', '2017-01-20', 'degree', 'degree', nx.degree_centrality)
# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
pd.concat([a, b])
Desired output:
  feature  degree        date
0       A     1.0  2017-01-02
0       A     1.0  2017-01-20
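For illustration, here is a minimal sketch of one possible way to get that shape of output. It assumes that only the sender's own score is wanted per day, so it looks up the sender's entry in the dict returned by the centrality function instead of writing out every key. The helper name `centrality_over_time` and its parameters are hypothetical, and the explicit timestamp format is an assumption based on the sample data. This version covers every date on which the sender appears, not only the two dates shown above.

```python
import networkx as nx
import pandas as pd

df = pd.DataFrame({
    'feature': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B'],
    'feature2': ['18', '78', '35', '14', '57', '68', '57', '17', '18', '78'],
    'timestamp': ['2017-01-20T11', '2017-01-01T13',
                  '2017-01-02T12', '2017-02-01T13',
                  '2017-03-01T14', '2017-05-01T15',
                  '2017-04-01T16', '2017-04-01T17',
                  '2017-12-01T17', '2017-12-01T19'],
})
# Assumption: timestamps always look like 'YYYY-MM-DDTHH', so parse with an explicit format
df['date'] = pd.to_datetime(df['timestamp'], format='%Y-%m-%dT%H').dt.strftime('%Y-%m-%d')


def centrality_over_time(df, sender, col_name, nx_measure):
    """Hypothetical helper: one row per date, holding only the sender's own score."""
    rows = []
    sender_df = df[df['feature'] == sender]
    # groupby yields one sub-frame per date, with dates in sorted order
    for date, one_day in sender_df.groupby('date'):
        graph = nx.from_pandas_edgelist(one_day, source='feature', target='feature2',
                                        create_using=nx.DiGraph)
        scores = nx_measure(graph)  # dict mapping node -> score
        # Look up the sender only, instead of keeping every node in the graph
        rows.append({'feature': sender, col_name: scores[sender], 'date': date})
    return pd.DataFrame(rows, columns=['feature', col_name, 'date'])


result = centrality_over_time(df, 'A', 'degree', nx.degree_centrality)
print(result)
```

Any networkx function that returns a dict keyed by node (for example `nx.in_degree_centrality`) could be passed as `nx_measure` in the same way.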