I'm doing a Social Network Analysis of this dataset with NetworkX and I want to run a degree and closeness centrality analysis.
The graph I obtain is undirected (graph.is_directed() returns False). Node 1 has degree 593, but its row has 0 as Target and Weight in the edge data. Since the graph is undirected, I expected node 1 to be the most central node, but it isn't, and I don't understand why (the dataset is based on the animated series The Simpsons, so I think I know which character should be the most central).
I'm afraid the analysis ends up unreliable this way.
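To make concrete what I mean by degree and closeness centrality, here is a minimal sketch on a made-up toy graph. The node ids and weights are invented for illustration and are not taken from the Simpsons data; it only shows the NetworkX calls I'm relying on.

```python
import networkx as nx

# Toy undirected graph; node ids and weights are made up for illustration.
toy = nx.Graph()
toy.add_weighted_edges_from(
    [(1, 2, 5), (1, 3, 2), (1, 4, 1), (2, 3, 1)],
    weight="Weight",
)

degree = dict(toy.degree())                    # number of neighbours per node
strength = dict(toy.degree(weight="Weight"))   # sum of incident edge weights
deg_cent = nx.degree_centrality(toy)           # degree divided by (n - 1)
clo_cent = nx.closeness_centrality(toy)        # based on unweighted shortest paths

# Node with the highest degree centrality.
most_central = max(deg_cent, key=deg_cent.get)
print(most_central, degree, strength, clo_cent)
```

Note that degree counts neighbours, while the weighted degree sums the Weight attribute, so the two rankings can differ.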
---edit
This is the code where I import and create the graph.
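import pandas as pd
import networkx as nx

# Load the node and edge tables.
dfN = pd.read_csv('gdrive/My Drive/SNA/simpsonsNodes.csv')
dfE = pd.read_csv('gdrive/My Drive/SNA/simpsonsEdges.csv')
# Outer-join nodes to the edges where they appear as Source, then drop the unused columns.
df = pd.merge(left=dfN, right=dfE, left_on="Id", right_on='Source', how='outer').drop(['Id', 'Type'], axis=1)
df.columns = ['Name', 'Source', 'Target', 'Weight']
# Replace NaN with 0 so the columns can be cast to int.
df = df.fillna(0)
df = df.astype({'Source': 'int', 'Target': 'int', 'Weight': 'int'})
df
# Build an undirected graph with Weight as an edge attribute.
graph = nx.from_pandas_edgelist(df, 'Source', 'Target', edge_attr='Weight', create_using=nx.Graph())
print(graph.is_directed())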
I dropped two columns: Id because it is redundant, and Type because it's "Undirected" for every row, so I don't really need it.
I used df = df.fillna(0) because the node with Id = 1 had NaN for Source, Target and Weight, so I replaced those values with 0 and then used df.loc[0,"Source"]=1 to set 1 as its Source.
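Since my CSV files aren't attached, here is a small reproduction of the same merge and fillna(0) steps. It is only a sketch: the two DataFrames are hypothetical stand-ins with invented ids, labels and weights, and I'm assuming the node table has a label column next to Id. Node 1 appears only as a Target, as in my data.

```python
import pandas as pd
import networkx as nx

# Hypothetical stand-ins for simpsonsNodes.csv and simpsonsEdges.csv.
nodes = pd.DataFrame({"Id": [1, 2, 3], "Label": ["A", "B", "C"]})
edges = pd.DataFrame({
    "Source": [2, 3],
    "Target": [1, 1],
    "Type": ["Undirected", "Undirected"],
    "Weight": [4, 7],
})

# Same merge as in my code: node 1 never appears as a Source,
# so the outer join leaves its Source/Target/Weight as NaN.
df = pd.merge(left=nodes, right=edges, left_on="Id", right_on="Source",
              how="outer").drop(["Id", "Type"], axis=1)
df.columns = ["Name", "Source", "Target", "Weight"]
nan_rows = int(df["Source"].isna().sum())

# fillna(0) turns that NaN row into Source=0, Target=0, Weight=0.
df = df.fillna(0)
df = df.astype({"Source": "int", "Target": "int", "Weight": "int"})

g = nx.from_pandas_edgelist(df, "Source", "Target", edge_attr="Weight",
                            create_using=nx.Graph())
print(sorted(g.nodes()), g.degree(1))
```

In this toy version the filled row becomes an edge involving node 0, which is not a real node in the node table, while node 1 still gets its degree from the rows where it is the Target. I haven't confirmed that this is what skews the centrality in my real graph.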