I'm trying to analyze goal-scoring networks in hockey. I have data for the player who scored the goal and the player who assisted on that goal. My issue is that some goals do not have an assist, so I'm not sure what I should do in those situations.
An example of my data looks like this:
scorer <- c("Lidstrom", "Yzerman", "Fedorov", "Yzerman", "Shanahan")
assister <- c("", "Lidstrom", "Yzerman", "Shanahan", "Lidstrom")
mydata <- data.frame(scorer, assister)
Printing mydata gives:
    scorer assister
1 Lidstrom
2  Yzerman Lidstrom
3  Fedorov  Yzerman
4  Yzerman Shanahan
5 Shanahan Lidstrom
When I'm dealing with unassisted goals, does it make sense to act as if the assist goes to the scorer?
For example:
    scorer assister
1 Lidstrom Lidstrom
2  Yzerman Lidstrom
3  Fedorov  Yzerman
4  Yzerman Shanahan
5 Shanahan Lidstrom
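In code, this first option could be sketched as below. It assumes a blank string marks an unassisted goal; the names self_assist and blank are only for illustration.

```r
scorer <- c("Lidstrom", "Yzerman", "Fedorov", "Yzerman", "Shanahan")
assister <- c("", "Lidstrom", "Yzerman", "Shanahan", "Lidstrom")
mydata <- data.frame(scorer, assister, stringsAsFactors = FALSE)

# Option 1 (illustrative name): credit the scorer with the assist,
# so each unassisted goal becomes a self-loop once the graph is built
self_assist <- mydata
blank <- self_assist$assister == ""
self_assist$assister[blank] <- self_assist$scorer[blank]
```

With this treatment no extra vertex is added to the network; the unassisted goal shows up only as an edge from Lidstrom back to Lidstrom.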
Or does it make sense to create a placeholder name such as "UNASSISTED" for unassisted goals?
For example:
    scorer   assister
1 Lidstrom UNASSISTED
2  Yzerman   Lidstrom
3  Fedorov    Yzerman
4  Yzerman   Shanahan
5 Shanahan   Lidstrom
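This second option could be sketched the same way, again assuming a blank string marks an unassisted goal; the name with_placeholder is only for illustration.

```r
scorer <- c("Lidstrom", "Yzerman", "Fedorov", "Yzerman", "Shanahan")
assister <- c("", "Lidstrom", "Yzerman", "Shanahan", "Lidstrom")
mydata <- data.frame(scorer, assister, stringsAsFactors = FALSE)

# Option 2 (illustrative name): send every unassisted goal to one
# shared placeholder, which becomes an extra vertex in the graph
with_placeholder <- mydata
with_placeholder$assister[with_placeholder$assister == ""] <- "UNASSISTED"
```

Here the placeholder is a vertex in its own right, so it would receive a PageRank score alongside the real players.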
Here's the rest of my code for computing PageRank, assuming that something is filled in for the blank assister entries:
library(igraph)
library(dplyr)
# Each row becomes a directed edge from scorer to assister
my_network <- mydata %>%
  as.matrix() %>%
  graph_from_edgelist(directed = TRUE)
# PageRank score for every vertex in the network
page_rank(my_network, directed = TRUE)$vector
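To see how the two treatments affect the result, both can be run through the same pipeline. This is only a rough sketch: pagerank_for is a hypothetical helper, not part of igraph, and it assumes blank strings mark unassisted goals.

```r
library(igraph)

scorer <- c("Lidstrom", "Yzerman", "Fedorov", "Yzerman", "Shanahan")
assister <- c("", "Lidstrom", "Yzerman", "Shanahan", "Lidstrom")

# Hypothetical helper: build the directed scorer -> assister graph
# and return the PageRank score of every vertex
pagerank_for <- function(assister_filled) {
  edges <- cbind(scorer, assister_filled)
  g <- graph_from_edgelist(edges, directed = TRUE)
  page_rank(g, directed = TRUE)$vector
}

unassisted <- assister == ""
# Option 1: the unassisted goal becomes a self-loop on the scorer
pr_self <- pagerank_for(ifelse(unassisted, scorer, assister))
# Option 2: the unassisted goal points at a shared placeholder vertex
pr_placeholder <- pagerank_for(ifelse(unassisted, "UNASSISTED", assister))
```

pr_self has one score per player, while pr_placeholder also has an entry named "UNASSISTED" that would need to be dropped or ignored when ranking players.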
I can't just remove the unassisted goals, so I'm looking for a solution that doesn't violate any major graph theory principles (which I'm not very familiar with). Any ideas?