I have a matrix, which represents mobility between various jobs:
jobdat <- matrix(c(
295, 20, 0, 0, 0, 5, 7,
45, 3309, 15, 0, 0, 0, 3,
23, 221, 2029, 5, 0, 0, 0,
0, 0, 10, 100, 8, 0, 3,
0, 0, 0, 0, 109, 4, 4,
0, 0, 0, 0, 4, 375, 38,
0, 18, 0, 0, 4, 26, 260),
nrow = 7, ncol = 7, byrow = TRUE,
dimnames = list(c("job 1","job 2","job 3","job 4","job 5","job 6","job 7"),
c("job 1","job 2","job 3","job 4","job 5","job 6","job 7")))
This is treated as a directed, weighted adjacency matrix in a social network analysis. The direction of the network is from rows to columns: So mobility is defined as going from a job-row to a job-column. The diagonal is meaningful, since it is possible to change to the same job in another firm.
For part of my analysis I want to select a submatrix which consists of job 1, job 5 and job 7:
work.list <- c(1,5,7)
jobpick_wrong <- jobdat[work.list,work.list]
however, this only gives the direct ties between these three jobs. What I need is this:
jobpick_right <- matrix(c(
295, 20, 0, 5, 7,
45, 3309, 0, 0, 3,
0, 0, 109, 4, 4,
0, 0, 4, 375, 38,
0, 18, 4, 26, 260),
nrow = 5, ncol = 5, byrow = TRUE,
dimnames = list(c("job 1","job 2","job 5","job 6","job 7"),
c("job 1","job 2","job 5","job 6","job 7")))
Here, job 2 and 6 are also included, since these two jobs also have direct ties to either job 1, 5 or 7. While job 3 and 4 are excluded, because they do not have any ties to job 1, 5 or 7.
I'm not sure how to go about this. Maybe I have to transform it into an igraph-object in order to get anywhere?
net <- graph.adjacency(jobdat, mode = "directed", weighted = TRUE)
and then maybe use the ego/neighborhood-function, also from the igraph package? But how I'm really not sure how. Or if this is the best way to go about it.
Thank you for your time,
Emil Begtrup-Bright
Augmented question:
The answer by aichao is perfect for the question asked, although it turns out that another step is needed. When the work.list has been created that include the jobs that has ties to the three "jobs of interest", job 1, 5, 7 in this example. Then, with real data, the amount of clutter makes another step desirable: That only the direct ties to and from the three jobs of interest are kept, while ties between other jobs are set to zero.
The data above does not depict this in a very good way, so I have created a very version of the above to demonstrate this:
jobdat <- matrix(c(
1, 0, 1, 0, 0, 0, 0,
1, 1, 1, 0, 0, 0, 0,
1, 1, 1, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 1
),
nrow = 7, ncol = 7, byrow = TRUE,
dimnames = list(c("job 1","job 2","job 3","job 4","job 5","job 6","job 7"),
c("job 1","job 2","job 3","job 4","job 5","job 6","job 7")))
by using aichaos solution:
work.list <- sort(unique(unlist(lapply(work.list, function(x) which(jobdat[x,] != 0)))))
then we get this:
jobdat[work.list,work.list]
# job 1 job 2 job 3 job 5 job 7
# job 1 1 0 1 0 0
# job 2 1 1 1 0 0
# job 3 1 1 1 0 0
# job 5 0 0 0 1 0
# job 7 0 0 0 0 1
However, the ties between job 2 and job 3 are irrelevant, and only serves to obscure the ties of interest.
jobdat.result <- matrix(c(
1, 0, 1, 0, 0,
1, 1, 0, 0, 0,
1, 0, 1, 0, 0,
0, 0, 0, 1, 0,
0, 0, 0, 0, 1
),
nrow = 5, ncol = 5, byrow = TRUE,
dimnames = list(c("job 1","job 2","job 3","job 5","job 7"),
c("job 1","job 2","job 3","job 5","job 7")))
in job.dat.result, the tie between job 3 and job 2 have been removed, both row-wise and col-wise, but the ties between these two jobs and the three jobs of interest are kept. Ideally, it should be possible to choose wether the diagonal of job 2 and job 3 should also be zero. But most likely, I'll set the diagonal to zero, for all jobs, so this is not required. But would be nice, if nothing else then for me to understand the logic of this at a higher level.
What I am trying to achieve, among other things, is circlegrams like this:
So simplicity in the number of ties is important. The diagram is reproduced like this:
library(circlize)
segmentcircle <- jobdat
diag(segmentcircle) <- 0
df.c <- get.data.frame(graph.adjacency(segmentcircle,weighted=TRUE))
colour <- brewer.pal(ncol(segmentcircle),"Set1")
chordDiagram(x = df.c,
grid.col = colour,
transparency = 0.2,
directional = 1, symmetric=FALSE,
direction.type = c("arrows", "diffHeight"), diffHeight = -0.065,
link.arr.type = "big.arrow",
# self.link=1
link.sort = TRUE, link.largest.ontop = TRUE,
link.border="black",
# link.lwd = 2,
# link.lty = 2
)