Good Day All,
I'd like to visualize my dataset, but I am struggling even to name the type of visualization I need!
I want to look at the overlapping sets between a Reference Standard and three new tests.
The Reference Standard has a binary outcome (R and S).
Each of the three new tests can have more than two outcomes (R, S, Fail, Indeterminate).
So a portion of my data look like this (as an R data frame):
Subject <- c("11-0001","11-0002","11-0003","11-0004","11-0005","11-0007","11-0008","11-0010","11-0011","11-0012","11-0013","11-0014","11-0015","11-0016","11-0017","11-0018","11-0019","11-0020","11-0021","11-0022","11-0023","11-0025","11-0027","11-0029","11-0030","11-0035","11-0036","11-0037","11-0038","11-0039","11-0040","11-0041","11-0043","11-0044","11-0045","11-0046","11-0047","11-0048","11-0050","11-0052","11-0053","11-0054","11-0055","11-0056","11-0058","11-0059","11-0061","11-0062","11-0063","11-0064","11-0065","11-0066","11-0068","11-0069","11-0070","11-0071","11-0072","11-0074","11-0075")
ReferenceStandard <- c("R","R","R","R","R","R","R","R","R","R","R","R","R","R","R","R","R","R","R","R","R","R","R","R","R","R","R","S","R","R","R","R","R","R","R","R","R","R","R","R","R","S","R","R","S","R","R","R","R","S","R","R","R","R","S","R","S","R","S")
TestA<- c("R","R","R","R","R","R","S","I","R","R","R","I","R","R","R","R","I","R","R","R","R","R","R","R","R","R","S","S","R","R","R","R","R","R","R","R","R","R","R","R","R","S","I","R","I","R","R","I","R","S","R","R","R","R","S","I","S","R","S")
TestB <- c("R","R","R","R","R","R","S","I","R","R","R","I","R","R","R","R","R","R","R","R","R","R","R","R","R","R","R","I","R","R","R","R","R","R","R","R","R","R","R","R","R","S","R","R","S","R","R","R","R","I","R","R","R","R","S","I","S","R","S")
TestC <-c("R","R","R","R","R","R","R","R","R","R","R","ND","R","R","R","R","R","R","R","R","R","R","R","R","R","R","S","S","R","R","R","R","R","R","R","R","R","R","R","R","R","S","R","R","S","R","R","R","R","S","R","R","R","R","S","ND","S","R","S")
mydata <- data.frame(subject=Subject, ReferenceStandard=ReferenceStandard, TestA=TestA, TestB=TestB, TestC=TestC)
and so on (I have 1000 subjects) ...
So while the sensitivity/specificity of each individual test against the Reference Standard is very similar, Cochran's Q test and McNemar's test show significant differences between the tests.
Right now, my hypothesis is that each test is failing differently: TestA might fail on one set of subjects while TestB fails on a different set. In aggregate the counts are close enough that sensitivity/specificity look similar, but the paired-sample statistical tests show that the tests are not behaving the same way. So I want to inspect this visually.
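To make the hypothesis concrete, here is a rough sketch of the tabulation I have in mind, using only base R. The `toy` data frame and its values are made up for illustration (same column layout as `mydata`, not my real results), and I am assuming that any call that differs from the reference, including I and ND, counts as a miss.

```r
# Toy data in the same shape as mydata (illustrative values only)
toy <- data.frame(
  subject           = paste0("s", 1:8),
  ReferenceStandard = c("R", "R", "R", "R",  "R", "S", "S", "S"),
  TestA             = c("R", "S", "I", "R",  "R", "S", "S", "I"),
  TestB             = c("R", "S", "I", "R",  "I", "S", "I", "S"),
  TestC             = c("R", "R", "R", "ND", "R", "S", "S", "S"),
  stringsAsFactors  = FALSE
)
tests <- c("TestA", "TestB", "TestC")

# Logical matrix: TRUE where a test's call differs from the reference
# (assumption: I and ND are treated as misses, same as a wrong R/S call)
miss <- sapply(tests, function(t) toy[[t]] != toy$ReferenceStandard)

# One label per subject naming the tests that missed it, e.g. "TestA+TestB"
pattern <- apply(miss, 1, function(r) {
  if (any(r)) paste(tests[r], collapse = "+") else "none"
})

# How many subjects fall into each miss pattern
pattern_counts <- sort(table(pattern), decreasing = TRUE)
print(pattern_counts)
```

If my hypothesis is right, most of the misses on the real data should land in the single-test patterns rather than in the combined ones.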
However, I am really stuck on what to even call this (because the new tests have four categories).
I have looked into Euler Diagrams, but I do not believe they can support what I need.
I have thought that what I can do is make two sets of Euler Diagrams.
- From the perspective of Reference=R. The overlap of Ref & TestA is the subjects where both are R, and the non-overlap is the subjects where Reference=R but TestA != R.
- Repeating the above from the perspective of Reference=S.
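For the Reference=R perspective, the sets that would feed such a diagram could be built roughly like this. Again, the `toy` data frame is made up for illustration, and the set names (`agree_sets`, `all_three`, `missed_by_all`) are just names I picked.

```r
# Toy data in the same shape as mydata (illustrative values only)
toy <- data.frame(
  subject           = paste0("s", 1:8),
  ReferenceStandard = c("R", "R", "R", "R",  "R", "S", "S", "S"),
  TestA             = c("R", "S", "I", "R",  "R", "S", "S", "I"),
  TestB             = c("R", "S", "I", "R",  "I", "S", "I", "S"),
  TestC             = c("R", "R", "R", "ND", "R", "S", "S", "S"),
  stringsAsFactors  = FALSE
)
tests <- c("TestA", "TestB", "TestC")

# Restrict to subjects the Reference Standard calls R
ref_r <- toy[toy$ReferenceStandard == "R", ]

# For each test, the subjects it also calls R (the "overlap" with the reference)
agree_sets <- lapply(setNames(tests, tests), function(t) {
  ref_r$subject[ref_r[[t]] == "R"]
})

# Subjects every test gets right, and subjects no test gets right
all_three     <- Reduce(intersect, agree_sets)
missed_by_all <- setdiff(ref_r$subject, Reduce(union, agree_sets))

print(agree_sets)
print(all_three)
print(missed_by_all)
```

Swapping `"R"` for `"S"` in the two filters gives the Reference=S version. What this loses is which non-R outcome each test gave, which is exactly the part I am not sure how to show.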
I have also thought about an odd heatmap where the Y-axis is all 1000 subjects and the X-axis is the four columns in the same order as my data above, with each cell color coded by outcome. Depending on how I sort the Y-axis, I can show off different aspects of the data. The problem is that it is really hard to pick out patterns with that kind of graphic.
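Here is a minimal base-R sketch of that heatmap idea, sorted so that subjects with identical call patterns sit next to each other. The `toy` data and the color choices are made up for illustration, and I am using the outcome codes as they appear in my snippet above (R, S, I, ND).

```r
# Toy data in the same shape as mydata (illustrative values only)
toy <- data.frame(
  subject           = paste0("s", 1:8),
  ReferenceStandard = c("R", "R", "R", "R",  "R", "S", "S", "S"),
  TestA             = c("R", "S", "I", "R",  "R", "S", "S", "I"),
  TestB             = c("R", "S", "I", "R",  "I", "S", "I", "S"),
  TestC             = c("R", "R", "R", "ND", "R", "S", "S", "S"),
  stringsAsFactors  = FALSE
)
cols <- c("ReferenceStandard", "TestA", "TestB", "TestC")
lev  <- c("R", "S", "I", "ND")                         # outcome codes, in legend order
pal  <- c("firebrick", "steelblue", "grey60", "black") # arbitrary colors, one per code

# Encode each outcome as an integer 1..4 so image() can color it
codes <- sapply(cols, function(cn) as.integer(factor(toy[[cn]], levels = lev)))

# Sort subjects by reference call, then TestA, TestB, TestC
ord   <- do.call(order, unname(as.data.frame(codes)))
codes <- codes[ord, , drop = FALSE]

# image() expects z with dim length(x) by length(y), hence the transpose
image(x = seq_along(cols), y = seq_len(nrow(codes)), z = t(codes),
      col = pal, breaks = 0.5 + 0:4, axes = FALSE,
      xlab = "", ylab = "subject (sorted by call pattern)")
axis(1, at = seq_along(cols), labels = cols)
legend("top", inset = c(0, -0.12), legend = lev, fill = pal,
       horiz = TRUE, xpd = TRUE, bty = "n")
```

Sorting by pattern helps a little on a handful of rows, but with 1000 subjects the rare patterns become very thin bands.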
Any other ideas? Links to other visualizations would be really appreciated!