
Running the R script below on the dataset "patients", an event log of patients visiting a clinic and receiving treatment, creates the trace explorer shown in the snapshot below, with its tooltip displayed. The code under "#Script for Frequency Percentage" computes the frequency percentage of each trace in the column "af_percent". I want to replace "label = value" in the ggplot command below with the corresponding frequency percentage of each trace. Please help.

library(bupaR)        # provides traces()
library(data.table)   # provides melt() for the data.table returned by cSplit()
library(splitstackshape)
library(scales)
library(ggplot2)
library(plotly)
tr <- data.frame(traces(patients, output_traces = TRUE, output_cases = FALSE))
# Split the comma-separated trace string into one column per activity
tr.df <- cSplit(tr, "trace", ",")
tr.df$af_percent <- percent(tr.df$absolute_frequency / sum(tr.df$absolute_frequency))
# Keep column 1 plus every column from the 4th onward, however many there are
pos <- c(1, 4:ncol(tr.df))
tr.df <- tr.df[, ..pos]
tr.df <- melt(tr.df, id.vars = c("trace_id", "af_percent"))
mp1 <- ggplot(data = tr.df, aes(x = variable, y = trace_id, fill = value,
                                label = value)) +
  geom_tile(colour = "white") +
  geom_text(colour = "white", fontface = "bold", size = 2) +
  scale_fill_discrete(na.value = "transparent") +
  theme(legend.position = "none")
ggplotly(mp1)

#Script for Frequency Percentage
tr <- traces(patients, output_traces = TRUE, output_cases = FALSE)
tr$af_percent <- percent(tr$absolute_frequency / sum(tr$absolute_frequency))

[Screenshot: trace explorer plot with the tooltip displayed]

Ashmin Kaul

1 Answer

library(bupaR)
library(plotly) 

traces(patients, output_traces = TRUE, output_cases = FALSE)

p <- trace_explorer(patients, type = "frequent", coverage = 1) +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank())

gg <- ggplotly(p)

# Now manipulate the plotly object in the same way that 
# we would manipulate any other plotly object
# See https://plotly-book.cpsievert.me/extending-ggplotly.html
layout(gg, margin=list(l=50, b=50), legend=list(x=1.05))

enter image description here
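
For the tooltip part of the question, one possible approach is to map a custom string to the `text` aesthetic and pass `tooltip = "text"` to `ggplotly()`. The following is only a minimal sketch: the small data frame is a hypothetical stand-in for the melted `tr.df` (its activity names and percentages are made up, not real output from `patients`), and the `tooltip` column is a name introduced here for illustration.

```r
library(ggplot2)
library(plotly)

# Hypothetical stand-in for the melted trace table (made-up values)
tr.df <- data.frame(
  trace_id   = rep(1:2, each = 3),
  variable   = rep(c("trace_1", "trace_2", "trace_3"), times = 2),
  value      = c("Registration", "Triage", "Check-out",
                 "Registration", "Triage", NA),
  af_percent = rep(c("75%", "25%"), each = 3),
  stringsAsFactors = FALSE
)

# Build one tooltip string per tile; <br> is a line break in plotly hover text
tr.df$tooltip <- paste0("Activity: ", tr.df$value,
                        "<br>Frequency: ", tr.df$af_percent)

mp1 <- ggplot(tr.df, aes(x = variable, y = trace_id, fill = value,
                         text = tooltip)) +
  geom_tile(colour = "white") +
  scale_fill_discrete(na.value = "transparent") +
  theme(legend.position = "none")

# tooltip = "text" restricts the hover box to the custom text aesthetic
gg <- ggplotly(mp1, tooltip = "text")
gg
```

If the percentage should instead be printed inside the tiles, using `label = af_percent` together with `geom_text()` would draw it on every tile of the trace, which may or may not be what is wanted.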

Marco Sandri
  • @AshminKaul If you change your question, a reader will find my answer incomprehensible and puzzling. The right way is to add text with more details below the first question. – Marco Sandri Nov 14 '17 at 10:28
  • Sure, I'll ensure that, kindly help. – Ashmin Kaul Nov 14 '17 at 10:30
  • The first package it is, upvoted your answer Marco, please help. Also, I want to make the code scalable for larger data, if you can help me with removing hard coding, it would be great help. – Ashmin Kaul Nov 14 '17 at 10:41
  • Added the scales package, also the 7th and 8th line of code is hard coded, I request you to help me make the solution scalable and dynamic – Ashmin Kaul Nov 14 '17 at 10:48
  • tr.df <- tr.df[,c(1,4:9)] and tr.df <- melt(tr.df, id.vars = "trace_id") use hardcoded values. I wish to make them dynamic so that when a trace comes up with more activities, I don't have to select the columns by hand. My final query is about the tooltip, of course. – Ashmin Kaul Nov 14 '17 at 10:56
  • Ok, there is a tooltip in the snapshot above for your reference, I need the first three labels and finally the frequency % of each trace which I find using the last two statements in the script above. – Ashmin Kaul Nov 14 '17 at 11:08
  • You know the best Sir, Whatever you suggest. – Ashmin Kaul Nov 14 '17 at 11:15
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/158951/discussion-between-ashmin-kaul-and-marco-sandri). – Ashmin Kaul Nov 14 '17 at 11:25