I have records of data by account (say 400 unique records). Each record has three different indications of the indicated premium. For each record, I am concerned with how the indications compare to each other. In some cases the three indications may all be relatively in line, while in others they will be volatile and very different. Each record also has a state associated with it.

I am wondering whether there is a nice way to visualize the by-record differences between the 3 indications, and also whether there is a nice way to visualize the indication differences by state (perhaps in a map-like view in R?).

I have plotted the distribution of each individual indication using density plots, which has been helpful, but here I am asking about a visualization of the differences between 1, 2, or all 3 indications for each record. Is what I am asking possible?

Thank you so much.

ActuaryGuy
  • More people will be able to help you if you provide sample data. Further, are you interested in all possible differences (e.g. 1 vs. 2, 1 vs. 3, 2 vs. 3)? Does order matter? – JasonAizkalns Jun 10 '15 at 17:22
  • So what is your question? "..if there is a nice way to visualize..."? If so, the answer is: yes, there is a nice way. Please post example data and we'll be able to help you more. – pogibas Jun 10 '15 at 17:22
  • Forgive me, but what is the best way to post sample data? Also, @JasonAizkalns, yes, I believe I am interested in all possible differences (unless there is some other way to do it, correlation perhaps?), and order doesn't matter. I also think % differences between indications will be more helpful than nominal differences. – ActuaryGuy Jun 10 '15 at 19:05
  • @ActuaryGuy see my answer below for how to generate fake data or look into the [wakefield package](https://github.com/trinker/wakefield) – JasonAizkalns Jun 10 '15 at 19:20

1 Answer

Perhaps something like this is what you're after, but this would be easier if you provided sample data and were more specific about the exact question you are asking:

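```r
library(ggplot2)
library(dplyr)
library(tidyr)

set.seed(1)  # make the fake data reproducible

# Fake data: 400 records, 50 states repeated 8 times, 3 random indications
df <- data.frame(id = 1:400,
                 state = rep(state.abb, length.out = 400),
                 ind1 = rnorm(400),
                 ind2 = rnorm(400),
                 ind3 = rnorm(400))

df %>%
  # All pairwise differences between the three indications
  mutate(diff_1_2 = ind1 - ind2,
         diff_1_3 = ind1 - ind3,
         diff_2_3 = ind2 - ind3) %>%
  # Reshape to long format: one row per record and metric
  gather(metric, value, -c(id, state)) %>%
  # Keep only the difference columns
  filter(metric %in% c("diff_1_2", "diff_1_3", "diff_2_3")) %>%
  # One panel per state, one boxplot per pairwise difference
  ggplot(aes(x = metric, y = value)) +
  geom_boxplot() +
  facet_wrap(~ state)
```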
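
Since the comments mention that % differences may be more helpful than nominal ones, here is a minimal sketch of the same idea using percentage differences. It rests on a few assumptions that are not in the question: the premiums are faked as strictly positive values with `rlnorm()`, each difference is expressed relative to the mean of the pair (so order does not matter much and values stay within ±200%), and the `pct_*` column names are made up for illustration.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

set.seed(1)  # reproducible fake data

# Assumption: premiums are positive, so they are faked with rlnorm()
df <- data.frame(id = 1:400,
                 state = rep(state.abb, times = 8),
                 ind1 = rlnorm(400, meanlog = 7, sdlog = 0.3),
                 ind2 = rlnorm(400, meanlog = 7, sdlog = 0.3),
                 ind3 = rlnorm(400, meanlog = 7, sdlog = 0.3))

pct_df <- df %>%
  # Percentage difference of each pair, relative to the pair's mean
  mutate(pct_1_2 = 100 * (ind1 - ind2) / ((ind1 + ind2) / 2),
         pct_1_3 = 100 * (ind1 - ind3) / ((ind1 + ind3) / 2),
         pct_2_3 = 100 * (ind2 - ind3) / ((ind2 + ind3) / 2)) %>%
  # Keep identifiers and the percentage columns only, then go long
  select(id, state, starts_with("pct_")) %>%
  gather(metric, value, -c(id, state))

p_pct <- ggplot(pct_df, aes(x = metric, y = value)) +
  geom_boxplot() +
  facet_wrap(~ state) +
  labs(x = NULL, y = "% difference (relative to pair mean)")

print(p_pct)
```

If one indication is the natural baseline, dividing by that indication instead of the pair mean may be easier to interpret.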
JasonAizkalns
  • I was able to modify this code to get what I needed. I am still pretty new to R, so this was a good learning experience for me; thank you very much! However, I do have an additional question: is it possible to add, to each of the state plots, the number of observations underlying that plot? – ActuaryGuy Jun 12 '15 at 13:48
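
One possible way to show the number of observations per state, offered as a sketch rather than a definitive answer: build a label that combines the state with its record count and facet on that label instead of on `state`. The `state_label` column and the fake data (states drawn at random so the counts differ) are assumptions for illustration.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

set.seed(1)  # reproducible fake data

# Assumption: states are sampled at random so that counts vary by state
df <- data.frame(id = 1:400,
                 state = sample(state.abb, 400, replace = TRUE),
                 ind1 = rnorm(400),
                 ind2 = rnorm(400),
                 ind3 = rnorm(400))

plot_df <- df %>%
  mutate(diff_1_2 = ind1 - ind2,
         diff_1_3 = ind1 - ind3,
         diff_2_3 = ind2 - ind3) %>%
  # Count records per state before reshaping, and store it in the label
  group_by(state) %>%
  mutate(state_label = paste0(state, " (n = ", n(), ")")) %>%
  ungroup() %>%
  select(id, state_label, starts_with("diff_")) %>%
  gather(metric, value, -c(id, state_label))

p_n <- ggplot(plot_df, aes(x = metric, y = value)) +
  geom_boxplot() +
  facet_wrap(~ state_label)  # panel titles now read e.g. "TX (n = ...)"

print(p_n)
```

Counting before the reshape matters: after `gather()` each record appears three times, so counting rows at that point would triple the numbers.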