I am creating convergence plots for an RDS dataset in R and would like to label these plots. Right now, my x-axis is "# of observations" and my y-axis is the RDS estimates, and the plot itself is labeled "Convergence plot for clientcondom=1". Is there a way to change this? See code below:

convergence.plot(site[[1]], 'clientcondom', est.func=RDS.I.estimates)
convergence.plot(site[[2]], 'clientcondom', est.func=RDS.I.estimates)
convergence.plot(site[[3]], 'clientcondom', est.func=RDS.I.estimates)

Also, is there a way to combine these plots into a single plot? I have three sites, and it would be nice to view them side by side. Thank you very much for your responses!

Sumatra

sumatraed
  • Where does `convergence.plot` come from? – Heroka Nov 04 '15 at 09:58
  • @Heroka, as far as I understand it, the convergence plot is from the RDS data frame, the library is RDS. I am very new to R so figuring these out myself--any suggestions would be appreciated, thank you! – sumatraed Nov 04 '15 at 10:53
  • I did some rooting around for you in the source-code of convergence.plot, but I don't see any solution that doesn't involve modifying the code yourself, or re-using parts of it to create your own plots... – Heroka Nov 04 '15 at 11:08

1 Answer

Let's start with your second question: If your data is in a single rds.data.frame, you can easily plot your three sites on the same chart.

First, let's create a list like the one you have:

set.seed(1)
site1_df <-as.rds.data.frame(data.frame(id=1:10,recruiter.id=c("seed",5,7,3,8,2,10,9,1,6),network.size.variable=c(5,4,8,9,1,2,6,7,10,3),site1=as.factor(sample(c("blue", "red"), 10, replace = TRUE))))
site2_df <-as.rds.data.frame(data.frame(id=1:10,recruiter.id=c("seed",5,7,3,8,2,10,9,1,6),network.size.variable=c(5,4,8,9,1,2,6,7,10,3),site2=as.factor(sample(c("blue", "red"), 10, replace = TRUE))))
site3_df <-as.rds.data.frame(data.frame(id=1:10,recruiter.id=c("seed",5,7,3,8,2,10,9,1,6),network.size.variable=c(5,4,8,9,1,2,6,7,10,3),site3=as.factor(sample(c("blue", "red"), 10, replace = TRUE))))
sites_list <-list(site1_df,site2_df,site3_df)

Then reduce the list to a single rds.data.frame:

sites_df <-Reduce(function(...) merge(..., all=TRUE), sites_list)
sites_rds <-as.rds.data.frame(sites_df)
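
To see what the Reduce/merge step does on its own, here is a minimal base-R sketch. It uses plain data frames rather than rds.data.frame objects, and the names s1, s2, s3 and merged are made up for illustration:

```r
# Three toy data frames sharing an "id" column, each with its own site column.
# These are ordinary data frames, not rds.data.frame objects.
s1 <- data.frame(id = 1:3, site1 = c("blue", "red", "blue"))
s2 <- data.frame(id = 1:3, site2 = c("red", "red", "blue"))
s3 <- data.frame(id = 1:3, site3 = c("blue", "blue", "red"))

# Reduce() merges the first two frames, then merges that result with the third.
# merge() joins on the columns the frames have in common (here only "id");
# all = TRUE keeps rows that are missing from one side.
merged <- Reduce(function(x, y) merge(x, y, all = TRUE), list(s1, s2, s3))

names(merged)  # "id" "site1" "site2" "site3"
nrow(merged)   # 3
```

The result has one row per id and one column per site, which is the shape needed to pass several outcome variables in a single call.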

You can then plot all sites in the same chart:

convergence.plot(sites_rds,c("site1","site2","site3"), est.func=RDS.I.estimates)

For your first question, you have to create your own convergence.plot2 function based on convergence.plot. The convergence.plot2 function below uses "Y-axis Estimate" for the ylab, "X-Axis # of Observations" for the xlab, and "Your title Convergence plot of (site1,2,3)" for the title. Change those to your liking, and make sure to change every occurrence: ylab and xlab are set separately in each of the two branches of make.plot, and the else branch below still uses "Estimated <variable>" and "# of Observations".

convergence.plot2 <-function (rds.data, outcome.variable, est.func = RDS.II.estimates,
    as.factor = FALSE, ...)
{
    if (as.factor) {
        for (o in outcome.variable) {
            rds.data[[o]] <- as.factor(rds.data[[o]])
        }
    }
    f <- function(v) cumulative.estimate(rds.data, v, est.func,
        ...)
    ests <- lapply(outcome.variable, f)
    make.plot <- function(i) {
        Var1 <- Var2 <- value <- NULL
        e <- ests[[i]]
        nm <- outcome.variable[i]
        if (ncol(e) == 2) {
            e1 <- e[, 2, drop = FALSE]
            attr(e1, "n") <- attr(e, "n")
            e <- e1
            nm <- paste0(outcome.variable[i], "=", colnames(e)[1])
            rds.data[[outcome.variable[i]]] <- as.factor(rds.data[[outcome.variable[i]]])
        }
        if (ncol(e) > 1) {
            rownames(e) <- attr(e, "n")
            dat <- melt(e)
            datl <- melt(e[nrow(e), , drop = FALSE])
            p <- ggplot(dat) + geom_line(aes(x = Var1, color = as.factor(Var2),
                y = value)) + scale_color_hue(nm) + ylab("Y-axis Estimate") +
                xlab("X-Axis # of Observations") + scale_y_continuous(limits = c(0,
                1)) + theme_bw()
            p <- p + geom_hline(data = datl, aes(yintercept = value,
                color = as.factor(Var2)), linetype = 2, alpha = 0.5)
            p
        }
        else {
            dat <- data.frame(value = e[, 1], Var1 = attr(e,
                "n"))
            datl <- dat[nrow(dat), , drop = FALSE]
            v <- rds.data[[outcome.variable[i]]]
            rng <- if (!is.numeric(v))
                c(0, 1)
            else range(v, na.rm = TRUE)
            p <- ggplot(dat) + geom_line(aes(x = Var1, y = value)) +
                ylab(paste("Estimated", nm)) + xlab("# of Observations") +
                scale_y_continuous(limits = rng) + theme_bw()
            p <- p + geom_hline(data = datl, aes(yintercept = value),
                linetype = 2, alpha = 0.5)
            p
        }
        return(p + ggtitle(paste("Your title Convergence plot of", nm)))
    }
    plots <- lapply(1:length(outcome.variable), make.plot)
    do.call(.grid.arrange_RDS, plots)
}

It is then important to set this new function's environment to the RDS namespace. The function calls RDS internals (for example .grid.arrange_RDS) that can only be found in the package's own namespace.

environment(convergence.plot2) <- asNamespace('RDS')
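
As a rough illustration of why that assignment matters, the sketch below imitates the situation with a private environment in place of a package namespace. The names ns, .hidden_helper and f are hypothetical and have nothing to do with RDS itself:

```r
# A private environment standing in for a package namespace (hypothetical).
ns <- new.env()
ns$.hidden_helper <- function(x) x * 2

# f refers to a helper that is not visible from where f was defined.
f <- function(x) .hidden_helper(x)

# Calling f now fails with "could not find function"; try() captures the error.
failed_before <- inherits(try(f(21), silent = TRUE), "try-error")

# Point f at the private environment so name lookup starts there.
environment(f) <- ns
result_after <- f(21)

failed_before  # TRUE
result_after   # 42
```

The same mechanism is at work in the assignment above: once the function's environment is the RDS namespace, names used in its body are looked up there first.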

Calling convergence.plot2:

convergence.plot2(sites_rds,c("site1","site2","site3"), est.func=RDS.I.estimates)

Pierre Lapointe