0

Imagine there are two columns, one for p-value and the other representing slope. I want to find a way to plot only the slope data points that have a significant p-value. Here is my code:

print("State the file name (include .csv)")
filename <- readline()
file <- read.csv(filename)

print ("Only include trials with p-value < .05? (enter yes or no)")
pval_filter <- readline()
if (pval_filter == "yes"){
   i <- 0
   count <- 0
   filtered <- NULL
   while (i > length(file$pval)){
      if (file$pval[i] < .05){
         filtered[count] <- i
         count <- count + 1
      }
      i <- i + 1
   }

   x <- 0
   while (x != -1){
      print("State the variable to be plotted")
      temp_var <- readline()
      counter <- 0
      var <- NULL
      while (counter > length(filtered)){
         var[counter] = file [, temp_var][filtered[counter]]
         counter <- counter + 1
         }

      print ("State the title of the histogram")
      title <- readline()
      hist(var, main = title, xlab = var)
      print("Enter -1 to exit or any other number to plot another variable")
      x <- readline()
    }
}
Paul Hiemstra
kevin ko

4 Answers

4

Isn't this much shorter, while producing roughly the same result?

df = read.csv('file.csv')
df = df[df$pval < 0.05,]
hist(df$value)

This should at least get you started.

Some remarks regarding the code:

  • You use the names of existing R functions (var, file) as object names. They are not reserved words, so R allows it, but masking built-in functions this way is a bad idea.
  • If you want the program to work with user input, you need to check that input before doing anything with it.
  • There is no need to loop explicitly over the rows of a data.frame. R is vectorized (see how I subsetted df above). Explicit loops like these look like Fortran and are not needed in R.
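As a rough sketch of the last two remarks, the helper below checks a user-supplied column name before using it and replaces the loops with a vectorized subset. The function name `significant_values`, its arguments, and the example data are assumptions made for illustration; only the `pval` column name comes from the question.

```r
# Hypothetical helper: return the values of `value_col` for the rows whose
# p-value (column `p_col`) is below `alpha`.
significant_values <- function(df, value_col, p_col = "pval", alpha = 0.05) {
  # Check the user input before doing anything with it
  missing_cols <- setdiff(c(value_col, p_col), names(df))
  if (length(missing_cols) > 0) {
    stop("Column(s) not found: ", paste(missing_cols, collapse = ", "))
  }
  # Vectorized subset; which() also drops rows where the p-value is NA
  keep <- which(df[[p_col]] < alpha)
  df[[value_col]][keep]
}

# Made-up example data standing in for the CSV file
trials <- data.frame(pval  = c(0.01, 0.20, 0.03, NA),
                     slope = c(1.5, -0.4, 2.1, 0.7))
sig_slopes <- significant_values(trials, "slope")
```

Passing `sig_slopes` to `hist()` would then draw the histogram, and a mistyped column name stops with a readable error instead of plotting nothing.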
Paul Hiemstra
2

It is hard to tell exactly what you want. It is best if an example is reproducible (we can copy, paste, and run it; we don't have your data, so that is not possible here) and minimal (a lot of your code does not seem to deal with your question).

But here are some pointers that may help.

First, the readline function has a prompt argument that will give you better looking interaction than the print statements.

If all your data is in a data frame with columns p and b for p-value and slope then you can include only the b values for which p<=0.05 with simple subsetting like:

hist( mydataframe$b[ mydataframe$p <= 0.05 ] )

or

with( mydataframe, hist(b[p<=0.05]) )
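To make this reproducible without the original file, here is a minimal sketch that uses simulated data. The simulated values, the seed, and the fallback title are assumptions for illustration; only the column names p and b follow the text above.

```r
set.seed(1)  # fixed seed so the simulated data is the same on every run

# Simulated stand-in for the real data: p-values and slopes
mydataframe <- data.frame(p = runif(100), b = rnorm(100))

# Both forms select the slopes whose p-value is at most 0.05
sig_b_dollar <- mydataframe$b[mydataframe$p <= 0.05]
sig_b_with   <- with(mydataframe, b[p <= 0.05])

# readline() takes the prompt directly; it only waits for input
# in an interactive session, so fall back to a fixed title otherwise
plot_title <- if (interactive()) {
  readline(prompt = "Title of the histogram: ")
} else {
  "Slopes with p <= 0.05"
}
```

Calling `hist(sig_b_with, main = plot_title)` then plots only the selected slopes.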

Is that enough to answer your question?

Greg Snow
1

Given that data = cbind(slopes, pvalues) (so ncol(data) == 2)

Like this:

plot(data[data[ ,2] < 0.05 , ])

Explanation:

data[ ,2] < 0.05 will return a vector of TRUE/FALSE values, one for each row of data.

so then you will get:

data[c(TRUE, FALSE....), ]  

From there on, only the data will be selected where it says TRUE.

You will thus plot only those x's and y's where the pvalue is lower than 0.05.
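A small worked example may make the logical indexing concrete. The numbers below are made up for illustration, and `drop = FALSE` is an addition that is not in the one-liner above; it keeps the result a matrix even when only one row passes the filter.

```r
# Made-up slopes and p-values
slopes  <- c(0.8, -1.2, 0.3, 2.0)
pvalues <- c(0.01, 0.30, 0.04, 0.60)
data <- cbind(slopes, pvalues)

# One logical value per row: TRUE FALSE TRUE FALSE
keep <- data[, 2] < 0.05

# Keep only the rows marked TRUE; drop = FALSE preserves the matrix shape
significant <- data[keep, , drop = FALSE]
```

Here `significant` holds rows 1 and 3, and `plot(significant)` would draw those two points with slopes on the x-axis and p-values on the y-axis.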

PascalVKooten
0

Here is code to plot only the slope data points with a significant p-value, assuming the column names in the file are pval and slope.

# Prompt for the file name on the terminal
filename <- readline("Enter the name of the file that has p-values and slopes (include .csv): ")
# Read the data from that file
file     <- read.csv(filename, header = TRUE)

# Ask whether to filter by p-value and read the user's answer
pval_filter <- readline("Only include trials with p-value < .05? (enter yes or no) ")

if (tolower(pval_filter) == "yes"){
   # Keep only the rows whose p-value is below the 0.05 significance level
   file.filtered <- file[file$pval < 0.05, ]

   # Get the title of the histogram to be drawn for the filtered slopes
   hist.title <- readline("State the title of the histogram: ")
   # Draw the histogram of the slopes with that title
   #     las = 2 draws the axis labels perpendicular to the axis,
   #     so that they do not overlap and are easy to read.
   hist(file.filtered$slope, main = hist.title, xlab = "Slope", ylab = "Frequency", las = 2)
}

Hope this helps.

Kumar