
I want to read an R file or script, modify the name of the external data file being read and export the modified R code into a new R file or script. Other than the name of the data file being read (and the name of the new R file) I want the two R scripts to be identical.

I can come close, except that I cannot figure out how to retain the blank lines I use for readability and error reduction.

Here is the original R file being read. Some of the code in this file is nonsensical, but that is irrelevant: this code does not need to run.

# apple.pie.all.purpose.flour.arsc.Jun23.2013.r

library(my.library)

aa <- 10       # aa
bb <- c(1:7)   # bb

my.data = convert.txt("../applepieallpurposeflour.txt",

group.df = data.frame(recipe = 
           c("recipe1", "recipe2", "recipe3", "recipe4", "recipe5")),
             covariates = c(paste( "temp", seq_along(1:aa), sep="")))

ingredient <- c('all purpose flour')

function(make.pie){ make a pie }

Here is the R code I use to read the above file, modify it and export the result. This code runs, and it is the only code that needs to run to achieve the desired result. The one problem is that the format of the new R script does not match that of the original exactly: blank lines present in the original R script are missing from the new R script.

setwd('c:/users/mmiller21/simple r programs/')

# define new fruit

new.fruit <- 'peach'

# read flour file for original fruit

flour   <- readLines('apple.pie.all.purpose.flour.arsc.Jun23.2013.r')

# create new file name

output.flour <- paste(new.fruit, ".pie.all.purpose.flour.arsc.Jun23.2013.r", sep="")

# add new file name

flour.a <- gsub("# apple.pie.all.purpose.flour.arsc.Jun23.2013.r", 
                paste("# ", output.flour, sep=""), flour)

# add line to read new data file

cat(file = output.flour, 
           gsub( "my.data = convert.txt\\(\"../applepieallpurposeflour.txt", 
           paste("my.data = convert.txt\\(\"../", new.fruit, "pieallpurposeflour.txt", 
           sep=""), flour.a),
           sep=c("","\n"), fill = TRUE
)

Here is the resulting new R script:

# peach.pie.all.purpose.flour.arsc.Jun23.2013.r
library(my.library)
aa <- 10       # aa
bb <- c(1:7)   # bb

my.data = convert.txt("../peachpieallpurposeflour.txt",
           group.df = data.frame(recipe = 
           c("recipe1", "recipe2", "recipe3", "recipe4", "recipe5")),
           covariates = c(paste( "temp", seq_along(1:aa), sep="")))
ingredient <- c('all purpose flour')
function(make.pie){ make a pie }

There is one blank line in the newly-created R file, but how can I insert all of the blank lines present in the original R script? Thank you for any advice.

EDIT: I cannot seem to duplicate the blank lines here on StackOverflow. They seem to be deleted automatically. StackOverflow is even deleting the indentation I am using and I cannot seem to replace it. Sorry about this. Automatic deletion of blank lines and indentation is problematic when the issue at hand is specifically about formatting. I cannot seem to fix the post to display the R code as formatted in my script. However, the code does display correctly when I am actively editing the post.

EDIT: June 27, 2013: The deletion of empty rows and indentation in the code for the original R file and in the code for the middle R file appears to be associated with my laptop rather than with StackOverflow. When I view this post and my answers on my office desktop the format is correct. When I view this post and my answers with my laptop the empty rows and indentation are gone. Perhaps my laptop monitor is malfunctioning. Sorry about assuming initially that the problem was with StackOverflow.

Mark Miller
  • How do you plan to run your final scripts, the ones like `apple.pie.all.purpose.flour.arsc.Jun23.2013.r`? I am asking because there are much better ways to approach your problem. If you run the scripts from the command line, then you can make the input file an argument (see `commandArgs`). If you source your script with `source` from within R, then you could just wrap your code into a function that takes the filename as an argument. – flodel Jun 25 '13 at 01:19
  • As @flodel says, there are many better ways to handle this than text munging. I'd go the function route myself. – Hong Ooi Jun 25 '13 at 01:42
  • @flodel The R script will be submitted to a cluster or supercomputer. If I determine how to create the function you suggest I will post it here. Regardless of how the files are created, my goal is that both R files be identical except for the two differences I highlight above. – Mark Miller Jun 25 '13 at 06:11
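
A minimal sketch of the command-line route mentioned in the first comment, for readers who want to see its shape. It is not the asker's code: the helper name `get.data.file` and the script name `pie.r` in the usage note are hypothetical, and the default path is simply the data file from the question.

```r
# Hypothetical helper: take the data file name from the command-line
# arguments, falling back to a default when no argument is supplied.
get.data.file <- function(args = commandArgs(trailingOnly = TRUE),
                          default = "../applepieallpurposeflour.txt") {
  if (length(args) >= 1) args[1] else default
}

data.file <- get.data.file()
cat("Data file to read:", data.file, "\n")

# The rest of the script would then use data.file instead of a
# hard-coded name, e.g. (convert.txt is the asker's own function):
# my.data <- convert.txt(data.file, ...)
```

With this layout a single script could be submitted once per data file, for example `Rscript pie.r ../peachpieallpurposeflour.txt`, so no copies of the script need to be generated. Whether that fits a given cluster's job submission system is an assumption to check.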

2 Answers


Here is a function that will create a new R file for every combination of two variables. Sorry the formatting of the code below is not better. The code does run and does work as intended, provided the name of the original R file ends in ".arsc.Jun26.2013.r" rather than the ".arsc.Jun23.2013.r" used in the original post:

setwd('c:/users/mmiller21/simple r programs/')

# define fruits of interest

fruits <- c('apple', 'pumpkin', 'pecan')

# define ingredients of interest

ingredients <- c('all.purpose.flour', 'sugar', 'ground.cinnamon')

# define every combination of fruit and ingredient

fruits.and.ingredients <- expand.grid(fruits, ingredients)

old.fruit <- as.character(rep('apple', nrow(fruits.and.ingredients)))
old.ingredient  <- as.character(rep('all.purpose.flour', nrow(fruits.and.ingredients)))

fruits.and.ingredients2 <- cbind(old.fruit , as.character(fruits.and.ingredients[,1]),
                           old.ingredient, as.character(fruits.and.ingredients[,2]))

colnames(fruits.and.ingredients2) <- c('old.fruit', 'new.fruit', 'old.ingredient', 'new.ingredient')


# begin function

make.pie <- function(old.fruit, new.fruit, old.ingredient, new.ingredient) {

  # ingredient names with the dots removed (used in the data file names)
  new.ingredient2 <- gsub('\\.',  '', new.ingredient)
  old.ingredient2 <- gsub('\\.',  '', old.ingredient)

  # ingredient names with dots replaced by spaces (used in the script body)
  new.ingredient3 <- gsub('\\.', ' ', new.ingredient)
  old.ingredient3 <- gsub('\\.', ' ', old.ingredient)

  # file name
  old.file <- paste(old.fruit, ".pie.", old.ingredient, ".arsc.Jun26.2013.r", sep="")
  new.file <- paste(new.fruit, ".pie.", new.ingredient, ".arsc.Jun26.2013.r", sep="")

  # read original fruit and original ingredient
  flour   <- readLines(old.file)

  # add new file name (fixed = TRUE matches the old file name literally)
  flour.a <- gsub(paste("# ", old.file, sep=""),
                  paste("# ", new.file, sep=""), flour, fixed = TRUE)

  # read new data file (plain paste; print() is not needed to build the strings)
  old.data.file <- paste("my.data = convert.txt(\"../", old.fruit, "pie", old.ingredient2, ".txt\",", sep="")
  new.data.file <- paste("my.data = convert.txt(\"../", new.fruit, "pie", new.ingredient2, ".txt\",", sep="")

  flour.b <- ifelse(flour.a == old.data.file, new.data.file, flour.a)

  # update the ingredient line
  flour.c <- ifelse(flour.b == paste('ingredient <- c(\'', old.ingredient3, '\')', sep=""),
                               paste('ingredient <- c(\'', new.ingredient3, '\')', sep=""), flour.b)

  cat(flour.c, file = new.file, sep=c("\n"))

}

apply(fruits.and.ingredients2, 1, function(x) make.pie(x[1], x[2], x[3], x[4]))
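
As a rough sketch of the name-building step only, the combinations and the derived file names can also be kept together in one data frame. The column names `fruit`, `ingredient`, `new.file` and `data.file` are illustrative choices, not part of the code above, and the naming patterns are copied from it.

```r
fruits      <- c("apple", "pumpkin", "pecan")
ingredients <- c("all.purpose.flour", "sugar", "ground.cinnamon")

# One row per fruit/ingredient combination, kept as character (not factor)
combos <- expand.grid(fruit = fruits, ingredient = ingredients,
                      stringsAsFactors = FALSE)

# Name of the R script to generate for each combination
combos$new.file <- paste0(combos$fruit, ".pie.", combos$ingredient,
                          ".arsc.Jun26.2013.r")

# Name of the data file: same pattern, with the dots removed from the ingredient
combos$data.file <- paste0("../", combos$fruit, "pie",
                           gsub(".", "", combos$ingredient, fixed = TRUE),
                           ".txt")

print(combos)
```

Each row could then be handed to `make.pie`, for instance with `mapply`, in place of the `cbind` and `apply` steps. That wiring is left out here because it depends on the original script file being present.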
Mark Miller

Here is one solution that reproduces the original R script (except for the two desired changes) while also preserving the formatting of that original R script in the new R script.

setwd('c:/users/mmiller21/simple r programs/')

new.fruit <- 'peach'

flour   <- readLines('apple.pie.all.purpose.flour.arsc.Jun23.2013.r')

output.flour <- paste(new.fruit, ".pie.all.purpose.flour.arsc.Jun23.2013.r", sep="")

flour.a <- gsub("# apple.pie.all.purpose.flour.arsc.Jun23.2013.r", 
                paste("# ", output.flour, sep=""), flour)

flour.b <- gsub( "my.data = convert.txt\\(\"../applepieallpurposeflour.txt", 
paste("my.data = convert.txt\\(\"../", new.fruit, "pieallpurposeflour.txt", sep=""), flour.a)

for(i in seq_along(flour.b)) {

  # overwrite the file on the first line, append on every later line
  if(i == 1) cat(flour.b[i], file = output.flour, sep=c("\n"), fill=TRUE               )
  if(i >  1) cat(flour.b[i], file = output.flour, sep=c("\n"), fill=TRUE, append = TRUE)

}

Again, I apologize for my inability to format the above R code in a readable way. I have never encountered this problem on StackOverflow and do not know the solution. Regardless, the above R script solves the problem I described in the original post.

To see the formatting of the original R script you will have to click the edit button under the original post.

EDIT: June 25, 2013

I do not know what I was doing differently yesterday, but today I found that the following simple cat statement, in place of the for-loop immediately above, creates the new R script while preserving the formatting of the original R script.

cat(flour.b, file = output.flour, sep=c("\n"))
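
For comparison, here is a small self-contained sketch of the same round trip using `writeLines`, which writes one line per vector element, so an empty string comes back out as a blank line. It is an alternative to the `cat` call, not the method used above. The temporary directory and the five-line stand-in for the original script are assumptions made so that the example can run on its own.

```r
# Work in a temporary directory so no real files are touched
tmp      <- tempdir()
old.file <- file.path(tmp, "apple.pie.all.purpose.flour.arsc.Jun23.2013.r")
new.file <- file.path(tmp, "peach.pie.all.purpose.flour.arsc.Jun23.2013.r")

# Small stand-in for the original script, including blank lines
writeLines(c("# apple.pie.all.purpose.flour.arsc.Jun23.2013.r",
             "",
             "library(my.library)",
             "",
             "my.data = convert.txt(\"../applepieallpurposeflour.txt\","),
           old.file)

flour <- readLines(old.file)

# fixed = TRUE treats the pattern as literal text, so "." and "(" need no escaping
flour <- sub("# apple.pie", "# peach.pie", flour, fixed = TRUE)
flour <- sub("../applepieallpurposeflour.txt",
             "../peachpieallpurposeflour.txt", flour, fixed = TRUE)

# One output line per element; "" elements stay blank lines
writeLines(flour, new.file)

result <- readLines(new.file)
print(result)
```

Reading the new file back gives the same number of lines as the original, with the blank lines in the same positions.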
Mark Miller