1

I'm new to R, but have been following a pretty good guide to help along the way.

I've imported sales data for the past 36 months, used tidyr to move the Date columns to rows, and then cleaned the Date character string to a Month Date, Year.

My ask is this: I need to update this file every month and will have a new, rolling 36 month period. Is there an improvement to the function I created to automate the new periods (ex. Current 36 month period starts September 1, 2013, Next 36 month period starts October 1, 2013, etc.).

Any ideas on how to do this without having to cut and paste?

Below is my code

# import rolling periods
rolling <- read.csv("h:/R/BI with R/Rolling Periods.csv", header=T)

# remove last four columns
rolling <- rolling[,-c(42:45)]

# gather columns to rows with tidyr
require(tidyr)
rolling <- gather(rolling, "Date", "CSE", 6:41)

head(rolling)

# list from Date column
unique(rolling$Date)

# clean Date variable
clean = function(col) {
  col = gsub('X1.Month.9.1.2013.thru.9.30.2013..Case.Equivs', 'September 1, 2013', col, fixed = TRUE)
  col = gsub('X1.Month.10.1.2013.thru.10.31.2013..Case.Equivs', 'October 1, 2013', col, fixed = TRUE)
  col = gsub('X1.Month.11.1.2013.thru.11.30.2013..Case.Equivs', 'November 1, 2013', col, fixed = TRUE)
  col = gsub('X1.Month.12.1.2013.thru.12.31.2013..Case.Equivs', 'December 1, 2013', col, fixed = TRUE)
  col = gsub('X1.Month.1.1.2014.thru.1.31.2014..Case.Equivs', 'January 1, 2014', col, fixed = TRUE)
  col = gsub('X1.Month.2.1.2014.thru.2.28.2014..Case.Equivs', 'February 1, 2014', col, fixed = TRUE)
  col = gsub('X1.Month.3.1.2014.thru.3.31.2014..Case.Equivs', 'March 1, 2014', col, fixed = TRUE)
  col = gsub('X1.Month.4.1.2014.thru.4.30.2014..Case.Equivs', 'April 1, 2014', col, fixed = TRUE)
  col = gsub('X1.Month.5.1.2014.thru.5.31.2014..Case.Equivs', 'May 1, 2014', col, fixed = TRUE)
  col = gsub('X1.Month.6.1.2014.thru.6.30.2014..Case.Equivs', 'June 1, 2014', col, fixed = TRUE)
  col = gsub('X1.Month.7.1.2014.thru.7.31.2014..Case.Equivs', 'July 1, 2014', col, fixed = TRUE)
  col = gsub('X1.Month.8.1.2014.thru.8.31.2014..Case.Equivs', 'August 1, 2014', col, fixed = TRUE)
  col = gsub('X1.Month.9.1.2014.thru.9.30.2014..Case.Equivs', 'September 1, 2014', col, fixed = TRUE)
  col = gsub('X1.Month.10.1.2014.thru.10.31.2014..Case.Equivs', 'October 1, 2014', col, fixed = TRUE)
  col = gsub('X1.Month.11.1.2014.thru.11.30.2014..Case.Equivs', 'November 1, 2014', col, fixed = TRUE)
  col = gsub('X1.Month.12.1.2014.thru.12.31.2014..Case.Equivs', 'December 1, 2014', col, fixed = TRUE)
  col = gsub('X1.Month.1.1.2015.thru.1.31.2015..Case.Equivs', 'January 1, 2015', col, fixed = TRUE)
  col = gsub('X1.Month.2.1.2015.thru.2.28.2015..Case.Equivs', 'February 1, 2015', col, fixed = TRUE)
  col = gsub('X1.Month.3.1.2015.thru.3.31.2015..Case.Equivs', 'March 1, 2015', col, fixed = TRUE)
  col = gsub('X1.Month.4.1.2015.thru.4.30.2015..Case.Equivs', 'April 1, 2015', col, fixed = TRUE)
  col = gsub('X1.Month.5.1.2015.thru.5.31.2015..Case.Equivs', 'May 1, 2015', col, fixed = TRUE)
  col = gsub('X1.Month.6.1.2015.thru.6.30.2015..Case.Equivs', 'June 1, 2015', col, fixed = TRUE)
  col = gsub('X1.Month.7.1.2015.thru.7.31.2015..Case.Equivs', 'July 1, 2015', col, fixed = TRUE)
  col = gsub('X1.Month.8.1.2015.thru.8.31.2015..Case.Equivs', 'August 1, 2015', col, fixed = TRUE)
  col = gsub('X1.Month.9.1.2015.thru.9.30.2015..Case.Equivs', 'September 1, 2015', col, fixed = TRUE)
  col = gsub('X1.Month.10.1.2015.thru.10.31.2015..Case.Equivs', 'October 1, 2015', col, fixed = TRUE)
  col = gsub('X1.Month.11.1.2015.thru.11.30.2015..Case.Equivs', 'November 1, 2015', col, fixed = TRUE)
  col = gsub('X1.Month.12.1.2015.thru.12.31.2015..Case.Equivs', 'December 1, 2015', col, fixed = TRUE)
  col = gsub('X1.Month.1.1.2016.thru.1.31.2016..Case.Equivs', 'January 1, 2016', col, fixed = TRUE)
  col = gsub('X1.Month.2.1.2016.thru.2.28.2016..Case.Equivs', 'February 1, 2016', col, fixed = TRUE)
  col = gsub('X1.Month.3.1.2016.thru.3.31.2016..Case.Equivs', 'March 1, 2016', col, fixed = TRUE)
  col = gsub('X1.Month.4.1.2016.thru.4.30.2016..Case.Equivs', 'April 1, 2016', col, fixed = TRUE)
  col = gsub('X1.Month.5.1.2016.thru.5.31.2016..Case.Equivs', 'May 1, 2016', col, fixed = TRUE)
  col = gsub('X1.Month.6.1.2016.thru.6.30.2016..Case.Equivs', 'June 1, 2016', col, fixed = TRUE)
  col = gsub('X1.Month.7.1.2016.thru.7.31.2016..Case.Equivs', 'July 1, 2016', col, fixed = TRUE)
  col = gsub('X1.Month.8.1.2016.thru.8.31.2016..Case.Equivs', 'August 1, 2016', col, fixed = TRUE)
  return(col)
}

rolling$Date = clean(rolling$Date)
head(rolling)

# convert to Date object
rolling$Date = as.Date(rolling$Date, format="%B %d, %Y")
head(rolling)
Sherif B
  • 11
  • 3
  • Parse the date: `rolling$Date_clean <- as.Date(rolling$Date, 'X1.Month.%m.%d.%Y')`. You can then use `format(rolling$Date_clean, '%B%e, %Y')` if you _really_ need a character date. – alistaire Sep 30 '16 at 01:17

2 Answers2

1

It seems you are only looking at the first part of your rolling date to determine the representative date (the date you are replacing). In that case you could take advantage of backreferences in replacement when using gsub and regular expression

# Function might not be need now
clean  <- function(col) {
    col <- gsub("X1\\.Month\\.(\\d{1,2})\\.(\\d{1,2})\\.(\\d{4}).thru.*","\\1 \\2 \\3", col)
}

head(rolling)

# convert to Date object
#Note the new date fomat
rolling$Date <- as.Date(rolling$Date, format="%m %d %Y")
head(rolling)

If you also need the ending date of the rolling date, you could still use more backreferences to capture the end date.

Marcelo
  • 4,234
  • 1
  • 18
  • 18
1

Convert the column to a data frame using dot as the field separator and then format using sprintf and month.name like this:

clean2 <- function(x) {
  DF <- read.table(text = x, sep = ".")
  with(DF, sprintf("%s %d, %d", month.name[V3], V4, V5))
}

The last statement could alternately have been written as:

with(DF, format(as.Date(paste(V5, V3, V4, sep = "-")), "%B %e, %Y"))

example

For example, using the input in the Note below:

clean2(x)
## [1] "September 1, 2013" "October 1, 2013"   "November 1, 2013" 

Note: Here is some sample data in reproducible form:

x <- c("X1.Month.9.1.2013.thru.9.30.2013..Case.Equivs",  
       "X1.Month.10.1.2013.thru.10.31.2013..Case.Equivs", 
       "X1.Month.11.1.2013.thru.11.30.2013..Case.Equivs")
G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341