The following code does not work correctly inside a knitr code chunk (it does not extract the desired substring):

What might be causing this behavior?

#Retrieve the earliest date
 earlydate <- min(time(month[1]), "2016-04-20")

 earlydate

 #Extract YYYY-MM from earliest date
 substr(earlydate, 1, 7)

where month[1] is

            TSLA.Open TSLA.High TSLA.Low TSLA.Close TSLA.Volume TSLA.Adjusted
  2016-02-16     158.7    162.95   154.11     155.17     5556300   155.17

#extracts the date:
time( month[1] ) 

   2016-02-16

Expect to see the following in knitr output (R Markdown):

## [1] 2016-02

Instead, the actual output is not truncated; it just shows the original text:

## [1] 2016-02-16

However, either of the following works correctly (extracts YYYY-MM):

#inline r code
`r substr("2016-02-16", 1,7)`

outputs: ## [1] 2016-02

#knitr code chunk
```{r test, message=FALSE, warning=FALSE}

earlydate <- "2016-02-16"

earlydate

 #Extract YYYY-MM from earliest date
 substr(earlydate, 1, 7)

```

outputs: ## [1] 2016-02

Class details: when both arguments are of type Date

#Retrieve the earliest date
 earlydate <- min(time(month[1]), as.Date("2016-04-20"))

 class(earlydate)
 ## [1] "Date"

When one argument is of type Date and the other is of type "character"

#Retrieve the earliest date
 earlydate <- min(time(month[1]), "2016-04-20")

 class(earlydate)
 ## [1] "Date"

Additional Info

Environment

OS: Win7

RStudio Version 0.99.892

Rx64 3.2.4 (R version)

document type: shiny knitr doc (.Rmd)

Library: quantmod

Library: knitr

The substr function does not appear to extract the substring in a shiny knitr Rmd document.

Details are in the following Stack Overflow link:

Rscript - knitr: substr function not working correctly inside knitr code chunk

Minimal Reproducible Example (this does not extract desired substring):

substr(min(as.Date("2013-03-14"), "2016-04-20"), 1, 7)
outputs: ## [1] "2013-03-14"

Expect (this works as expected):

 

substr(min(as.Date("2013-03-14"), as.Date("2016-04-20")), 1, 7)
 outputs (desired): ## [1] "2013-03"

It appears unrelated to knitr, since this behavior is also seen on the R console. Returned classes (as stated above) and data processing do not appear to correlate. It would appear to be an underlying R issue.

Is this working as designed (WAD)?

BR/KK

KK.
  • I am aware that the classes are not the same: class(time(month[1])) is "Date" and class("2016-04-20") is "character"; however, class(earlydate) is "Date". Using as.Date on the character object resolves the problem observed: earlydate <- min(time(month[1]), as.Date("2016-04-20")) yields the result ## [1] "2016-02". What could explain this behavior? – KK. Apr 28 '16 at 03:45
  • Please provide the data for `earlydate` via: `dput(earlydate)` – coatless Apr 28 '16 at 05:23
  • dput(earlydate) returns structure("16847", class = "Date"). I see no difference in the dput output for earlydate whether it is generated with mixed types (Date, character) or with the same type (Date, Date) in the min function; the same value is returned in both cases. – KK. Apr 28 '16 at 06:16

1 Answer

First, to show what a minimal self-contained reproducible example means, this is all you need to demonstrate the problem:

x1 = as.Date('2013-03-14')
x2 = min(x1, '2016-04-20')
substr(x1, 1, 7)  # "2013-03"
substr(x2, 1, 7)  # "2013-03-14"

To investigate the issue, take a look at what these objects really are:

dput(x1, '')  # structure(15778, class = "Date")
dput(x2, '')  # structure("15778", class = "Date")

x2 is essentially a character string "15778" masked by a class Date. What does that mean? I do not know (dates are often represented as integers internally instead of characters). It is just a weird object returned by min(), when you asked for the minimum of a date and a character string (I do not know what that means, but R returns something anyway).
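
To make the "character string masked by a class" point concrete, here is a minimal sketch that rebuilds both objects directly from the dput() output above, so it does not depend on what min() returns in any particular R version. It only uses base R; the comments state what I would expect each call to report.

```r
# Rebuild both objects from their dput() representations.
x1 <- structure(15778, class = "Date")    # well-formed Date: numeric day count
x2 <- structure("15778", class = "Date")  # same class attribute, character payload

class(x1)         # "Date"
class(x2)         # "Date"  (the class attribute cannot tell them apart)
typeof(x1)        # "double"
typeof(x2)        # "character"
is.character(x2)  # TRUE, because is.character() looks at the storage type
```

So class() alone is not enough to tell whether a Date is well-formed; typeof() or dput() shows what is actually stored.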

Why can this object be problematic? Take a look at the source code of substr():

> substr
function (x, start, stop) 
{
    if (!is.character(x)) 
        x <- as.character(x)
    .Internal(substr(x, as.integer(start), as.integer(stop)))
}

is.character(x2) is TRUE, so it is not coerced to character, then it is passed to an internal function (presumably further passed to a certain C function), and I'm not going to dig deeper, since the lesson should be clear now: do not compare apples with oranges. For example, if you want the minimum of two dates, make sure both values are really dates:

x2 = min(x1, as.Date('2016-04-20'))
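
A quick way to confirm that this version behaves, sketched with the same two dates as above: the result should be stored as a number again, so substr() takes its as.character() branch and works on the formatted date.

```r
x1 <- as.Date("2013-03-14")
x2 <- min(x1, as.Date("2016-04-20"))  # both arguments are Dates

typeof(x2)        # "double"
is.character(x2)  # FALSE, so substr() coerces with as.character() first
substr(x2, 1, 7)  # "2013-03"
```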

Another possibility is to coerce data to a certain type explicitly, e.g. in this case, you want to do substr() on a character string, so make sure it is indeed character:

substr(as.character(x2), 1, 7)

Either way solves your original issue, but the first way is recommended.
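
If the same pattern comes up in several places, one option is to wrap the first approach in a small helper. The sketch below is only an illustration: `earliest_month` is a hypothetical name (it is not part of base R, quantmod, or knitr), and it swaps substr() for format() with "%Y-%m", which asks for the year and month directly instead of slicing a string.

```r
# Hypothetical helper, shown for illustration only.
earliest_month <- function(...) {
  # Coerce every argument to Date on its own, then compare the numeric
  # day counts, so a character argument cannot turn the comparison
  # into a string comparison.
  days <- vapply(list(...), function(v) as.numeric(as.Date(v)), numeric(1))
  earliest <- as.Date(min(days), origin = "1970-01-01")
  format(earliest, "%Y-%m")
}

earliest_month(as.Date("2013-03-14"), "2016-04-20")  # "2013-03"
```

With the data from the question, the call would presumably look like earliest_month(time(month[1]), "2016-04-20"), assuming time(month[1]) returns a single Date as shown there.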

Yihui Xie
  • Hi Yihui, thank you for looking into this issue. I was aware of how to work around it, but I was not certain that the underlying code (.Internal()) is doing the right thing. While working through this, I came to understand that it is indeed not a knitr issue (my apologies) but an R issue. I would expect some sort of warning or error for mixed types. Thanks again. – KK. Apr 28 '16 at 07:05