
I encountered a strange problem with R. I have a dataframe with several variables. I add a variable to this dataframe that contains an underscore, for example:

allres$tmp_weighted <- allres$day * allres$area

Before I do this, R tells me that the variable allres$tmp does not exist (which is right). However, after I add allres$tmp_weighted to the dataframe and call allres$tmp, I get the data for allres$tmp_weighted. It seems as if the part after the underscore does not matter to R at all. I tried it with several other variables and names, and it always works that way.

I don't think it should work like this. Am I overlooking something? Below is some code together with the output from the console.

# first check whether variable exists
allres_sw$Ndpsw

> NULL

#define new variable with underscore in variable name
allres_sw$Ndpsw_weighted <- allres_sw$Ndepswcrit * allres_sw$Area

#check again whether variable exists
allres_sw$Ndpsw

>   [1]    17.96480   217.50240    44.84415    42.14560     0.00000    43.14444    53.98650     9.81939     0.00000   110.67720

# this is the output that I would expect from "Ndpsw_weighted" - and indeed do get
allres_sw$Ndpsw_weighted
>   [1]    17.96480   217.50240    44.84415    42.14560     0.00000    43.14444    53.98650     9.81939     0.00000   110.67720
Ritchie Sacramento
Lena
  • I have the same problem as your first problem - R is not reading the entire variable name it is stopping at an underscore and then complaining the variable doesn't exist (which is true). Did you resolve that? – Simon Woodward Jan 17 '22 at 00:57
  • Hi Simon, that sounds like a different problem - in my case it was never the issue that R did *not* recognise a variable, only that it *did* recognise a variable based on part of its name (which was behaviour that I didn't expect, but is to be expected when using the $ operator as explained by Wil below.) – Lena Jan 18 '22 at 14:22

2 Answers


Have a look at ?`[` or ?`$` in your R console. The description of the name argument of the extract functions states that names are partially matched when using the $ operator (as opposed to the `[[` operator, which matches names exactly because its exact argument defaults to TRUE).

From ?`$`

A literal character string or a name (possibly backtick quoted). For extraction, this is normally (see under ‘Environments’) partially matched to the names of the object.
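
A minimal sketch of the difference, assuming a made-up data frame (the column names mirror the question; the values are invented for illustration):

```r
# Hypothetical data; only the column names come from the question.
allres_sw <- data.frame(Ndepswcrit = c(1.5, 2.0, 0.5), Area = c(10, 20, 30))
allres_sw$Ndpsw_weighted <- allres_sw$Ndepswcrit * allres_sw$Area

# `$` partially matches: "Ndpsw" is a unique prefix of "Ndpsw_weighted".
partial <- allres_sw$Ndpsw

# `[[` matches exactly by default, so the same abbreviation finds nothing.
exact <- allres_sw[["Ndpsw"]]

print(partial)  # same values as allres_sw$Ndpsw_weighted
print(exact)    # NULL
```

If you want a mistyped or abbreviated name to come back as NULL rather than silently resolve to another column, `[[` is the safer accessor.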

Wil
  • Thanks Wil! This is really new to me. So it has (as Armali also points out) nothing to do with the underscore. – Lena May 02 '19 at 13:21
  • @Lena that is correct, it doesn't have anything to do with the underscore. – Wil May 02 '19 at 13:22

Just to expand somewhat on Wil's answer... From help('$'):

x$name

name
A literal character string or a name (possibly backtick quoted). For extraction, this is normally (see under ‘Environments’) partially matched to the names of the object.

x$name is equivalent to x[["name", exact = FALSE]]. Also, the partial matching behavior of [[ can be controlled using the exact argument.

exact
Controls possible partial matching of [[ when extracting by a character vector (for most objects, but see under ‘Environments’). The default is no partial matching. Value NA allows partial matching but issues a warning when it occurs. Value FALSE allows partial matching without any warning.
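
As a rough illustration of the three settings of exact, here is a sketch on a plain list (the list and its values are hypothetical; only the element name is borrowed from the question):

```r
# Hypothetical list with one element, for illustration only.
x <- list(Ndpsw_weighted = c(1, 2, 3))

default_match <- x[["Ndpsw"]]                 # exact = TRUE (default): NULL
silent_match  <- x[["Ndpsw", exact = FALSE]]  # partial match, no warning

# exact = NA also matches partially but signals a warning; record that it
# happened and muffle it so the script keeps running.
warned <- FALSE
na_match <- withCallingHandlers(
  x[["Ndpsw", exact = NA]],
  warning = function(w) {
    warned <<- TRUE
    invokeRestart("muffleWarning")
  }
)
```

After running this, default_match is NULL, silent_match and na_match both hold the element's values, and warned is TRUE.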

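The key phrase here is partial match (see pmatch). You'll understand now that the underscore is nothing special: you can abbreviate allres_sw$Ndpsw_weighted to allres_sw$Ndp, provided no other name in the data frame also begins with Ndp. Ndepswcrit does not (it begins with Nde), so the abbreviation is unique. An ambiguous abbreviation such as allres_sw$Nd, which is a prefix of both names, returns NULL.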
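
A small sketch of the uniqueness rule, assuming a hypothetical two-column data frame with the names from the question. It also shows the warnPartialMatchDollar option, which is not mentioned above but is documented in ?options and makes `$` warn whenever it falls back to a partial match:

```r
# Hypothetical data frame; only the column names come from the question.
df <- data.frame(Ndepswcrit = c(1, 2), Ndpsw_weighted = c(10, 40))

# pmatch() shows which name (by position) a prefix resolves to.
pmatch("Ndp", names(df))  # 2: only "Ndpsw_weighted" starts with "Ndp"
pmatch("Nd",  names(df))  # NA: both names start with "Nd", so it is ambiguous

unique_prefix    <- df$Ndp  # values of Ndpsw_weighted
ambiguous_prefix <- df$Nd   # NULL, because the prefix is not unique

# Opt in to a warning whenever `$` uses a partial match, then restore options.
old <- options(warnPartialMatchDollar = TRUE)
res <- tryCatch(df$Ndp, warning = function(w) conditionMessage(w))
options(old)
```

Here res holds the warning message rather than the column, which shows that the option catches the abbreviation. Setting it at the top of a script is one way to spot accidental partial matches.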

Armali
  • Thanks Armali for the elaborate answer. Indeed it has nothing to do with the underscore - I was looking in the wrong direction. I can see now that if I insert allres_sw[["Ndpsw", exact = TRUE]] I don't get the results for "Ndpsw_weighted". I can see that the partial match makes coding faster (cause you don't have to type the whole variable name), but I would also think it makes coding more prone to errors (accidentally mis-typing a name might not lead to an error message because what you typed is the partial name of another variable). But I guess that's just the way it is. – Lena May 02 '19 at 13:25