I have a data frame that contains factor columns. I could not get an exact match on the level names of those factors using regex.
df <- structure(list(age = structure(1:2, .Label = c("18-25",
">25"), class = "factor"), `M` = c("13.4",
"12.8"), 'N' = c("73", "76"), `SD` = c("6.8",
"6.6")), row.names = 51:52, class = "data.frame")
My df:
age M N SD
51 18-25 13.4 73 6.8
52 >25 12.8 76 6.6
First try:
regexpr(pattern = "18-25", text= df, ignore.case = FALSE, perl = FALSE, fixed = T)
[1] -1 -1 -1 -1
attr(,"match.length")
[1] -1 -1 -1 -1
attr(,"index.type")
[1] "chars"
attr(,"useBytes")
[1] TRUE
Second try:
saved_level_name <- structure(list(V1 = structure(1L, .Label = "18-25", class = "factor")), row.names = c(NA,
-1L), class = "data.frame")
regexpr(pattern = saved_level_name, text= df, ignore.case = FALSE, perl = FALSE, fixed = T)
[1] 1 4 -1 -1
attr(,"match.length")
[1] 1 1 -1 -1
attr(,"index.type")
[1] "chars"
attr(,"useBytes")
[1] TRUE
Third try (compare the two outputs!):
saved_name_level_2 <- structure(list(V4 = structure(1L, .Label = ">25", class = "factor")), row.names = c(NA,
-1L), class = "data.frame")
regexpr(pattern = saved_level_name, text= df[1], ignore.case = FALSE, perl = FALSE, fixed = T)
regexpr(pattern = saved_name_level_2, text= df[1], ignore.case = FALSE, perl = FALSE, fixed = T)
[1] 1
attr(,"match.length")
[1] 1
attr(,"index.type")
[1] "chars"
attr(,"useBytes")
[1] TRUE
[1] 1
attr(,"match.length")
[1] 1
attr(,"index.type")
[1] "chars"
attr(,"useBytes")
[1] TRUE
Fourth try:
regexpr(pattern = as.character( saved_name_level ), text= df, ignore.case = FALSE, perl = FALSE, fixed = T)
[1] -1 -1 -1 -1
attr(,"match.length")
[1] -1 -1 -1 -1
attr(,"index.type")
[1] "chars"
attr(,"useBytes")
[1] TRUE
First try: no matches.
Second try: the results (1, 4?) make no sense to me.
Third try: identical results, even though the two inputs look different.
Fourth try: no matches!
Possibly regexpr is matching against the stored values of the factors (their internal integer codes) and not the level names that are printed?
How can I use regex to search the factor level names rather than their underlying values?
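To make the suspicion above concrete, here is a minimal sketch of the difference between a factor's stored codes and its labels. It assumes the relevant point is that regexpr coerces its text argument with as.character, so it searches a single column converted to character rather than the whole data frame. The variable names (age, codes, labels, hits, is_match, level_hits) are only for illustration, and this may well not be the idiomatic way to do it.

```r
# Same factor as df$age above, rebuilt here so the snippet runs on its own
age <- factor(c("18-25", ">25"), levels = c("18-25", ">25"))

codes  <- as.integer(age)    # underlying integer codes stored in the factor
labels <- as.character(age)  # level labels as plain strings, one per row

# Search the labels (one string per row) instead of the whole data frame
hits <- regexpr("18-25", labels, fixed = TRUE)
is_match <- hits > 0         # TRUE where the label contains the pattern

# The same search restricted to the distinct level names
level_hits <- grepl("18-25", levels(age), fixed = TRUE)
```

With the data frame above, the equivalent call would presumably be regexpr("18-25", as.character(df$age), fixed = TRUE), i.e. one column at a time rather than text = df.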