I am trying to filter a dataframe using dplyr but my code is failing to filter a particular value. The values are all numeric, and forcing numeric doesn't fix the issue. However, if I alter my code to filter a value enclosed in quotes, the code works. I don't know why filtering is failing.

I've simplified the data below. It is not so simple to change the code to filter within quotes, so I am looking for an explanation of how to alter the column data so that a numeric value can be filtered successfully.

Thanks!

Unique_ID Minutes AvgOD_TechReps
JVY302_SDC_Rep3 240 0.257
JVY302_SDC_Rep3 250 0.269

#Confirm column contains numeric values
> str(df$Minutes)
 num [1:2] 240 250

#Filter using values enclosed in quotes works
df2 <- df %>%
  dplyr::filter(Minutes == "240" | Minutes == "250") 

> df2
# A tibble: 2 × 3
  Unique_ID       Minutes AvgOD_TechReps
  <chr>             <dbl>          <dbl>
1 JVY302_SDC_Rep3     240          0.265
2 JVY302_SDC_Rep3     250          0.276

#Filter using numeric values fails for the 250 value
df3 <- df %>%
  dplyr::filter(Minutes == 240 | Minutes == 250) 

> df3
# A tibble: 1 × 3
  Unique_ID       Minutes AvgOD_TechReps
  <chr>             <dbl>          <dbl>
1 JVY302_SDC_Rep3     240          0.265

#Forcing the column to be number does not solve the problem
df4 <- df %>%
  dplyr::mutate_at("Minutes", as.numeric) %>%
  dplyr::filter(Minutes == 240 | Minutes == 250)

> df4
# A tibble: 1 × 3
  Unique_ID       Minutes AvgOD_TechReps
  <chr>             <dbl>          <dbl>
1 JVY302_SDC_Rep3     240          0.265
JVGen
  • 4
    Please check if there is precision involved as it is double – akrun Jun 05 '23 at 15:55
  • 2
    Maybe try `dplyr::mutate_at("Minutes", as.integer)` (instead of `as.numeric`) to convert to integer to solve precision issues – Gregor Thomas Jun 05 '23 at 15:56
  • Gregor, this worked - if you post it as an answer I can give you credit and signal that the thread is closed. I didn't expect precision issues, because the values in that column are all calculated the same way, but alas, that was the issue. – JVGen Jun 05 '23 at 16:25
  • I just marked as duplicate of the R-FAQ we have for this issue. – Gregor Thomas Jun 05 '23 at 19:05
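
For anyone landing here, the precision issue from the comments can be reproduced with a small sketch. The data below is hypothetical: the `1e-13` offset is an assumption standing in for whatever rounding error the original calculation introduced, and the object names (`exact`, `tolerant`, `fixed`) are only for illustration. It shows the exact `==` comparison missing the row, and two possible workarounds: `dplyr::near()` for a tolerance-based comparison, or rounding before converting to integer.

```r
library(dplyr)

# Hypothetical data: the second Minutes value carries a tiny floating-point
# error, as can happen when the column is the result of a calculation.
df <- tibble(
  Unique_ID      = c("JVY302_SDC_Rep3", "JVY302_SDC_Rep3"),
  Minutes        = c(240, 250 + 1e-13),
  AvgOD_TechReps = c(0.257, 0.269)
)

# Default printing hides the error; asking for more digits reveals it.
print(df$Minutes)
print(df$Minutes, digits = 17)

# Exact comparison misses the value that is not exactly 250.
exact <- df %>%
  filter(Minutes == 240 | Minutes == 250)

# Option 1: compare within a tolerance instead of exactly.
tolerant <- df %>%
  filter(near(Minutes, 240) | near(Minutes, 250))

# Option 2: round to whole minutes and store the column as integer.
# round() comes first because as.integer() on its own truncates toward zero,
# so a value just below 250 would become 249.
fixed <- df %>%
  mutate(Minutes = as.integer(round(Minutes))) %>%
  filter(Minutes %in% c(240L, 250L))
```

If the column should only ever hold whole minutes, Option 2 is probably the more durable fix, since every later comparison then works on integers. Option 1 leaves the stored values untouched.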
