0

I have the D Data.frame that has values for Stations. I do have another data.frame F that has Stations threshold values for certain conditions (Ie., WBN, AN etc). I would like to use the value of each Stations in D to see under which conditions it falls in F and save the D data.frame with an added column for Conditions in DF. Here is my sample code

library(tidyverse)

set.seed(123)
D <- data.frame(Stations = c("A","S","D","Z"), Value = c(15,10,25,16))
F <- data.frame(Stations = c("A","S","D","Z"), WBN = runif(4, 0, 3), BN = runif(4, 3,6),
                N = runif(4,7,10), AN = runif(4,15,25))

Desired output I should get a data.frame like below

DF <- D %>% mutate(Condition = c("AN", "AN", "AN", "AN" ))

Sample from my actual data.frame data.frame Freq can be considered as F in the example above while Max_2020 can be considered as the D.

> head(Freq, 5)
# A tibble: 5 x 9
  Stations `1:2` `1:5`              `1:10` `1:25` `1:50` `1:100`            `1:200`            `1:500`
  <chr>    <chr> <chr>              <chr>  <chr>  <chr>  <chr>              <chr>              <chr>  
1 05EF001  911   1550               2180   3350   4410   5860               7730               11100  
2 05EG004  2.7   6.6                11     18     24     33                 43                 60     
3 05FF003  7.24  35.409999999999997 63.26  100.17 124.95 145.86000000000001 162.72999999999999 179.29 
4 05GC006  12.44 38.85              63.11  98.12  125.51 152.86000000000001 179.63             213.48 
5 05GD002  1.38  4.03               6.33   9.65   12.33  15.14              20.170000000000002 29.56  

> head(Max_2020, 5)
# A tibble: 5 x 2
  Stations Value
  <chr>    <dbl>
1 05EG004   9.18
2 05FF003  38.7 
3 05GC006  15.8 
4 05GD002   2.65
5 05GF002  57.2 

My goal is to add another column in Max_2020 and fill it out based on the condition that where the Value lies in Freq. in other words, for example, if the Value in Max_2020 is less than the number under column 1:2 in Freq, i would assign 1:2 to the new column in Max_2020 for a particular station. The Value in Max_2020 needs to be checked across all the columns in Freq for a particular stations and then assigned its respective condition.

Hydro
  • 1,057
  • 12
  • 25
  • Could you double check your desired output? Should `Condition` where `Stations == D` eq `NA`? – CPak Jul 22 '20 at 03:05
  • Wow- your code is gone but i am getting very strange results using your code on actual data. its spitting out `NA` for `Stations` for which i do have value and `Conditions`. Also the `Conditions` is wrongly assigned to `Stations` based on `Value` in `D`. – Hydro Jul 22 '20 at 03:26

1 Answers1

1

You can get the F in long format , joing it with D by Stations and for each Stations keep the rows where the Value crosses.

library(dplyr)

Freq %>%
  pivot_longer(cols = -Stations) %>%
  mutate(value = as.numeric(value)) %>%
  left_join(Max_2020, by = 'Stations') %>%
  group_by(Stations) %>% 
  slice(which(value > Value)[1] - 1) %>%
  select(Stations, name) %>%
  right_join(Max_2020, by = 'Stations')
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • I am getting `NA` for Variable `D` instead a particular condition.Also, when i applied your code on actual data, i see that each `Stations` is repeated for the number of names in `F`, i would rather have one condition based on `D` `Value` – Hydro Jul 22 '20 at 03:42
  • Which value do you want to consider for each `Stations`. I am selecting the condition (`value > Value`). For `D` there is no value in `F` which is greater than value in `D`, hence it returns `NA`. – Ronak Shah Jul 22 '20 at 03:49
  • Hello @Ronak, I just updated the question with output of my actual data.frame. Not sure, how to clearly describe my point here :(. – Hydro Jul 22 '20 at 04:05
  • First thing is you have columns of class character in `Freq`, so any kind of mathematical operation would not work on it. We need to convert it to numeric. What would be expected output for Station `05EG004` ? – Ronak Shah Jul 22 '20 at 04:14
  • Station `05EG004` should have output as `1:5` as its Value is `9.18` and is lower than `1:10` but greater than `1:2` – Hydro Jul 22 '20 at 04:28
  • Yes, but for the data showed what column name do you expect as output for `05EG004` ? – Ronak Shah Jul 22 '20 at 04:30
  • 1:5 for the station you mentioned – Hydro Jul 22 '20 at 04:42
  • @Hydro Check updated answer. – Ronak Shah Jul 22 '20 at 04:50