Referencing multiple columns and rows to calculate new value in a new column

Question

Here is my data.frame (an {sf} object). I converted it to long form, so each UID represents one polygon. Each UID has anywhere from 0 to 3 species present, and the percentages for each UID sum to either 100 or 0.

UID	ORDER	SPECIES	PERCENT	geometry
1	1	A	80	blahblah
1	2	BL	15	blahblah
1	3	P	5	blahblah
2	1	S	75	blahblah
2	2	E	25	blahblah
2	3	na	0	blahblah
3	1	PA	61	blahblah
3	2	AT	39	blahblah
3	3	na	0	blahblah
4	1	na	0	blahblah
4	2	na	0	blahblah
4	3	na	0	blahblah

I want to create a new column, "ZONE", with these values:

If a UID does not contain A or AT for species, then enter "Misc." into ZONE.
If all species values for a UID are na, then enter na into ZONE.
If a UID contains A or AT and the PERCENT in the same row as A or AT is >= 80, then enter "For". If it is < 80, enter "Gra".

This was my attempt:

list <- c("A", "AT")
data.frame$ZONE <- (data.frame$SPECIES %in% list)
data.frame$ZONE[data.frame$ZONE == TRUE] <- "For"
data.frame$ZONE[is.na(data.frame$SPECIES)] <- NA
data.frame$ZONE[data.frame$ZONE == FALSE] <- "Misc."

This resulted in every row being treated individually instead of being grouped by UID. Also, I completely disregarded the >= 80 or < 80 condition.

I want it to look like this:

UID	ORDER	SPECIES	PERCENT	ZONE	geometry
1	1	A	80	For	blahblah
1	2	BL	15	For	blahblah
1	3	P	5	For	blahblah
2	1	S	75	Misc.	blahblah
2	2	E	25	Misc.	blahblah
2	3	na	0	Misc.	blahblah
3	1	PA	61	Gra	blahblah
3	2	AT	39	Gra	blahblah
3	3	na	0	Gra	blahblah
4	1	na	0	na	blahblah
4	2	na	0	na	blahblah
4	3	na	0	na	blahblah

Thanks for the help.

score 1 · Accepted Answer · answered Jul 06 '23 at 01:01

Please try the code below:

library(dplyr)

sf %>% group_by(UID) %>% 
  # each zoneN holds one rule's label for the whole UID group, or NA if the rule does not apply
  mutate(zone1=ifelse(!any(SPECIES %in% c('A','AT')), 'Misc.', NA),
    zone2=ifelse(all(SPECIES %in% c('na')), 'na', NA),
    zone3=ifelse(any(SPECIES %in% c('A','AT') & PERCENT>=80), 'For',NA),
    zone4=ifelse(any(SPECIES %in% c('A','AT') & PERCENT<80), 'Gra',NA),
    # coalesce() keeps the first non-NA label, so the argument order sets the rule priority
    zone=ifelse(is.na(coalesce(zone3,zone4,zone2,zone1)),'Misc.',coalesce(zone3,zone4,zone2,zone1))) %>% 
  select(-c(zone1:zone4))

Created on 2023-07-05 with reprex v2.0.2

# A tibble: 12 × 6
# Groups:   UID [4]
     UID ORDER SPECIES PERCENT geometry zone 
   <dbl> <dbl> <chr>     <dbl> <chr>    <chr>
 1     1     1 A            80 blahblah For  
 2     1     2 BL           15 blahblah For  
 3     1     3 P             5 blahblah For  
 4     2     1 S            75 blahblah Misc.
 5     2     2 E            25 blahblah Misc.
 6     2     3 na            0 blahblah Misc.
 7     3     1 PA           61 blahblah Gra  
 8     3     2 AT           39 blahblah Gra  
 9     3     3 na            0 blahblah Gra  
10     4     1 na            0 blahblah na   
11     4     2 na            0 blahblah na   
12     4     3 na            0 blahblah na
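
The same grouped logic can probably be written more compactly with case_when(). The following is only a sketch under some assumptions: the toy data frame `df` and the vector `target` are names made up here, the geometry column is left out, and missing species are assumed to be stored as the literal string "na", as in the printed table. If they are real NA values, the first condition would need all(is.na(SPECIES)) instead.

```r
library(dplyr)

# Toy data mirroring the table in the question (geometry omitted).
df <- data.frame(
  UID     = rep(1:4, each = 3),
  ORDER   = rep(1:3, times = 4),
  SPECIES = c("A", "BL", "P", "S", "E", "na", "PA", "AT", "na", "na", "na", "na"),
  PERCENT = c(80, 15, 5, 75, 25, 0, 61, 39, 0, 0, 0, 0)
)

target <- c("A", "AT")  # species that decide between "For" and "Gra"

result <- df %>%
  group_by(UID) %>%
  mutate(
    # Every condition is collapsed to a single TRUE/FALSE per UID with all()/any(),
    # so the chosen label is recycled to all rows of that UID.
    # case_when() uses the first condition that is TRUE.
    ZONE = case_when(
      all(SPECIES == "na")                     ~ "na",
      !any(SPECIES %in% target)                ~ "Misc.",
      any(SPECIES %in% target & PERCENT >= 80) ~ "For",
      TRUE                                     ~ "Gra"
    )
  ) %>%
  ungroup()
```

The order of the conditions matters: the all-"na" check comes first because such a UID also contains no A or AT and would otherwise be labelled "Misc.".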

Thank you, this is really good guidance for a much larger and more complicated dataset — Crippycajes, Jul 06 '23 at 05:30
