1

With dplyr 0.8.3 and sf 0.8.0, the following worked (see https://stackoverflow.com/a/49354480/9164265):

library(dplyr)
library(sf)

nc <- st_read(system.file("shape/nc.shp", package="sf"))
nc %>%
  group_by(SID74) %>%
  summarise(geometry = st_union(geometry)) %>%
  ungroup()

This combined all geometries sharing the same SID74 value into a single MULTIPOLYGON per group.

However, with dplyr 1.0.0 the same code now gives the following error:

Error: Problem with `summarise()` input `geometry`.
x Input `geometry` must return compatible vectors across groups
ℹ Input `geometry` is `st_union(geometry)`.
ℹ Result type for group 1 (SID74 = 0): <sfc_MULTIPOLYGON>.
ℹ Result type for group 2 (SID74 = 1): <sfc_MULTIPOLYGON>.
Run `rlang::last_error()` to see where the error occurred.

Does anyone know why dplyr is throwing this error, despite the types evidently being of the same <sfc_MULTIPOLYGON> class? Thanks for any help!

jjoannes
  • I can't reproduce your error. I'm using `dplyr 1.0.0` and I don't get your error message. – Martin Gal Jul 17 '20 at 16:52
  • Me neither with dplyr 1.0.0 and sf 0.9.5 – agila Jul 17 '20 at 19:17
  • There have been enormous changes with `sf`, primarily related to how Proj is handled, and things are now in Proj-6.3 - Proj-7+ land. I would imagine sf-0.8.0 was back in Proj-4. Over at [r-sig-geo](https://stat.ethz.ch/mailman/listinfo/r-sig-geo) they've been banging the drum to update or be left in the dust. You have found the dust. These were all necessary, breaking changes. Update, upgrade. It will be better. – Chris Jul 18 '20 at 00:49
  • I have upgraded... and it is better :) Thanks for pointing this out to me, while I thought to upgrade `dplyr` I didn't think to check whether my `sf` was still in date. Re reproducibility... I can only imagine this is due to the later versions of `sf` you are likely using. Thanks for your comments! – jjoannes Jul 20 '20 at 09:09
  • Write up what you did to resolve the problem as an answer, then after a little wait, you can accept your own answer. This is useful because this is the Q/A process, virtuous circle, and you gain rep besides. And welcome to SO. – Chris Jul 21 '20 at 00:13
  • Although my solution (upgrade) solved the problem I had, it is not exactly an answer to the question - I don't think the error above should have been raised. I'll add this as an answer all the same. Thanks! – jjoannes Jul 22 '20 at 12:25

3 Answers

1

The error no longer appears after upgrading sf from 0.8.0 to 0.9.5. This does not explain why the error occurs with dplyr 1.0.0 and sf 0.8.0, but it makes sense to upgrade every package used alongside dplyr when dplyr itself is upgraded, especially across a major version as is the case here.
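A quick way to see whether an installation is still on the older versions is to compare the installed package versions against the ones mentioned in this thread. This is only a sketch: `check_versions` and `required` are hypothetical names, and the minimum versions are simply the combination reported to work above, not documented requirements.

```r
# Versions taken from this thread: the error went away with dplyr 1.0.0
# once sf was upgraded to 0.9.5. These are assumptions, not official minimums.
required <- c(dplyr = "1.0.0", sf = "0.9.5")

# Hypothetical helper: TRUE if the package is installed and at least the
# given version, FALSE otherwise.
check_versions <- function(required) {
  vapply(names(required), function(pkg) {
    requireNamespace(pkg, quietly = TRUE) &&
      utils::packageVersion(pkg) >= required[[pkg]]
  }, logical(1))
}

ok <- check_versions(required)
print(ok)
```

Any package reported as `FALSE` is either missing or older than the version used here and is a candidate for `install.packages()`.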

jjoannes
0

I don't have a reproducible example for this (I'm under a deadline right now), but I'm getting a similar error when I run some code:

df <- df %>% group_by(case_id) %>% dplyr::mutate(status_official = last(na.omit(status)))

Here the data has from 1 to 30 rows per case_id. I think the problem is that some cases have no non-missing value for status, while others do. When I filter out the rows with missing status values, I don't get an error.
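If the cause really is groups with no non-missing status, one possible workaround is to make the per-group expression return a missing value of the same type as the column instead of an empty result. The sketch below assumes a character `status` column and made-up data; `last_non_missing` is a hypothetical helper, and it has not been checked against the original data that triggered the error.

```r
library(dplyr)

# Hypothetical data: case 2 has no non-missing status at all.
df <- tibble(
  case_id = c(1, 1, 2, 2, 3),
  status  = c("open", "closed", NA, NA, "open")
)

# Hypothetical helper: last non-missing value of x, or a single NA of the
# same type as x when every value is missing (x[NA_integer_] keeps the type).
last_non_missing <- function(x) {
  kept <- x[!is.na(x)]
  if (length(kept) == 0) x[NA_integer_] else kept[length(kept)]
}

df <- df %>%
  group_by(case_id) %>%
  mutate(status_official = last_non_missing(status)) %>%
  ungroup()

print(df)
```

Because every group returns a length-one value of the same type, the all-missing cases no longer need to be filtered out beforehand.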

Barry DeCicco
0

I received a similar error when I was trying to group dates that were either a numeric or NA_Date_. I resolved the problem by using NA instead of NA_Date_.

It's worth going through your code to try and spot where similar inconsistencies might be creeping in.
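One way to look for such inconsistencies is to run the per-group expression outside dplyr and record the class each group returns. The following base R sketch uses made-up data, and `days_or_na` is a hypothetical stand-in for whatever expression is used inside `summarise()`; `as.Date(NA)` plays the role of a typed missing date here.

```r
# Hypothetical data: group "b" has only missing dates.
df <- data.frame(
  g = c("a", "a", "b"),
  d = as.Date(c("2020-07-01", "2020-07-17", NA))
)

# Hypothetical per-group summary that is not type-stable: it returns a
# numeric for most groups but a Date (typed NA) when everything is missing.
days_or_na <- function(v) {
  if (all(is.na(v))) as.Date(NA) else as.numeric(max(v, na.rm = TRUE))
}

# Apply it group by group and record the class of each group's result.
per_group <- lapply(split(df$d, df$g), days_or_na)
classes <- vapply(per_group, function(r) class(r)[1], character(1))
print(classes)
```

If `classes` contains more than one distinct value, the groups that differ are the ones worth inspecting.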

Dharman