I'm new to R and to the forum, so let me know if you need any more information to help me with this issue. Many thanks in advance for any help!

I'm currently stuck on a problem that should, in theory, be easy to solve with the `.drop` argument of `group_by()`, but for some reason it is not working for me.

I want to create an object that holds the opposition's "yes" and "no" counts for each vote in parliament. When both counts are > 0, the result includes a row for "yes" and a row for "no". However, if the opposition voted unanimously on an issue (e.g. 200 yes votes, 0 no votes), the row with the 0 count is dropped for some reason.

library(foreign)
library(tidyverse)
library(readstata13)
library(tibbletime)
library(lubridate)
library(car)


BTFULLOPPSUM <- BTFULLD %>% dplyr::filter(Opposition == 1) %>% dplyr::group_by(vote_id, vote_beh, .drop = FALSE) %>%
  dplyr::summarise(number = n())

# BTFULLOPPSUM is the new object; BTFULLD is the data frame

This is the result:

   vote_id  Y/N  number
1   9001    0   226
2   9002    0   227
3   9003    0   213
4   9004    0   16
5   9004    1   196

1 == Yes, 0 == No

This is what I would like:

1 9001  0 226
2 9001  1 0 
3 9002  0 227
4 9002  1 0

vote_beh is the voting decision, either yes (1) or no (0). I hope that's sufficient, because the issue already shows up with vote 9001, where none of the opposition parties voted yes. (The opposition parties in this vote included the CDU/CSU, for example.)

This is the dput output for a small part of the relevant variables for vote_id 9001. It looks much more confusing to me, but maybe you can work with it.

structure(list(vote_id = c(9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 9001, 
9001, 9001, 9001), vote_beh = c(0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 
0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 
0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 
1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 
0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 
0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 
1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 
1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 
0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 
1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 
1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 
0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 
0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 
1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 
1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 
1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 
0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 
1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 
0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 
1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 
0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 
1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 
1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0
), Opposition = c(1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 
0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 
1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 
1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 
0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 
0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 
1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 
0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 
1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 
1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 
0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 
1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 
1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 
0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 
1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 
1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 
0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 
1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 
1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 
1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 
1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 
0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 
0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -493L))
user12575032
  • Can you add `dput(BTFULLD)` to make this post reproducible? – Ronak Shah Dec 21 '19 at 12:02
  • I believe the dataset would be too big to put in here. It has some 100,000 observations. But I can post descriptive statistics if you need any. – user12575032 Dec 21 '19 at 12:19
  • What is `class(BTFULLD$vote_id)`? Can you try `BTFULLD %>% mutate(vote_id = factor(vote_id)) %>% filter(Opposition == 1) %>% group_by(vote_id, vote_beh, .drop = FALSE) %>% summarise(number = n())` – Ronak Shah Dec 21 '19 at 12:24
  • `class(BTFULLD$vote_id)` returns `[1] "numeric"`. I also tried the suggested code, and although I don't get any warning message, nothing appears to change. If I posted a glimpse of the data set, would that help reproduce the example? – user12575032 Dec 21 '19 at 12:41
  • Yes, it would help to look at the data for one or two vote_ids that have your problem. If you can find some, you could share in the body of your question the output of `dput(BTFULLD %>% filter(vote_id %in% c(PROBLEM_VOTEID1, PROBLEM_VOTEID2)) %>% head(10))` – Jon Spring Dec 21 '19 at 17:17

1 Answer

`tidyr::complete()` creates rows for every combination of the specified variables, which is useful here because it gives you both a Y and an N row for every vote_id.

library(tidyverse)
BTFULLOPPSUM <- tibble::tribble(
            ~vote_id, ~Y_N, ~number,
                9001,    0,     226,
                9002,    0,     227,
                9003,    0,     213,
                9004,    0,      16,
                9004,    1,     196
            )

BTFULLOPPSUM %>%
  complete(vote_id, Y_N, fill = list(number = 0))

## A tibble: 8 x 3
#  vote_id   Y_N number
#    <dbl> <dbl>  <dbl>
#1    9001     0    226
#2    9001     1      0
#3    9002     0    227
#4    9002     1      0
#5    9003     0    213
#6    9003     1      0
#7    9004     0     16
#8    9004     1    196
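
The same idea can also be applied to the raw data instead of the already-summarised table. One likely reason `.drop = FALSE` appeared to do nothing is that it only keeps empty groups for factor columns, and `vote_beh` is numeric in the question. The sketch below shows two ways around that. `BTFULLD_toy` and its values are made up for illustration (only the column names come from the question), so treat it as a rough outline rather than a drop-in fix.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for BTFULLD: same column names, invented rows
BTFULLD_toy <- tibble(
  vote_id    = c(9001, 9001, 9001, 9004, 9004, 9004),
  vote_beh   = c(0, 0, 1, 0, 1, 1),
  Opposition = c(1, 1, 0, 1, 1, 0)
)

# Option 1: turn vote_beh into a factor with both levels declared,
# so .drop = FALSE has empty levels it can keep
opp_by_factor <- BTFULLD_toy %>%
  filter(Opposition == 1) %>%
  mutate(vote_beh = factor(vote_beh, levels = c(0, 1))) %>%
  group_by(vote_id, vote_beh, .drop = FALSE) %>%
  summarise(number = n(), .groups = "drop")

# Option 2: count first, then let complete() add the missing combinations;
# vote_beh = c(0, 1) spells out both values in case one never occurs
opp_by_complete <- BTFULLD_toy %>%
  filter(Opposition == 1) %>%
  count(vote_id, vote_beh, name = "number") %>%
  complete(vote_id, vote_beh = c(0, 1), fill = list(number = 0L))

opp_by_factor
opp_by_complete
```

Option 1 leaves `vote_beh` as a factor in the result, while Option 2 keeps it numeric, so which one fits better depends on what you do with the table afterwards.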
Jon Spring
  • Hey, I tried to use the command above, but didn't have any success with it. I have now edited the question and included a part of the data frame. Hope that will make it easier to reproduce! Thank you for taking the time to help :) – user12575032 Dec 22 '19 at 08:36
  • Thanks for editing the question. 1) It will be most helpful if you can include the output of `dput(SOME_DATA)` -- this will allow people to load an identical object to yours. If you copy the way the data *prints* (as you have now), the encoding of your data is ambiguous, since it's possible for different data to print the same way. 2) It's helpful if you can explain what happened. "Didn't have any success" doesn't help me understand what you did, what the result was, or why that wasn't what you wanted. – Jon Spring Dec 22 '19 at 16:59