
I am using the get_friends function from the rtweet package to get the user_ids of the friends of a set of focal users, who are sampled from participants in a Twitter discourse. The function returns a list of tibbles.

Each tibble has two columns: one with the focal user's user_id and a second with the user_ids of that focal user's friends. Since every user has a different number of friends, the number of rows differs from tibble to tibble.

My problem: the accounts of some of the focal users no longer exist, for unknown reasons. As a result, the list contains empty tibbles, which look like this:

> userFriends[[88]]
# A tibble: 0 x 0

A non-empty tibble looks like this:

> userFriends[2]
[[1]]
# A tibble: 32 x 2
                 user            user_id
                <chr>              <chr>
 1 777937999917096960           49510236
 2 777937999917096960           60489018
 3 777937999917096960         3190203961
 4 777937999917096960          118756393
 5 777937999917096960         2338104343
 6 777937999917096960          122453931
 7 777937999917096960          452830010
 8 777937999917096960           60937837
 9 777937999917096960 923106269761851392
10 777937999917096960          416882361
# ... with 22 more rows

I want my code to identify these empty tibbles and subset the list without these tibbles.

I used the nrow function on these tibbles to find the number of friends each focal user had.

nFriends <- as.numeric(lapply(userFriends, nrow))

I treated the indices where this value is zero as the empty tibbles and removed them by subsetting, as follows:

nullIndex <- nFriends!=0
userFriendsFinal <- userFriends[nullIndex]

This seems to work for now. But this way I am also removing users with zero friends (although that is very unlikely), along with users who no longer exist or are not accessible through the API. I want to make sure that I remove only those who are not accessible or do not exist. Please help.

Cettt
Sunil Reddy

2 Answers


You can use the discard function from the purrr package.

Here is a small example:

library(purrr)
library(tibble)  # tibble() comes from the tibble package, not purrr
mylist <- list(a = tibble(n = numeric()),
               b = tibble(n = 1:4))
discard(mylist, function(z) nrow(z) == 0)
$b
# A tibble: 4 x 1
      n
  <int>
1     1
2     2
3     3
4     4
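
If users with zero friends should be kept, one option is to test the number of columns rather than the number of rows, since the empty tibbles in the question print as 0 x 0. The following is only a sketch: it assumes that a user with zero friends would come back as a tibble that still has the two columns but no rows, which I have not verified against rtweet. The list below is a made-up stand-in for the real userFriends.

```r
library(purrr)
library(tibble)

# Toy stand-in for the list returned by get_friends (IDs are made up)
userFriends <- list(
  deleted      = tibble(),                                          # 0 x 0, as in the question
  zero_friends = tibble(user = character(), user_id = character()), # 0 x 2, assumed shape
  active       = tibble(user = c("1", "1"), user_id = c("10", "11"))
)

# Drop only the tibbles that have no columns at all
userFriendsFinal <- discard(userFriends, function(z) ncol(z) == 0)

# Names of the elements that survive the filter
names(userFriendsFinal)
```

If zero-friend users turn out to be returned as 0 x 0 tibbles as well, this check cannot separate them from deleted accounts, and the account status would have to be looked up some other way.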
Cettt

We can use Filter with nrow. Filter coerces the result of nrow to logical, so a row count of 0 becomes FALSE and every entry with zero rows is removed, i.e.

Filter(nrow, userFriends)
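
To illustrate why this works, here is a small sketch on a toy list. It uses base data frames so that it runs without extra packages (nrow behaves the same way on tibbles); the list name toy and its contents are made up for the example.

```r
# Toy list: one empty data frame and one with two rows (IDs are made up)
toy <- list(
  empty = data.frame(),
  full  = data.frame(user = c("1", "1"), user_id = c("10", "11"))
)

# Row count per element; Filter coerces 0 to FALSE and any other count to TRUE
rowCounts <- vapply(toy, nrow, integer(1))

# The implicit predicate and an explicit one select the same elements
implicit <- Filter(nrow, toy)
explicit <- Filter(function(z) nrow(z) > 0, toy)
identical(names(implicit), names(explicit))
```

The explicit form is a little longer but makes the condition visible to the reader. Note that both forms drop every zero-row entry, so they do not distinguish a missing account from a user with no friends.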
Sotos