76

I have a vector of lists and I call unlist on it. Some of the elements are NULL, and unlist seems to drop them.

How can I prevent this?

Here's a simple (non-)working example showing this unwanted behaviour of unlist:

a = c(list("p1"=2, "p2"=5), 
      list("p1"=3, "p2"=4), 
      list("p1"=NULL, "p2"=NULL), 
      list("p1"=4, "p2"=5))
unlist(a)
# p1 p2 p1 p2 p1 p2 
#  2  5  3  4  4  5 
zx8754
nico
  • It is confusing that `unlist` doesn't give any warning for `NULL` values and just drops them. I think it would be useful to throw an informative warning about NULLs, especially since you may go on to use the results, some recycling of the vectors might take place, and you end up with unexpected "correct" results, spinning around in circles trying to debug things while struggling to keep your sanity in balance :D – Valentin_Ștefan Nov 21 '22 at 18:35

4 Answers

65

In this case (a list only one level deep) this should work too:

a[sapply(a, is.null)] <- NA
unlist(a)
# p1 p2 p1 p2 p1 p2 p1 p2 
#  2  5  3  4 NA NA  4  5
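For lists nested more than one level deep, the same idea can be applied recursively. The following is only a sketch: `null_to_na` and the `nested` example list are hypothetical names introduced here for illustration, not part of the original answer.

```r
# Hypothetical helper: replace every NULL in a possibly nested list
# with NA so that unlist() keeps a slot for it.
null_to_na <- function(x) {
  if (is.null(x)) return(NA)                         # NULL leaf becomes NA
  if (is.list(x)) return(lapply(x, null_to_na))      # recurse into sub-lists
  x                                                  # leave other values alone
}

# Made-up two-level list with NULL leaves
nested <- list(g1 = list(p1 = 2, p2 = NULL),
               g2 = list(p1 = NULL, p2 = 5))

flat <- unlist(null_to_na(nested))
flat
```

Because `lapply` preserves names, `unlist` builds the usual dotted names (for example `g1.p2`) for the flattened result.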
Marek
36

The issue here is that you can't have NULL in the middle of a vector. For example:

> c(1,NULL,3)
[1] 1 3
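The reason can be checked directly from the lengths involved: NULL has length zero, so it contributes nothing when concatenated, whereas NA is a length-one placeholder. A small sketch (the variable names `x` and `y` are just for illustration):

```r
length(NULL)   # zero-length: nothing to concatenate
length(NA)     # length one: occupies a slot

x <- c(1, NULL, 3)   # the NULL vanishes
y <- c(1, NA, 3)     # the NA keeps its position

length(x)
length(y)
```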

You can have NA in the middle, though. You could convert the list to character and then back to numeric, which turns the NULL values into NA (with a warning):

> b <- as.numeric(as.character(a))
Warning message:
NAs introduced by coercion 

Then put the names back in, because they were dropped by the previous operation:

> names(b) <- names(a)
> b
p1 p2 p1 p2 p1 p2 p1 p2 
2  5  3  4 NA NA  4  5
Fojtasek
  • 5
    On 3.2.2, it looks like as.numeric(as.character(NULL)) returns numeric(0). A new approach might be to use lapply(b, function(x) ifelse(is.null(x), NA, x)) – cylondude Sep 08 '15 at 23:28
  • For the approach suggested by @cylondude, use sapply instead of lapply to get a vector instead of a list (lapply always returns a list and has no simplify argument). – nelliott Oct 24 '18 at 16:42
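The approach from these comments could look roughly like the sketch below, applied to the list `a` from the question. It uses `vapply` rather than `sapply` so the result type is fixed to numeric, and a plain `if`/`else` instead of `ifelse` since each element is tested on its own; both choices are assumptions made here, not something the commenters specified.

```r
# The list from the question, with two NULL elements
a <- c(list(p1 = 2, p2 = 5),
       list(p1 = 3, p2 = 4),
       list(p1 = NULL, p2 = NULL),
       list(p1 = 4, p2 = 5))

# Map each element to a length-one numeric, using NA for NULL.
# vapply() keeps the names of `a` and returns a vector, not a list.
b <- vapply(a, function(x) if (is.null(x)) NA_real_ else x, numeric(1))
b
```

Unlike the character round trip, this raises no coercion warning and keeps the names in one step.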
1

If you are dealing with a long, complex JSON document with several levels, you should give this a try:

I extracted game log data from nba.com/stats web site. The problem is, some players have a NULL value for 3 point free throws (mostly centers) and jsonlite::fromJSON seems to handle NULL values very well:

#### Player game logs URL: one record per player per game played ####
gameLogsURL <- paste("http://stats.nba.com/stats/leaguegamelog?Counter=1000&Direction=DESC&LeagueID=00&PlayerOrTeam=P&Season=2016-17&SeasonType=Regular+Season&Sorter=PTS")

#### Import game logs data from JSON ####
# use jsonlite::fromJSON to handle NULL values
gameLogsData <- jsonlite::fromJSON(gameLogsURL, simplifyDataFrame = TRUE)
# Save into a data frame and add column names
gameLogs <- data.frame(gameLogsData$resultSets$rowSet)
colnames(gameLogs) <- gameLogsData$resultSets$headers[[1]]
elmaroto10
-4

The correct way to indicate a missing value is NA (not NULL). Here is another version that is working:

a = c(list("p1"=2, "p2"=5),
      list("p1"=3, "p2"=4),
      list("p1"=NA, "p2"=NA),
      list("p1"=4, "p2"=5))
unlist(a)
# p1 p2 p1 p2 p1 p2 p1 p2 
#  2  5  3  4 NA NA  4  5 
gd047
  • 2
    thanks for the answer. Obviously I do not define the list by hand, it is returned by a function. Anyway changing the NULLs to NA before `unlist` seemed to do the trick. – nico Jun 07 '10 at 18:47
  • @nico If it's your function then you might consider rewriting it to return `NA` instead of `NULL`. Take a look at the help pages for `NA` and `NULL` to see the differences between these two objects. – Marek Jun 07 '10 at 21:03
  • 2
    @Marek: No, it actually is a list returned by applying `coef` on a list of objects returned by `nls`. Some of these objects are NULL and `coef(NULL)` returns `NULL`... – nico Jun 07 '10 at 21:41
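The situation described in this last comment can be sketched as follows. This is an illustration under stated assumptions: `lm` on the built-in `cars` data stands in for the `nls` fits, the `fits` list and the padding with one NA per coefficient are made up for this example, and the NULL check is done explicitly rather than relying on what `coef` does with NULL.

```r
# Stand-in for a list of model fits where one fit is missing (NULL).
# lm() on the built-in `cars` data replaces nls() to keep this runnable.
fits <- list(lm(dist ~ speed, data = cars),
             NULL,
             lm(dist ~ speed, data = cars[1:25, ]))

# Assumed layout: every fit has the same two coefficients,
# so a missing fit is padded with one NA per coefficient.
na_coefs <- c("(Intercept)" = NA_real_, speed = NA_real_)

coefs <- lapply(fits, function(fit) if (is.null(fit)) na_coefs else coef(fit))
res <- unlist(coefs)
res
```

Padding per coefficient keeps every fit at the same width, so the flattened vector can later be reshaped, for example with `matrix(res, ncol = 2, byrow = TRUE)`.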