15

Are there any example datasets in base R that contain missing values? I've been looking through each one in turn and have also searched Google, with nothing so far.

library(MASS)
data()

Edit: I know how to add missing values to a dataset in R, I just want to know if any such datasets exist.

John_dydx

3 Answers

28

airquality is in base R (it ships in the datasets package) and has some NAs in it:

> summary(airquality)
     Ozone           Solar.R           Wind             Temp           Month            Day      
 Min.   :  1.00   Min.   :  7.0   Min.   : 1.700   Min.   :56.00   Min.   :5.000   Min.   : 1.0  
 1st Qu.: 18.00   1st Qu.:115.8   1st Qu.: 7.400   1st Qu.:72.00   1st Qu.:6.000   1st Qu.: 8.0  
 Median : 31.50   Median :205.0   Median : 9.700   Median :79.00   Median :7.000   Median :16.0  
 Mean   : 42.13   Mean   :185.9   Mean   : 9.958   Mean   :77.88   Mean   :6.993   Mean   :15.8  
 3rd Qu.: 63.25   3rd Qu.:258.8   3rd Qu.:11.500   3rd Qu.:85.00   3rd Qu.:8.000   3rd Qu.:23.0  
 Max.   :168.00   Max.   :334.0   Max.   :20.700   Max.   :97.00   Max.   :9.000   Max.   :31.0  
 NA's   :37       NA's   :7                                                                      
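
If it helps, here is a minimal sketch of how one might count the NAs per column and scan the rest of the datasets package for other candidates. The helper names (`na_counts`, `item_names`, `has_na`) are made up for illustration, and the scan assumes the "Item" column returned by `data()` may carry a parenthetical suffix such as "beaver1 (beavers)" that has to be stripped first.

```r
# Count missing values per column of airquality
na_counts <- colSums(is.na(airquality))
na_counts

# List every dataset shipped in the datasets package
items <- data(package = "datasets")$results[, "Item"]
item_names <- sub(" .*", "", items)  # drop suffixes like " (beavers)"

# Flag the datasets that contain at least one NA
has_na <- vapply(item_names, function(nm) {
  obj <- getExportedValue("datasets", nm)
  anyNA(obj, recursive = TRUE)
}, logical(1))

names(has_na)[has_na]
```

The last line prints the names of the datasets with at least one NA; airquality should be among them, and the exact list may vary with the R version.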
pdb
5

The VIM package has some nice examples of datasets with missing data. I use the sleep dataset from that package when I teach missing-value imputation.
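
One thing to watch for: base R also ships a different dataset called sleep (in the datasets package), and that one has no missing values. A rough sketch of loading the VIM version without clobbering the base one, assuming VIM is installed; `vim_env` and `vim_sleep` are just illustrative names:

```r
# The base R dataset of the same name has no NAs
anyNA(datasets::sleep)

# Load VIM's sleep into its own environment so the two do not clash
if (requireNamespace("VIM", quietly = TRUE)) {
  vim_env <- new.env()
  data("sleep", package = "VIM", envir = vim_env)
  vim_sleep <- vim_env$sleep
  print(colSums(is.na(vim_sleep)))  # NA count per column
}
```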

2

I would create my own numerical dataset with NAs. Here is one way to create a 10x10 data.frame called df and replace values above 80 with NA.

df <- data.frame(matrix(data = sample(100,100,replace=TRUE), ncol = 10))
df[df>80] <- NA

Bonus: you can then inspect the NAs visually using the visdat package.

library(visdat)
vis_miss(df)
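
If visdat is not available, a base R summary gives a similar picture in numbers. This is only a sketch: the seed is an arbitrary assumption added so the result is reproducible, and `na_per_column` and `na_share` are illustrative names.

```r
set.seed(42)  # arbitrary seed, assumed here only for reproducibility

# Same construction as above: 10x10 integers in 1..100, values above 80 set to NA
df <- data.frame(matrix(sample(100, 100, replace = TRUE), ncol = 10))
df[df > 80] <- NA

na_per_column <- colSums(is.na(df))  # number of NAs in each column
na_share <- mean(is.na(df))          # overall fraction of missing cells

na_per_column
na_share
```

Since roughly 20 of the 100 possible values are above 80, about a fifth of the cells should end up missing on average.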
Masood Sadat