Set NA and "" Cells in R Dataframe to NULL

Question

I have a Dataframe in R where certain rows in some of the columns have values that are NA or empty strings "". I want to convert these to NULL values.

So I need any cells in my data frame that are NA or "" to be NULL. How can I do this?

When I try:

DF[ , DF$Column == NA] <- NULL

or

DF[ , DF$Column == ""] <- NULL

I get the error: missing values are not allowed in subscripted assignments of data frames

If I try:

DF[ , is.na(DF$Column)] <- NULL

I get the error: duplicate subscripts for columns

If I try:

is.na(DF$Column) <- NULL

or

DF[DF == NA] <- NULL

I dont get any errors, but nothing changes in my dataframe.

This is a misuse of a NULL value in R. Vectors (which is essentially a column in a data.frame) cannot contain a mixture of observed values and NULL values. In R, we prefer NA to NULL to represent missing data. NULL values are used to remove elements from columns even entire columns. — MrFlick, Oct 27 '14 at 22:14
Got it, thanks! So I will eventually be loading this data set into a SQL DB. Will SQL interpret NA as NULL? — brno792, Oct 27 '14 at 22:35
Depends how you do it. If you use `RODBC::sqlSave`, yeah, I'm pretty sure that's how it works. — Gregor Thomas, Oct 27 '14 at 22:37

score 1 · Answer 1 · answered Oct 28 '14 at 01:21

There really is no NULL value in a vector. NA is the placeholder. If you want to remove the entire column (which is what assigning NULL would do) when its values are all NA then this would succeed:

df[ , sapply(df, function(x) all(is.na(x)) ) ] <- NULL

If you want to construct an object where you keep only those rows with no NA values:

df[ apply(df,1, function(rw) !any(is.na(rw)) ) , ]

Set NA and "" Cells in R Dataframe to NULL

1 Answers1

Linked

Related