0

I have a dataframe where I want to replace all values in a column that contain the value '2018' with NULL.

I have a dataset where every value in a column is a list. There are NULLs included as well. One of the values is not a list and I want to replace it with a NULL. If I replace it with NA then the datatypes in that column are mixed.

If I have a column like below, how do I replace the value containing 2018 with NULL instead of NA?

 spend         actions 
 176.2         2018-02-24
166.66         list(action_type = c("landing_page_view", "link_click", "offsit...         
153.89         list(action_type = c("landing_page_view", "like", "link_click",...
156.54         list(action_type = c("landing_page_view", "like", "link_click",...
254.95         list(action_type = c("landing_page_view", "like", "link_click",...
   374         list(action_type = c("landing_page_view", "like", "link_click",...
353.29         list(action_type = c("landing_page_view", "like", "link_click",...
  0.41         NULL

Reproducible Example:

structure(list(spend = c("176.2", "166.66", "153.89", "156.54", 
"254.95", "374", "353.29", "0.41"), actions = list("2018-02-24", 
    structure(list(action_type = c("landing_page_view", "link_click", 
    "offsite_conversion.fb_pixel_add_to_cart", 
"offsite_conversion.fb_pixel_purchase", 
    "offsite_conversion.fb_pixel_search", 
"offsite_conversion.fb_pixel_view_content", 
    "post", "post_reaction", "page_engagement", "post_engagement", 
    "offsite_conversion"), value = c("179", "275", "212", "18", 
    "269", "1434", "1", "17", "293", "293", "1933")), .Names = c("action_type", 
    "value"), class = "data.frame", row.names = c(NA, 11L)), 
    structure(list(action_type = c("landing_page_view", "like", 
    "link_click", "offsite_conversion.fb_pixel_add_to_cart", 
    "offsite_conversion.fb_pixel_purchase", 
"offsite_conversion.fb_pixel_search", 
    "offsite_conversion.fb_pixel_view_content", "post_reaction", 
    "page_engagement", "post_engagement", "offsite_conversion"
    ), value = c("136", "3", "248", "101", "6", "237", "730", 
    "11", "262", "259", "1074")), .Names = c("action_type", "value"
    ), class = "data.frame", row.names = c(NA, 11L)), structure(list(
        action_type = c("landing_page_view", "like", "link_click", 
        "offsite_conversion.fb_pixel_add_to_cart", 
"offsite_conversion.fb_pixel_purchase", 
        "offsite_conversion.fb_pixel_search", 
"offsite_conversion.fb_pixel_view_content", 
        "post", "post_reaction", "page_engagement", "post_engagement", 
        "offsite_conversion"), value = c("95", "1", "156", "91", 
        "5", "83", "532", "1", "13", "171", "170", "711")), .Names = 
c("action_type", 
    "value"), class = "data.frame", row.names = c(NA, 12L)), 
    structure(list(action_type = c("landing_page_view", "like", 
    "link_click", "offsite_conversion.fb_pixel_add_to_cart", 
    "offsite_conversion.fb_pixel_purchase", 
"offsite_conversion.fb_pixel_search", 
    "offsite_conversion.fb_pixel_view_content", "post_reaction", 
    "page_engagement", "post_engagement", "offsite_conversion"
    ), value = c("178", "4", "243", "56", "4", "138", "437", 
    "19", "266", "262", "635")), .Names = c("action_type", "value"
    ), class = "data.frame", row.names = c(NA, 11L)), structure(list(
        action_type = c("landing_page_view", "like", "link_click", 
        "offsite_conversion.fb_pixel_add_to_cart", 
"offsite_conversion.fb_pixel_purchase", 
        "offsite_conversion.fb_pixel_search", 
"offsite_conversion.fb_pixel_view_content", 
        "post_reaction", "page_engagement", "post_engagement", 
        "offsite_conversion"), value = c("203", "2", "306", "105", 
        "7", "186", "954", "23", "331", "329", "1252")), .Names = 
c("action_type", 
    "value"), class = "data.frame", row.names = c(NA, 11L)), 
    structure(list(action_type = c("landing_page_view", "like", 
     "link_click", "offsite_conversion.fb_pixel_add_to_cart", 
    "offsite_conversion.fb_pixel_purchase", 
"offsite_conversion.fb_pixel_search", 
"offsite_conversion.fb_pixel_view_content", "post", "post_reaction", 
"page_engagement", "post_engagement", "offsite_conversion"
), value = c("241", "4", "320", "106", "3", "240", "789", 
"1", "17", "342", "338", "1138")), .Names = c("action_type", 
"value"), class = "data.frame", row.names = c(NA, 12L)), 
NULL)), .Names = c("spend", "actions"), row.names = c(NA, 
-8L), class = "data.frame")

My ultimate goal is to use this function with this dataset to make the action_types their own column. This function works when either a list or NULL is in the actions column:

fb_insights_all<-df %>%
  as.tibble() %>%
  filter(!map_lgl(actions, is.null)) %>%
  unnest() %>%
  right_join(select(df, -actions)) %>%
  spread(action_type, value)

Error: Each column must either be a list of vectors or a list of data frames [actions]
nak5120
  • 4,089
  • 4
  • 35
  • 94
  • This other post might help: https://stackoverflow.com/questions/26597573/set-na-and-cells-in-r-dataframe-to-null – Dania Mar 13 '18 at 22:14
  • I tried above and this is the error it gave me: `Error in `[<-.data.table`(`*tmp*`, , sapply(df, function(x) all(is.na(x))), : j must be vector of column name or positions` – nak5120 Mar 13 '18 at 22:23
  • 1
    *"I have a dataset where every value in a column is a list."* But why? That sounds awful. Could you please edit your question and provide some minimal sample data. I'm not sure I understand. – Maurits Evers Mar 13 '18 at 22:34
  • edited the question to have it make more sense – nak5120 Mar 13 '18 at 22:57
  • You can store a `NULL` in a `list`, in a `data.frame` column just fine - `dat$actions[1] <- list(NULL)` – thelatemail Mar 13 '18 at 23:02
  • And would that work with the final code below? I'm away from my computer and will check in about an hour – nak5120 Mar 13 '18 at 23:11
  • @NickKnauer - probably, try it and see. – thelatemail Mar 13 '18 at 23:13
  • I used this function and it didn't work: `dat$actions[1]<-ifelse(grepl("2018", dat$actions), list(NULL), dat$actions)` – nak5120 Mar 13 '18 at 23:55
  • In my actual dataset, there are multiple instances of a date appearing in that column – nak5120 Mar 13 '18 at 23:56
  • @MauritsEvers List columns are totally acceptable way to work in R. Data.frames fields are _already_ lists - the format just encourages use of simple vectors for a classic tablular layout. – geotheory Oct 02 '19 at 13:01
  • @geotheory Not sure why you comment on a post from 1.5 years ago. Yes, `list` columns in vanilla `data.frame`s are acceptable but are [inconvenient and often require some extra work](https://adv-r.hadley.nz/vectors-chap.html#tibble). `tibble`s (and other similar R structures, like `S4Vectors::DataFrame`s) expand on that, allowing for arbitrary R objects in columns. For what it's still worth, OP had a structure *"where every value in a column is a list"*, which is unnecessary and awkward. – Maurits Evers Oct 03 '19 at 01:00
  • I get your point. Certainly awkward, but I find data often arrives in quirky formats such as this, esp when you're parsing someone's strange idea of a sensible JSON structure or similar. – geotheory Oct 03 '19 at 10:27

1 Answers1

0

Without data to test this on, I'd try:

df$COL1<-ifelse(grepl("2018", df$COL1),"NULL",df$COL1)

As stated here NA functions more like what you seem to be trying to do, while NULL serves a different function. If you just want the value to just say "NULL" rather than function like NULL, treat it like a character value.

AnnVyrgyl
  • 56
  • 6