I have 16 dataframes I am trying to quality check and delete poor quality rows in R. I already know of lapply() and have used it for simpler wrangling problems to apply the same thing to all my dataframes at once, but for whatever reason I'm having a mental block currently.
The format of each individual dataframe is like so, where every other column contains a "flags" column. The flags column contains strings of values. If any of the values in the string are a 4, I want to filter those rows out of the dataframe.
head(df)
timestamp wind_speed_max wind_speed_max_flags wind_speed_mean
1 UTC meters per second NAN meters per second
2 data logger Airmar WS-200WX NAN Airmar WS-200WX
3 6/2/2015 15:46 7.6 1 1 4 1 5.12
4 6/2/2015 16:01 7.2 1 1 1 1 5.16
5 6/2/2015 16:16 8.1 1 1 1 1 5.97
6 6/2/2015 16:31 8.5 1 1 1 1 5.909
wind_speed_mean_flags wind_direction_mean wind_direction_mean_flags
1 NAN degrees NAN
2 NAN Airmar WS-200WX NAN
3 1 1 1 1 57.14 1 2 1 2
4 1 1 1 1 61.64 1 2 1 4
5 1 1 1 1 68 1 2 1 2
6 4 1 1 1 73.14 1 2 1 2
I know I can try to grep("flags") for the column names, and I also think I could use a similar grep method to filter out the strings containing a 4? Perhaps using some Boolean operators. But I am struggling to piece all of this together to retain the rest of the data, and to ideally perform this at the same time for all 16 dataframes for example lapply(df_list, function(x) <insert code that can filter out flags with 4s for each x dataframe>)