
I am importing a data frame into R. I want to set a condition so that an error is flagged if the column headers do not match all of the following names:

Data_design <- c("age", "sex", "font")

The error should be raised from an if statement.

For instance, suppose I import the following table (busa) with the column headers age, sex, location and geography.

The condition should give an error, because the table does not have the required column headers, which are age, sex and font.

Can anyone please help me with this? The code I have written so far is below:

if (sum(str_sort(colnames(busa)) == str_sort(Data_design)) <length(Data_design)){
print("Failed import structure test")
break
}

But I am getting an error.

Jonatino

2 Answers


If I understand correctly, you want a test that passes only if all of the data frame's column names are in your Data_design. Is that right?

If so you can check if the column names are in your list as follows:

names(busa) %in% Data_design

[1]  TRUE  TRUE FALSE FALSE

To distil that to a single test:

all(names(busa) %in% Data_design)
[1] FALSE
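
The check above asks whether every column of busa appears in Data_design. The question's example reads as if the reverse is also of interest, namely whether every required name is present in the table. A minimal sketch of that direction follows, assuming a small stand-in data frame for busa and a hypothetical helper named check_import_structure (neither is from the original post). It uses stop() rather than break, since break is only valid inside a loop.

```r
# Required headers, as given in the question
Data_design <- c("age", "sex", "font")

# Stand-in for the imported table (values are made up for illustration)
busa <- data.frame(age = 30, sex = "F", location = "Leeds", geography = "urban")

# Hypothetical helper: raises an error if any required column is absent from `df`
check_import_structure <- function(df, required) {
  missing <- setdiff(required, names(df))  # required names not found in df
  if (length(missing) > 0) {
    stop("Failed import structure test; missing: ",
         paste(missing, collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

# tryCatch() captures the error message so this example does not abort the session
result <- tryCatch(
  check_import_structure(busa, Data_design),
  error = function(e) conditionMessage(e)
)
print(result)
```

If the headers must match exactly, with no extra columns either, setequal(names(busa), Data_design) would be one way to test both directions at once.
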
Paul Stafford Allen

You could try grepl. The pattern below matches "age", "sex" or "font" only when one of them is the entire column name; for any other name grepl returns FALSE. Then invert the result with ! and raise a warning if any column name fails to match.

# make "^age$|^sex$|^font$"
patternSearch <- paste0("^",Data_design, "$", collapse = "|")

if(any(!grepl(patternSearch, x = names(busa)))) {
  warning("Failed import structure test")
}
# Warning message:
# Failed import structure test 

EDIT: To be fair, Paul's solution is the faster, more elegant way to solve the described problem.
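
As a rough sketch of how the two approaches relate, the snippet below runs both on the headers from the question and compares the results. The vector header stands in for names(busa), and the names first.name and first_name are hypothetical, used only to illustrate one case where a regex and an exact match can disagree.

```r
Data_design <- c("age", "sex", "font")

# Stand-in for names(busa) from the question
header <- c("age", "sex", "location", "geography")

# Anchored alternation: "^age$|^sex$|^font$"
patternSearch <- paste0("^", Data_design, "$", collapse = "|")

by_regex <- grepl(patternSearch, header)  # regex-based membership
by_match <- header %in% Data_design       # exact-match membership

# Both give the same logical vector for these headers
same <- identical(by_regex, by_match)
print(same)

# Caveat: regex metacharacters in a required name are not taken literally.
# "." matches any single character, so the regex accepts a name %in% rejects.
regex_hit <- grepl("^first.name$", "first_name")
exact_hit <- "first_name" %in% "first.name"
print(c(regex = regex_hit, exact = exact_hit))
```

For plain names like those in Data_design the two agree, so the choice mostly comes down to readability; %in% avoids having to think about escaping.
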

dparthier