I have sentences broken into keywords, and I want to identify which keywords most commonly occur together. Suppose I have the following dataset:
# tibble() replaces the deprecated data_frame(); requires the tibble package
df <- tibble::tibble(
  word1 = c("director", "John", "Peter", "financial", "setting", "board"),
  word2 = c("board", "seat", "outsider", "independent", "irrelevant", "dissident"),
  word3 = c("independent", "director", "flaw", "yes", "oversight", "John"),
  word4 = c("outsider", "independent", "dependence", "poorly", "material", "seat"),
  n = c(6, 3, 2, 2, 1, 1))
For every row that contains the keyword "director" in any of the columns word1 to word4, I want to identify which other words appear in that row and how often (column "n"). This analysis should yield something similar to:
# tibble() replaces the deprecated data_frame(); requires the tibble package
director_analysis <- tibble::tibble(
  test1 = c("director", "director", "director", "director", "director", "director"),
  test2 = c("board", "independent", "outsider", "John", "seat", "independent"),
  n = c(6, 6, 6, 3, 3, 3))
This last data frame lists, for each row that contains "director" (in whichever column), the strings found in the other word columns of that row together with the row's "n" value.
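To make the intended logic concrete, here is a verbose base R sketch of what I mean. The names `keyword`, `word_cols`, `hits` and `director_sketch` are placeholders I made up, and I rebuild the data as a plain data.frame only so that the snippet runs without extra packages. It illustrates the result I want and is not the concise solution I am after:

```r
# Toy data copied from above, as a base data.frame (assumption: no packages loaded).
df <- data.frame(
  word1 = c("director", "John", "Peter", "financial", "setting", "board"),
  word2 = c("board", "seat", "outsider", "independent", "irrelevant", "dissident"),
  word3 = c("independent", "director", "flaw", "yes", "oversight", "John"),
  word4 = c("outsider", "independent", "dependence", "poorly", "material", "seat"),
  n     = c(6, 3, 2, 2, 1, 1),
  stringsAsFactors = FALSE
)

keyword   <- "director"                            # hypothetical name
word_cols <- c("word1", "word2", "word3", "word4") # hypothetical name

# Keep only the rows where at least one word column equals the keyword.
hits <- df[rowSums(df[word_cols] == keyword) > 0, ]

# Transpose so that c() walks row by row (word1..word4 of the first hit, then the next).
words  <- c(t(as.matrix(hits[word_cols])))
counts <- rep(hits$n, each = length(word_cols))

# Drop the keyword itself and pair it with every remaining word of the same row.
keep <- words != keyword
director_sketch <- data.frame(
  test1 = keyword,
  test2 = words[keep],
  n     = counts[keep],
  stringsAsFactors = FALSE
)

print(director_sketch)
```

This works for the toy data, but it relies on manual reshaping and on each row containing the keyword at most once, which is why I suspect there is a shorter idiom.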
What is the most concise way of doing this?