I have a data frame of association rules, where each row contains one rule (LHS => RHS) together with its support, confidence and lift. Here's the data:
structure(list(rules = structure(c(13L, 4L, 28L, 1L, 24L, 15L
), .Label = c("{butter,jam} => {whole milk}", "{butter,rice} => {whole milk}",
"{canned fish,hygiene articles} => {whole milk}", "{curd,cereals} => {whole milk}",
"{domestic eggs,rice} => {whole milk}", "{grapes,onions} => {other vegetables}",
"{hamburger meat,bottled beer} => {whole milk}", "{hamburger meat,curd} => {whole milk}",
"{hard cheese,oil} => {other vegetables}", "{herbs,fruit/vegetable juice} => {other vegetables}",
"{herbs,rolls/buns} => {whole milk}", "{herbs,shopping bags} => {other vegetables}",
"{liquor,red/blush wine} => {bottled beer}", "{meat,margarine} => {other vegetables}",
"{napkins,house keeping products} => {whole milk}", "{oil,mustard} => {whole milk}",
"{onions,butter milk} => {other vegetables}", "{onions,waffles} => {other vegetables}",
"{pastry,sweet spreads} => {whole milk}", "{pickled vegetables,chocolate} => {whole milk}",
"{pork,butter milk} => {other vegetables}", "{rice,bottled water} => {whole milk}",
"{rice,sugar} => {whole milk}", "{soups,bottled beer} => {whole milk}",
"{tropical fruit,herbs} => {whole milk}", "{turkey,curd} => {other vegetables}",
"{whipped/sour cream,house keeping products} => {whole milk}",
"{yogurt,cereals} => {whole milk}", "{yogurt,rice} => {other vegetables}"
), class = "factor"), support = c(0.00193187595322827, 0.00101677681748856,
0.00172852058973055, 0.00101677681748856, 0.00111845449923742,
0.00132180986273513), confidence = c(0.904761904761905, 0.909090909090909,
0.80952380952381, 0.833333333333333, 0.916666666666667, 0.8125
), lift = c(11.2352693602694, 3.55786275006331, 3.16819206791352,
3.26137418755803, 3.58751160631383, 3.17983983286908)), .Names = c("rules",
"support", "confidence", "lift"), row.names = c(NA, 6L), class = "data.frame")
What I need is to restructure these rules into a wide format, where each item that appears in the LHS of any rule gets its own column, with a value of 1 if the rule has that item in its LHS and 0 otherwise. The same goes for the RHS of the rules. E.g., taking the first 2 rules:
{liquor,red/blush wine} => {bottled beer} 0.0019 0.90 11.2
{curd,cereals} => {whole milk} 0.0010 0.91 3.6
The result should be a data frame that looks like:
'rules_id' 'lhs_liquor' 'lhs_red/blush wine' 'lhs_curd' 'lhs_cereals' 'rhs_bottled beer' 'rhs_whole milk' 'support' 'confidence' 'lift'
1 1 1 0 0 1 0 0.0019 0.90 11.2
2 0 0 1 1 0 1 0.0010 0.91 3.6
As I am new to R and Stack Overflow, please let me know if the question is not well defined. Any help is appreciated.
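To make the target layout concrete, here is a minimal base R sketch of the transformation described above, applied to the first two rules only. It assumes that every rule string has the exact form `{a,b} => {c}` and that item names contain no commas. The helper names `split_items` and `indicator_matrix` are hypothetical, introduced only for illustration, and this is one possible approach rather than a definitive one.

```r
# Stand-in for the data frame in the question (first two rules, rounded values).
df <- data.frame(
  rules = c("{liquor,red/blush wine} => {bottled beer}",
            "{curd,cereals} => {whole milk}"),
  support = c(0.0019, 0.0010),
  confidence = c(0.90, 0.91),
  lift = c(11.2, 3.6),
  stringsAsFactors = FALSE
)

# Hypothetical helper: strip the braces from "{a,b}" and split on commas.
# Returns a list with one character vector of items per rule.
split_items <- function(side) {
  strsplit(gsub("[{}]", "", side), ",", fixed = TRUE)
}

# Hypothetical helper: build a 0/1 matrix with one row per rule and
# one column per distinct item, in order of first appearance.
indicator_matrix <- function(item_list, prefix) {
  items <- unique(unlist(item_list))
  m <- do.call(rbind, lapply(item_list, function(x) as.integer(items %in% x)))
  colnames(m) <- paste0(prefix, items)
  m
}

# Split each rule into its LHS and RHS strings (as.character handles factors).
parts <- strsplit(as.character(df$rules), " => ", fixed = TRUE)
lhs <- split_items(vapply(parts, `[`, character(1), 1))
rhs <- split_items(vapply(parts, `[`, character(1), 2))

# check.names = FALSE keeps column names such as "lhs_red/blush wine" intact.
wide <- data.frame(
  rules_id = seq_len(nrow(df)),
  indicator_matrix(lhs, "lhs_"),
  indicator_matrix(rhs, "rhs_"),
  df[c("support", "confidence", "lift")],
  check.names = FALSE
)

print(wide)
```

Because the column names contain spaces and slashes, they need `[[ ]]` or backticks when accessed, e.g. `wide[["lhs_red/blush wine"]]`. The same code should work on the full data frame from the `dput` output, since `as.character()` converts the factor column to strings.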