R normalize a dataset

Question

I have a dataset that looks like this:

> dput(events.seq)
structure(list(vid = structure(1L, .Label = "2a38ebc2-dd97-43c8-9726-59c247854df5", class = "factor"), 
    deltas = structure(1L, .Label = "38479,38488,38492,38775,45595,45602,45606,45987,50280,50285,50288,50646,54995,55001,55005,55317,59528,59533,59537,59921,63392,63403,63408,63822,66706,66710,66716,67002,73750,73755,73759,74158,77999,78003,78006,78076,81360,81367,81371,82381,93365,93370,93374,93872,154875,154878,154880,154880,155866,155870", class = "factor"), 
    events = structure(1L, .Label = "mousemove,mousedown,mouseup,click,mousemove,mousedown,mouseup,click,mousemove,mousedown,mouseup,click,mousemove,mousedown,mouseup,click,mousemove,mousedown,mouseup,click,mousemove,mousedown,mouseup,click,mousemove,mousedown,mouseup,click,mousemove,mousedown,mouseup,click,mousemove,mousedown,mouseup,click,mousemove,mousedown,mouseup,click,mousemove,mousedown,mouseup,click,mousemove,mousedown,mouseup,click,mousemove,mousedown", class = "factor")), .Names = c("vid", 
"deltas", "events"), class = "data.frame", row.names = c(NA, 
-1L))

I need to normalize it to this structure:

> dput(test)
structure(list(vid = structure(c(1L, 1L, 1L), .Label = "2a38ebc2-dd97-43c8-9726-59c247854df5\n+ ", class = "factor"), 
    delta = c(38479, 38488, 38492), c..mousemove....mousedown....mousup.. = structure(c(2L, 
    1L, 3L), .Label = c("mousedown", "mousemove", "mousup"), class = "factor")), .Names = c("vid", 
"delta", "c..mousemove....mousedown....mousup.."), row.names = c(NA, 
-3L), class = "data.frame")

Any help appreciated. I tried strsplit, but the problem is that I need to split the second and third columns at the same time (their lists always have the same length).

score 0 · Answer 1 · answered Apr 17 '16 at 11:06

Try this:

z <- with(events.seq, data.frame(
  # [[1]] takes the split result for the first (and here only) row
  deltas = strsplit(as.character(deltas), split = ",")[[1]],
  events = strsplit(as.character(events), split = ",")[[1]]
))
head(z)

The result:

  deltas    events
1  38479 mousemove
2  38488 mousedown
3  38492   mouseup
4  38775     click
5  45595 mousemove
6  45602 mousedown

Andrie

I also needed the vid, so I added this line: z <- with(events.seq, data.frame(vid = rep(vid, length(strsplit(as.character(deltas), split = ",")[[1]])), deltas = strsplit(as.character(deltas), split = ",")[[1]], events = strsplit(as.character(events), ",")[[1]])). However, I ran into an ordering issue: after normalization the order is scrambled. I need to keep the source ordering of the events (according to their relative position in the original lists). Any idea how to keep the source ordering? – Nir Regev Apr 17 '16 at 11:46
BTW, @Andrie, why did you add [[1]]? That restricts the parsing to only the first row in the df. – Nir Regev Apr 17 '16 at 13:39
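
On the two points raised in these comments, here is a hedged sketch of one way to handle every row rather than only the first. strsplit() returns a list with one character vector per input row, so dropping [[1]] and repeating each vid by the length of its list covers the whole data frame. The two-row input and the seq column are hypothetical, added for illustration. The thread does not confirm the cause of the scrambled order; one possibility is a later sort on deltas stored as strings or factors, which is why the sketch converts them to numbers and records each event's original position.

```r
# Hypothetical two-row input in the same shape as the question's data
events.seq <- data.frame(
  vid    = c("a", "b"),
  deltas = c("10,20,30", "5,15"),
  events = c("mousemove,mousedown,mouseup", "mousemove,click"),
  stringsAsFactors = FALSE
)

# One character vector per row, each in source order
delta_list <- strsplit(as.character(events.seq$deltas), ",", fixed = TRUE)
event_list <- strsplit(as.character(events.seq$events), ",", fixed = TRUE)

# The question says both columns always have matching lengths; verify it
stopifnot(identical(lengths(delta_list), lengths(event_list)))

long <- data.frame(
  vid   = rep(as.character(events.seq$vid), times = lengths(delta_list)),
  seq   = sequence(lengths(delta_list)),  # position within the original list
  delta = as.numeric(unlist(delta_list)),
  event = unlist(event_list),
  stringsAsFactors = FALSE
)
long
```

If the rows are reordered later, sorting with long[order(long$vid, long$seq), ] restores the original sequence within each vid.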
