I am dealing with time-stamped event sequences that are 300+ events long. The data is similar to web logs, where users hit different pages of a website at different times. One sequence may be one web session, and each event is a user action (visit page, click button, etc.).

I first used the TSE format. When I tried to find subsequences with seqefsub(), TraMineR hung. Setting maxK = 5 (which limits the subsequences searched for to at most 5 events) made it work, but maxK = 6 or higher hangs again. I am not sure why there is such a sudden drop-off. Also, when I pruned the event sequences to only 15 events each, everything completed fine. So event sequence length is clearly an issue here.

Is there a different format that is more robust to sequence length, e.g. STS? Are there any other recommendations for dealing with sequences of this length in TraMineR?

jojo

1 Answer

The problem has nothing to do with the format used to enter the sequences.

TraMineR has only a rudimentary algorithm for searching subsequences.
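
To give a rough sense of why sequence length and the subsequence-length cap matter so much, here is a small Python sketch. It only counts the ways to choose up to k event positions out of a sequence of n events, which is an upper bound on the candidates a naive enumeration would have to consider. It is not TraMineR's actual algorithm (which also filters by minimum support), and the function name `count_position_subsets` is made up for this illustration.

```python
from math import comb


def count_position_subsets(n_events: int, max_k: int) -> int:
    """Upper bound on candidate subsequences: the number of ways to pick
    between 1 and max_k event positions out of n_events, keeping order.

    Hypothetical helper for illustration only, not part of TraMineR.
    """
    return sum(comb(n_events, k) for k in range(1, max_k + 1))


if __name__ == "__main__":
    # Compare a pruned 15-event sequence with a 300-event sequence
    # for the two subsequence-length caps mentioned in the question.
    for n_events in (15, 300):
        for max_k in (5, 6):
            total = count_position_subsets(n_events, max_k)
            print(f"n={n_events:>3}, max_k={max_k}: {total:,} candidates at most")
```

Under this simplified count, a 15-event sequence has only a few thousand candidates at either cap, while a 300-event sequence already has tens of billions at a cap of 5 and grows by a further factor of roughly 50 when the cap goes to 6. That matches the pattern in the question: short sequences finish, and the jump from 5 to 6 is enough to make the search appear to hang.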

I would suggest you look at more appropriate tools for your problem. Consider for instance the R package arulesSequences.

Gilbert
  • The implication being that it deals better with longer sequences? Thanks, I'll give it a try. – jojo Aug 28 '16 at 17:57