7

I am planning to use LibSVM to predict user authenticity in web applications. My plan is to (1) collect data on particular user behavior (e.g. login time, IP address, country), (2) use the collected data to train an SVM, and (3) compare real-time data against the trained model to output a level of authenticity.

Can someone tell me how I can do this with LibSVM? Would Weka be helpful for this type of problem?

Fred Foo
ruwanego
  • Yes, Weka can be helpful, as it allows you to explore machine learning. Do you have any experience in that field? – Fred Foo Mar 10 '11 at 18:10
  • I am not very experienced in that. Can anybody tell me what I need to do here, perhaps the steps I need to go through to perform such a task? – ruwanego Mar 10 '11 at 18:50

1 Answer

5

The three steps you mention are an outline of the solution. In some more detail:

  1. Make sure you get plenty of labeled data, i.e. behavior logs annotated with authentic/non-authentic. (Without labeled data, you get into the pretty advanced field of semisupervised learning, or must consider other solutions.)
  2. Design a number of features, based on the data, that you think predict authenticity well. Train the classifier, evaluate it, and refine the features until it works well enough by some statistical standard. Use ten-fold cross-validation to make sure you're not overfitting.
  3. For the real-time step, LibSVM can output a probability estimate along with its predicted class; see section 8 of its manual.
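Steps 1 and 2 can be sketched in plain Python. This is only an illustration under assumptions: the record fields (`login_hour`, `country`, `known_ip`), the `COUNTRIES` vocabulary, and the helper names `encode` and `k_fold_indices` are all hypothetical and not part of LibSVM. It shows one way to turn a login record into numeric features and to split the data into ten folds for cross-validation.

```python
import math
import random

# Hypothetical vocabulary of countries seen in the training logs.
COUNTRIES = ["LK", "US", "DE"]

def encode(record):
    """Turn one login record (a dict) into a list of floats."""
    # Encode the hour on a circle so that 23:00 and 00:00 end up close together.
    angle = 2 * math.pi * record["login_hour"] / 24
    features = [math.sin(angle), math.cos(angle)]
    # One-hot encode the country (all zeros if the country is unknown).
    features += [1.0 if record["country"] == c else 0.0 for c in COUNTRIES]
    # Binary flag: has this user logged in from this IP address before?
    features.append(1.0 if record["known_ip"] else 0.0)
    return features

def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test
```

Each example lands in the test fold exactly once, so averaging the scores over the ten folds gives an estimate of how the classifier behaves on data it has not seen. Which features actually predict authenticity is something you have to find out on your own logs.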
Fred Foo
  • LibSVM can output probability estimates (run with flag -b 1) – Stompchicken Mar 11 '11 at 12:11
  • So, just to clarify: is the **probability estimate** the extent to which the captured instance (I'm talking about just **one** instance) matches the instances used for training? Or, in this context, is it the probability that the current user is the legitimate user? – ruwanego Mar 12 '11 at 18:03
  • It's an estimate, for each class (authentic or non-authentic) of the probability that the instance being classified belongs to that class. – Fred Foo Mar 12 '11 at 18:33
  • @ruwanego: if this answers your question then please click the checkmark next to my post. This is considered courtesy on SO and if you don't accept answers, people will be less willing to help you next time. – Fred Foo Mar 13 '11 at 16:52
  • Done! I also want to represent the above-mentioned data in a file to be fed into LibSVM. How should I format the file? Where can I find a good tutorial with examples of the LibSVM input format? – ruwanego Mar 13 '11 at 17:57
  • @larsmans: Thank you! Can you answer the question and give me your ideas on the matter? That would be really helpful! – ruwanego Mar 14 '11 at 14:42
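
On the input-format question raised in the comments: LibSVM reads a sparse text format with one example per line, written as `<label> <index>:<value> <index>:<value> ...`, where feature indices start at 1 and appear in ascending order, and zero-valued features may be left out. The sketch below writes such a file; the helper names and the choice of `1` for authentic and `-1` for non-authentic are assumptions made for illustration.

```python
def to_libsvm_line(label, features):
    """Format one example as '<label> <index>:<value> ...'.

    Indices are 1-based and ascending; zero-valued features are omitted.
    Values are written with up to 6 significant digits.
    """
    pairs = [f"{i}:{v:.6g}" for i, v in enumerate(features, start=1) if v != 0]
    return " ".join([str(label)] + pairs)

def write_libsvm_file(path, labels, rows):
    """Write one line per (label, feature vector) pair to `path`."""
    with open(path, "w") as fh:
        for label, row in zip(labels, rows):
            fh.write(to_libsvm_line(label, row) + "\n")
```

For example, `to_libsvm_line(1, [0.5, 0.0, 1.0])` gives `1 1:0.5 3:1`. A file written this way can then be passed to LibSVM's command-line tools, using the `-b 1` flag mentioned above to get probability estimates.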