
In a link prediction system, the initial matrix contains 1 for known links and 0 for unknown or missing entries. The output of matrix factorization is a predicted value for each missing entry. To calculate AUC from these predictions, I use the following process:

  1. Hide 20% of the known links (set 20% of the entries equal to 1 to 0 in the matrix)
  2. Sort the output of the factorization and discard the indexes used for training (the remaining 80% of the 1 entries)
  3. Set N to the number of hidden values
  4. Take the N top predicted values and check whether each one is a hidden value (class label set to 1) or not (class label set to 0)
  5. Compute AUC using the N top predictions
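
To make the steps above concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions, not a reference implementation: the function names (`hide_links`, `label_candidates`, `auc`) are hypothetical, the matrices are small nested lists, and `scores` stands in for whatever the factorization returns. AUC is computed here as the fraction of (positive, negative) pairs in which the positive gets the higher score, with ties counted as one half.

```python
import random


def hide_links(matrix, fraction=0.2, seed=0):
    """Step 1: set a fraction of the 1 entries to 0 and remember which ones."""
    rng = random.Random(seed)
    known = [(i, j) for i, row in enumerate(matrix)
             for j, v in enumerate(row) if v == 1]
    n_hide = int(round(fraction * len(known)))
    hidden = set(rng.sample(known, n_hide))
    train = [[0 if (i, j) in hidden else v for j, v in enumerate(row)]
             for i, row in enumerate(matrix)]
    return train, hidden


def label_candidates(train, scores, hidden, top_n=None):
    """Steps 2-4: drop training links, sort by score, label hidden links 1."""
    candidates = [(scores[i][j], 1 if (i, j) in hidden else 0)
                  for i, row in enumerate(train)
                  for j, v in enumerate(row) if v == 0]
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    if top_n is not None:  # assumption: optional cut-off at the N top scores
        candidates = candidates[:top_n]
    labels = [label for _, label in candidates]
    values = [score for score, _ in candidates]
    return labels, values


def auc(labels, scores):
    """Step 5: pairwise AUC; returns None if only one class is present."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The `labels` and `values` lists returned by `label_candidates` have the same shape as the label and score vectors that `perfcurve` takes. Note that the sketch returns `None` when the selected candidates contain only one class, which can happen if the cut-off keeps only hidden links or only non-links.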

I know perfcurve in MATLAB computes AUC, but I need to be sure that the above process is the right way to produce the labelled data for perfcurve.

Any comment is really appreciated.

mkierc
nourani
  • Which step are you struggling with? – Kostya Nov 18 '14 at 10:42
  • I am not sure about choosing the N top scores from the m*k factorized values. – nourani Nov 18 '14 at 10:55
  • I am not sure about choosing the N top scores from, e.g., 1 million factorized values. Is it reasonable to expect the N top links to be the same as the hidden values? (We may have 1 million entries with 4000 known '1's, the others set to '0', and number of hidden values = N = 1000.) – nourani Nov 18 '14 at 11:03

0 Answers