
I want to calculate recall and precision, for which I need the number of correctly extracted data records, the number of incorrectly extracted data records, and the total number of data records.

I have input HTML pages, from which I extract useful data and generate an output HTML page using a wrapper.

user2314737
  • You need something to compare your output to. If you are evaluating a retrieval system, you need an 'ideal' (a.k.a. gold standard) set of retrieved documents to compare against. In this case you would need a set of correct data records, probably made by hand. – jksnw Apr 17 '15 at 16:11
  • Please write your question in detail and also explain what you are trying to do. – Nilesh Kikani Apr 18 '15 at 12:51
  • I agree with @Nilesh, there could be more detail added to your question. Also, why the down vote on the answer? Perhaps a comment on why or an edit if there is something wrong. – jksnw Apr 18 '15 at 19:25

1 Answer


To calculate how many correct data records have been extracted, you need a reference set of correct data records. The reference set is the ideal output that your system's output should match, and it is what you compare your output against. It is also called a "gold standard" set.

The reference set may be created by hand or, if a better IR system exists for your purpose, produced by that system.

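To calculate the number of correct data records extracted, count the records that appear in both your system's output and the gold standard. Records in your output that are not in the gold standard are the incorrect extractions, and the size of the gold standard is the total number of data records. Precision is then the number of correct records divided by the number of records you extracted, and recall is the number of correct records divided by the total number of records in the gold standard.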
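The counting described above can be sketched in Python roughly as follows. This is only a minimal illustration: it assumes each data record can be reduced to a hashable value (for example a normalized string) so that exact matching works, and the function name `precision_recall` and the sample records are hypothetical, not part of any wrapper or library.

```python
def precision_recall(extracted, gold):
    """Compare extracted records against a gold standard set.

    Assumes records are hashable and that two records are "the same"
    only when they are exactly equal (a simplifying assumption).
    """
    extracted_set = set(extracted)
    gold_set = set(gold)

    correct = extracted_set & gold_set    # in both output and gold standard
    incorrect = extracted_set - gold_set  # extracted but not in gold standard
    missed = gold_set - extracted_set     # in gold standard but not extracted

    # Guard against division by zero when either set is empty.
    precision = len(correct) / len(extracted_set) if extracted_set else 0.0
    recall = len(correct) / len(gold_set) if gold_set else 0.0

    return {
        "correct": len(correct),
        "incorrect": len(incorrect),
        "missed": len(missed),
        "precision": precision,
        "recall": recall,
    }


# Hypothetical example: 4 records extracted, 5 records in the gold standard.
system_output = ["record A", "record B", "record C", "record X"]
gold_standard = ["record A", "record B", "record C", "record D", "record E"]

result = precision_recall(system_output, gold_standard)
print(result["precision"])  # 0.75 (3 correct out of 4 extracted)
print(result["recall"])     # 0.6  (3 correct out of 5 in the gold standard)
```

If your records need fuzzy matching (for instance, differing whitespace or markup in the extracted HTML), you would have to normalize them before building the sets; exact set membership is just the simplest case.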

Community
jksnw