I have a problem: I can't calculate the tf-idf with my current code.

This is an example of tf-idf:

$tfidf = $term_frequency *  // tf
        log( $total_document_count / $documents_with_term, 2); // idf

I have the total number of documents, but I still need $documents_with_term and $term_frequency.
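To make the two missing values concrete, here is a minimal sketch in plain PHP of how they could be computed once the text of each document is available as a string. The helper names (tokenize, term_frequency, documents_with_term, tf_idf) and the sample $documents array are hypothetical and not part of my real code, and the tokenizer is a simple ASCII-only assumption.

```php
<?php
// Hypothetical helper: lowercase the text and split it on non-word characters.
function tokenize(string $text): array
{
    return preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

// $term_frequency: how many times $term occurs in a single document.
function term_frequency(string $term, string $document): int
{
    $counts = array_count_values(tokenize($document));
    return $counts[strtolower($term)] ?? 0;
}

// $documents_with_term: how many documents contain $term at least once.
function documents_with_term(string $term, array $documents): int
{
    $n = 0;
    foreach ($documents as $document) {
        if (term_frequency($term, $document) > 0) {
            $n++;
        }
    }
    return $n;
}

// Same formula as above: tf * log2(total_document_count / documents_with_term).
function tf_idf(string $term, string $document, array $documents): float
{
    $df = documents_with_term($term, $documents);
    if ($df === 0) {
        return 0.0; // term appears nowhere, avoid division by zero
    }
    return term_frequency($term, $document) * log(count($documents) / $df, 2);
}

// Sample data (assumption): each entry stands for the text of one row.
$documents = [
    'php search engine tutorial',
    'mysql full text search and search ranking',
    'cooking recipes',
    'travel blog',
];

echo term_frequency('search', $documents[1]), "\n";
echo documents_with_term('search', $documents), "\n";
echo tf_idf('search', $documents[1], $documents), "\n";
```

In this sample, "search" occurs twice in the second document and appears in 2 of the 4 documents, so the score is 2 × log2(4 / 2) = 2. Note that count($documents) here is the size of the whole collection, not only the documents that contain the term.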

This is my current code:

$frase = htmlspecialchars($_GET['frase'], ENT_NOQUOTES);

$sssql = $server_link->query("SELECT uDR.webTitulo, uDR.webDescripcion, uDR.webkeywords, uDR.weburl, SUM(uDR.priority) as SPriority
FROM (

(SELECT s1.webTitulo, s1.webDescripcion, s1.weburl, s1.webkeywords, $a as priority FROM webs s1 WHERE MATCH (webTitulo) AGAINST ('$frase'))

UNION

(SELECT s2.webTitulo, s2.webDescripcion, s2.weburl, s2.webkeywords, $b as priority FROM webs s2 WHERE MATCH (webkeywords) AGAINST ('$frase*' IN BOOLEAN MODE))

UNION

(SELECT s3.webTitulo, s3.webDescripcion, s3.weburl, s3.webkeywords, $c as priority FROM webs s3 WHERE MATCH (webDescripcion) AGAINST ('$frase'))) uDR

GROUP BY uDR.webTitulo, uDR.weburl, uDR.webDescripcion, uDR.webkeywords

ORDER BY SPriority DESC ");

$totalRows = $sssql->num_rows; //This is the $total_document_count

I have the $total_document_count, but I don't know how to extract $term_frequency (the TF) and $documents_with_term.

How can I extract them?

PSilvestre
  • Check out this answer: http://stackoverflow.com/questions/23030234/how-to-search-a-corpus-to-find-frequency-of-a-string/24374866#24374866 – batgirl Jul 29 '14 at 19:26

1 Answer

Community
Daniel Basedow