How to calculate relevance score?

Question

I am trying to calculate a relevance score for each word in the reviews stored in a JSON file. Every time I run my code, the only output is "indirect". What am I doing wrong?

My code is below:

import joblib, requests, json, sklearn.metrics, sklearn.model_selection, sklearn.tree, time, math, textblob

import warnings
warnings.filterwarnings("ignore")

response = requests.get("https://appliance_reviews.json")

if response:
    data = json.loads(response.text)
    
    unique = []
    word = []
    for line in data:
        #print(line)
        
        review = line["Review"]
        blob = textblob.TextBlob(review)
        
        for word in blob.words:
            
            if word.lower() not in unique:
                unique.append(word.lower())
   
    for word in unique:
        a = 0
        b = 0
        c = 0
        d = 0
       
        for line in data:
           
            review = line["Review"]
            safety = line["Safety hazard"]
           
            if word in review.lower() and safety == 1:
                a += 1
            if word in review.lower() and safety == 0:
                b += 1
            if word in review.lower() and safety == 1:
                c += 1
            if word in review.lower() and safety == 0:
                d += 1
        
        try:
            rel_score = (math.sqrt(a + b + c + d) * ((a + d) - (c * b))) / math.sqrt((a + b) * (c + d))
        except:

            rel_score = 0
            
        if rel_score >= 4000:
            score.append(word)
    print(word)

Please provide the expected [MRE - Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example). Show where the intermediate results deviate from the ones you expect. We should be able to paste a single block of your code into a file, run it, and reproduce your problem. This also lets us test any suggestions in your context. — Prune, Apr 28 '21 at 22:36
Your posted code is not minimal: you've imported 10 packages to support a relevance classification. You haven't traced the intermediate results, and you've made it hard for us to do so by using generic variable names and not explaining your algorithm. — Prune, Apr 28 '21 at 22:36
How many words would you expect to score in total? An idea of the scale of the action might lead to different choices. — Joffan, Apr 28 '21 at 23:55

score 0 · Answer 1 · answered Apr 28 '21 at 23:22

By the time you print word on the last line of the code given, it is simply the last entry in unique, regardless of its score: you have just exited a for loop in which word was the loop variable.

Are you sure that you didn't want to print score instead? It seems intended to accumulate the high-scoring words from unique, although the posted code never initializes it.
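A minimal sketch of the difference, assuming a tiny hand-made word list and hard-coded scores (the values in rel_scores and the threshold of 4 are invented purely for illustration and are not from the question's data):

```python
unique = ["heater", "fire", "quiet"]

# Hypothetical scores, hard-coded only to illustrate the control flow.
rel_scores = {"heater": 1.2, "fire": 9.5, "quiet": 0.3}

score = []  # create the list before the loop so append() has a target
for word in unique:
    if rel_scores[word] >= 4:
        score.append(word)

# After the loop, word still holds the last item that was iterated over.
print(word)   # last entry of unique, whatever it scored
print(score)  # only the words that passed the threshold
```

Printing word after the loop always shows the final element of unique, which would explain seeing the same single word on every run.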

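Your scoring also looks broken. As coded, the conditions for a and c are identical, as are the conditions for b and d, so a always equals c and b always equals d. In addition, word in review.lower() is a substring test, so a review containing "carpet" would count toward the scores of "car", "pet" and even "carp".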
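One way the counting could be restructured is sketched below. It rests on several assumptions that the question does not confirm: that a, b, c and d are meant to be the four cells of a word-versus-label contingency table (word present or absent, hazard or not), that the intended formula is the chi-square-style sqrt(N) * (a*d - c*b) / sqrt((a+b) * (c+d)) and not the (a + d) written in the question, and that a simple regex tokenizer is an acceptable stand-in for TextBlob. The function names and the sample reviews are hypothetical.

```python
import math
import re


def contingency_counts(word, reviews):
    """Count reviews by (word present?, hazard?) using whole-word matching.

    reviews is a list of (text, is_hazard) pairs; this shape is an assumption.
    """
    a = b = c = d = 0
    for text, is_hazard in reviews:
        # Tokenize into whole words so "carpet" does not match "car".
        tokens = set(re.findall(r"[a-z']+", text.lower()))
        present = word in tokens
        if present and is_hazard:
            a += 1          # word present, hazard
        elif present and not is_hazard:
            b += 1          # word present, no hazard
        elif not present and is_hazard:
            c += 1          # word absent, hazard
        else:
            d += 1          # word absent, no hazard
    return a, b, c, d


def relevance(a, b, c, d):
    """Assumed formula: sqrt(N) * (a*d - c*b) / sqrt((a+b) * (c+d))."""
    denominator = math.sqrt((a + b) * (c + d))
    if denominator == 0:
        return 0.0  # word appears in every review or in none
    return math.sqrt(a + b + c + d) * (a * d - c * b) / denominator


# Invented sample data for illustration only.
reviews = [
    ("The carpet smelled odd", 0),
    ("My car heater caught fire", 1),
    ("Great pet dryer", 0),
    ("Fire started in the dryer", 1),
]

print(contingency_counts("car", reviews))  # (1, 0, 1, 2)
print(relevance(*contingency_counts("fire", reviews)))
```

With whole-word tokens, the "carpet" review no longer contributes to the counts for "car", and the four counters are mutually exclusive, so a and c (or b and d) can differ.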

As Prune mentions in comments, your bland choice of variable names makes understanding the purpose of the code difficult.
