I get a NameError although I defined my variable

Question

Hello, fellow programmers. I'm working on my first NLP project, which computes and displays the TF-IDF of 5 documents. Here's part of the code:

def IDF(corpus , unique_words):
    idf_dict = {}
    N = len(corpus)
    for i in unique_words:
        count = 0
        for sen in corpus:
            if i in sen.split():
                count = count+1
            idf_dict[i] = (math.log((1 + N) / (count+1))) + 1
    return idf_dict

def fit(whole_data):
    unique_words = set()
    if isinstance(whole_data, (list,)):
        for x in whole_data:
            for y in x.split():
                if len(y)<2:
                    continue
                unique_words.add(y)
            unique_words = sorted(list(unique_words))
            vocab = {j:i for i,j in enumerate(unique_words)}
    Idf_values_of_all_unique_words = IDF(whole_data,unique_words)
    return vocab, Idf_values_of_all_unique_words
vocabulary, idf_of_vocabulary = fit(corpus)

The name IDF on line 22 raises a NameError. Is it about where the function is defined?

It would be nice if you showed the line numbers in the code. There is no IDF on line 11 now. — Park, Jul 18 '22 at 06:02
do you think that `vocab` and `unique_words` are **always** defined inside `fit`? or that `corpus` exists when you call `fit` ? — DeepSpace, Jul 18 '22 at 06:10
Actually, it was all about putting the function in the right place. I moved the IDF function inside the fit function and it works fine. Thanks, everyone. — Parsa, Jul 18 '22 at 13:19
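The traceback was not posted, so the exact cause is an assumption here. A NameError on a function name usually means the `def` statement had not been executed yet when the call ran, for example because notebook cells were run out of order. A minimal sketch of that behaviour, using the hypothetical names `fit_demo` and `helper` (not from the question):

```python
def fit_demo(words):
    # helper is looked up when this line runs, not when fit_demo is defined.
    return helper(words)


try:
    fit_demo(["a"])  # helper has not been defined yet at this point
except NameError as exc:
    first_error = str(exc)


def helper(words):
    return len(words)


# The same call now succeeds, because the def above has been executed.
result = fit_demo(["a"])
```

If this is what happened, nesting IDF inside fit is one way to guarantee the definition runs first, but keeping it at module level above the call should work as well.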

Parsa · Accepted Answer · 2022-07-18T13:13:24.893

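import math

def fit(whole_data):
    # Nested helper: computes a smoothed IDF value for every word in the vocabulary.
    def IDF(whole_data, unique_words):
        idf_dict = {}
        N = len(whole_data)
        for i in unique_words:
            count = 0  # number of documents that contain word i
            for sen in whole_data:
                if i in sen.split():
                    count = count+1
            # Assign once per word, after all documents have been counted.
            idf_dict[i] = (math.log((1 + N) / (count+1))) + 1
        return idf_dict

    unique_words = set()

    if isinstance(whole_data, (list,)):
        for x in whole_data:
            for y in x.split():
                if len(y) < 2:
                    continue  # skip single-character tokens
                unique_words.add(y)
        unique_words = sorted(list(unique_words))
        vocab = {j: i for i, j in enumerate(unique_words)}

        Idf_values_of_all_unique_words = IDF(whole_data, unique_words)
        return vocab, Idf_values_of_all_unique_words
    # Without this guard, non-list input would reach a return with vocab undefined.
    raise TypeError("whole_data must be a list of strings")

# corpus must already be defined as a list of document strings.
vocabulary, idf_of_vocabulary = fit(corpus)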

Defining IDF inside fit, as above, resolved the NameError for me.
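For comparison, here is a sketch that keeps IDF at module level, which should also work as long as both `def` statements and `import math` run before `fit` is called. The sample `corpus` is made up for illustration, and the `TypeError` guard is an added assumption that addresses DeepSpace's point about `vocab` not always being defined.

```python
import math


def IDF(corpus, unique_words):
    """Smoothed IDF per word: ln((1 + N) / (1 + document frequency)) + 1."""
    n_docs = len(corpus)
    idf_dict = {}
    for word in unique_words:
        # Count the documents that contain the word at least once.
        doc_freq = sum(1 for doc in corpus if word in doc.split())
        idf_dict[word] = math.log((1 + n_docs) / (1 + doc_freq)) + 1
    return idf_dict


def fit(whole_data):
    # Assumption: reject non-list input instead of returning undefined names.
    if not isinstance(whole_data, list):
        raise TypeError("whole_data must be a list of strings")
    unique_words = set()
    for doc in whole_data:
        for token in doc.split():
            if len(token) >= 2:  # skip single-character tokens
                unique_words.add(token)
    unique_words = sorted(unique_words)
    vocab = {word: index for index, word in enumerate(unique_words)}
    return vocab, IDF(whole_data, unique_words)


# Hypothetical two-document corpus, only for demonstration.
corpus = [
    "this is the first document",
    "this document is the second document",
]
vocabulary, idf_of_vocabulary = fit(corpus)
```

Both layouts compute the same values; the nested version only changes where the name IDF is visible.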

