My file looks like this:

review/summary: Beautiful basic pump... review/text: ... but not enough sizes or colors. Fits true to size on my size 8-1/2 feet.Bottom soles are completely slick... needs some kind of texturing or tread to help prevent slipping. review/text: It's amazing.Firstly, this one is not the original Gil Zero, but the Gil Zero TD, which means it do not have any technique.However, it's the most comfortable sneaker I've ever know. Without the expensive technique, it's mid-sole get more soft and much more durable. And with its upper changed to the real leather, it's became more fit able for the foot. This changes makes it even a better sneaker than the expensive original one, just for the great design of a real great sneaker, but not for the useless, for us common people not superstar, technique. And with it on the court, I found it enough cushion and it could give you more speed, excellent one for a guard or small forward.

I want to extract strings such as "quick service", "excellent service", "amazon is great", and "excellent customer service".

My code looks like this:

def ethos(file):
    f = open(file)
    raw = f.read()
    tokens = nltk.sent_tokenize(raw)
    text = nltk.Text(tokens)
    sents = []
    matching_strings = ['thanks amazon' , 'great service' , 'reasonable shipping time' , 'quick service']
    for tokens in text:
        if tokens in matching_strings:
            sents.append(tokens)
    return sents

My output is blank. Kindly let me know how to approach this correctly; I'm very new to language processing.

Noelkd

1 Answer

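I've never used nltk, but I'll make a guess at the solution. Since your tokens are sentences, `tokens in matching_strings` is only true when an entire sentence is exactly equal to one of your phrases, which never happens. You need to look for the matching strings inside each token, not the other way around as you have it now. Your for loop should look like this: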

for tokens in text:
    for match in matching_strings:
        if match in tokens:
            sents.append(tokens)
            break
return sents
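
To see the substring check working end to end, here is a minimal self-contained sketch. It makes a few assumptions that are not in the question: a simple regex split stands in for nltk.sent_tokenize so the snippet runs without nltk, the comparison is lowercased so that "Thanks Amazon" matches 'thanks amazon', and the sample text and the name find_matching_sentences are made up for illustration.

```python
import re


def find_matching_sentences(raw, phrases):
    """Return the sentences in `raw` that contain any of `phrases`."""
    # Assumption: a naive stand-in for nltk.sent_tokenize that splits on
    # whitespace following '.', '!' or '?'.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", raw) if s.strip()]

    # Assumption: compare in lowercase so capitalisation does not matter.
    lowered = [p.lower() for p in phrases]

    # Keep a sentence if at least one phrase occurs inside it.
    return [s for s in sentences if any(p in s.lower() for p in lowered)]


# Made-up sample text, not taken from the review file.
sample = "Thanks Amazon for the quick service. The soles are slick! Great service overall."
matches = find_matching_sentences(
    sample, ["thanks amazon", "great service", "quick service"]
)
print(matches)
```

With real data you would probably swap the regex split back for nltk.sent_tokenize(raw); the filtering step stays the same.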
djhoese