I have a list of sentences, and I want to append only the sentences with one "PERSON" named entity to a new list using spaCy. The code I used is as follows:

test_list = []
for item in sentences: #for each sentence in 'sentences' list
  for ent in item.ents: #for each entity in the sentence's entities 
    if len(ent in item.ents) == 1: #if there is only one entity
      if ent.label_ == "PERSON": #and if the entity is a "PERSON"
        test_list.append(item) #put the sentence into 'test_list'

But then I get:

TypeError: object of type 'bool' has no len()

Am I doing this wrong? How exactly would I complete this task?

1 Answer

You get the error because ent in item.ents is a membership test that returns a boolean (True or False), and len() is not defined for booleans.
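The failure can be reproduced without spaCy. The following is a minimal sketch in which a plain list of strings stands in for item.ents (an assumption made for illustration only; the real attribute holds entity spans):

```python
# Stand-in for item.ents: a plain list behaves the same way for this purpose.
ents = ["John", "America"]
ent = ents[0]

# 'ent in ents' is a membership test, so it yields a bool, not a collection.
result = ent in ents
print(type(result).__name__)

try:
    len(result)  # same failure as len(ent in item.ents)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Taking the length of the sequence itself works.
print(len(ents))
```

The length check therefore has to be applied to the collection of entities, not to the result of the in test.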

What you want is

test_list = []
for item in sentences: #for each sentence in 'sentences' list
    if len(item.ents) == 1 and item.ents[0].label_ == "PERSON": #if there is only one entity and if the entity is a "PERSON"
        test_list.append(item) #put the sentence into 'test_list'

The len(item.ents) == 1 check confirms that only one entity was detected in the sentence, and item.ents[0].label_ == "PERSON" makes sure that this entity's label is PERSON.

Note the and operator: both conditions must be met.

Wiktor Stribiżew
  • Thank you so much for the response. This better clarified what I needed to do with my code. But may I ask, is there a way to include sentences with multiple entities, but only one "PERSON" entity? i.e. "John had gone to America and Europe many times." – Dawud Sayeed Jan 30 '22 at 14:20
  • @DawudSayeed I understand you only care to check the amount of PERSON entities, so you can use `person_ents = [e for e in item.ents if e.label_ == "PERSON"]` and then `if len(person_ents) == 1: test_list.append(item)` as the for loop body. – Wiktor Stribiżew Jan 30 '22 at 14:29
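
The suggestion in the last comment can be put together as the sketch below. The helper names has_exactly_one_person and make_stub are hypothetical, and the entity labels are assigned by hand so the snippet runs without a spaCy model; they are not model output. With real data, sentences would hold spaCy Doc or Span objects, which expose the same .ents and .label_ attributes.

```python
from types import SimpleNamespace


def has_exactly_one_person(item):
    """Return True if item.ents holds exactly one entity labeled PERSON."""
    person_ents = [e for e in item.ents if e.label_ == "PERSON"]
    return len(person_ents) == 1


def make_stub(text, labels):
    # Hypothetical stand-in that mimics only the .ents / .label_ attributes.
    ents = tuple(SimpleNamespace(label_=label) for label in labels)
    return SimpleNamespace(text=text, ents=ents)


# Labels below are hand-assigned for illustration, not predicted by a model.
sentences = [
    make_stub("John had gone to America and Europe many times.",
              ["PERSON", "GPE", "LOC"]),
    make_stub("John met Mary in Paris.", ["PERSON", "PERSON", "GPE"]),
    make_stub("It rained in Europe.", ["LOC"]),
]

# Keep sentences with exactly one PERSON, regardless of other entities.
test_list = [item for item in sentences if has_exactly_one_person(item)]
print([item.text for item in test_list])
```

Only the first stub is kept: it has other entities but a single PERSON, while the second has two PERSON entities and the third has none.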