
Say I have a piece of text like:

Apple was founded in 1976 by Steve Jobs, Steve Wozniak and Ronald Wayne to develop and sell Wozniak's Apple I personal computer. It was incorporated by Jobs and Wozniak as Apple Computer, Inc. in 1977, and sales of its computers, among them the Apple II, grew quickly. Apple Computer, Inc. was incorporated on January 3, 1977, without Wayne, who had left and sold his share of the company back to Jobs and Wozniak for $800 only twelve days after having co-founded Apple.

Here "Jobs", "Wozniak", "Wayne" refer to "Steve Jobs", "Steve Wozniak" and "Ronald Wayne" respectively.

How do I resolve the text to something like:

Apple was founded in 1976 by Steve Jobs, Steve Wozniak and Ronald Wayne to develop and sell Steve Wozniak's Apple I personal computer. It was incorporated by Steve Jobs and Steve Wozniak as Apple Computer, Inc. in 1977, and sales of its computers, among them the Apple II, grew quickly. Apple Computer, Inc. was incorporated on January 3, 1977, without Ronald Wayne, who had left and sold his share of the company back to Steve Jobs and Steve Wozniak for $800 only twelve days after having co-founded Apple.

Replacing "Jobs" with "Steve Jobs" is obviously what needs to be done, but how do I detect that a "Jobs" in the text corresponds to "Steve Jobs"?

(Steve Jobs and Jobs are detected as separate named entities)

  • quick and dirty, maybe a chained replace stmnt, like `.replace('Steve Jobs', 'Jobs').replace('Jobs', 'Steve Jobs')` – rv.kvetch Jan 26 '22 at 17:20
  • @rv.kvetch, yes that would work but how would I detect that there is a 'Jobs' in the text that corresponds to 'Steve Jobs' – Shrawan Sai Jan 26 '22 at 17:27
  • do you mean, how do you find the proper nouns in the text? for example Steve Jobs – rv.kvetch Jan 26 '22 at 17:33
  • @rv.kvetch, I have detected all named entities (including "Steve Jobs" and "Jobs") from the text already. I'm stuck on how to conclude that "Jobs" is part of "Steve Jobs" and is not a new named entity by itself. – Shrawan Sai Jan 26 '22 at 17:38
  • How will you deal with `"Steve Jobs said, 'Jobs are the most important goal...'"`? – JonSG Jan 26 '22 at 17:44
  • @JonSG I'm using a trained NER model that will detect the difference between proper nouns and non-proper nouns. The issue arises, for example, when I am talking about a family. If the text is: `Thomas Smith is Elizabeth Smith's husband. Elizabeth Smith is from London. She is new to Delhi and is looking for help. Smith is sure she'll find someone to teach her the local language` then 'Smith' in the last sentence is Elizabeth Smith and not Thomas. – Shrawan Sai Jan 26 '22 at 17:55
  • Please provide enough code so others can better understand or reproduce the problem. – Community Feb 04 '22 at 14:16
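
One heuristic suggested by the comments above is to treat a short mention as an alias of the most recently seen longer name that contains all of its tokens. The following is only a minimal sketch of that idea, not a full coreference resolver. It assumes the PERSON mentions are already available as sorted `(start, end)` character offsets, as an NER model might return them. The names `expand_short_mentions` and `demo_spans` are hypothetical, and `demo_spans` is just a regex stand-in for the NER step so the example runs on its own.

```python
import re


def expand_short_mentions(text, spans):
    """Replace partial person mentions with the most recent fuller name.

    `spans` is a list of (start, end) character offsets of person mentions,
    sorted by position (assumed to come from an NER model).
    """
    seen = []    # multi-token names in order of appearance (latest last)
    pieces = []  # output fragments
    cursor = 0
    for start, end in spans:
        mention = text[start:end]
        tokens = mention.split()
        replacement = mention
        # Search the most recent names first (recency heuristic).
        for full in reversed(seen):
            full_tokens = full.split()
            if len(tokens) < len(full_tokens) and all(t in full_tokens for t in tokens):
                replacement = full
                break
        # Remember the resolved name so it counts as "most recent" from now on.
        if len(replacement.split()) > 1:
            seen.append(replacement)
        pieces.append(text[cursor:start])
        pieces.append(replacement)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


def demo_spans(text, names):
    """Hypothetical stand-in for NER: find the given names with a regex."""
    ordered = sorted(names, key=len, reverse=True)  # prefer longer matches
    pattern = r"\b(?:%s)\b" % "|".join(re.escape(n) for n in ordered)
    return [m.span() for m in re.finditer(pattern, text)]


text = ("Thomas Smith is Elizabeth Smith's husband. Elizabeth Smith is from "
        "London. Smith is sure she'll find a teacher.")
spans = demo_spans(text, ["Thomas Smith", "Elizabeth Smith", "Smith"])
print(expand_short_mentions(text, spans))
```

With this heuristic the final "Smith" resolves to "Elizabeth Smith" because that is the most recent full name containing the token. It will still guess wrong whenever the intended referent is not the most recent one, so a trained coreference model is likely a better fit for ambiguous text.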
