I have chat data with about 500k rows. I want to replace multiple variants of an entity [e.g. NEW YORK, New York, new york, Newyork] with a single canonical form, "New York", using Python.

I tried doing this with regex, but processing takes too long, and I have many such words. Is there a faster alternative in Python?

Are there any good resources for learning more about the spaCy and Rasa APIs?

James Z
Kedar17
  • You should post the regex code first, so that the regex-masters (not me) can verify that you're not doing anything less efficient there. If you're not, you probably want to pre-process your text with Cython. – tsorn Dec 01 '18 at 13:04

1 Answer

Can you provide a simple example of what you need to do, for instance using a training object? Do you need to change the entity name or the entity value?
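If the goal is only to map surface variants to one canonical value, here is a minimal sketch of one possible approach, not a tested solution for your data. It assumes the slowdown comes from running one regex per variant, and instead compiles all variants into a single case-insensitive pattern and resolves each match through a dictionary. The names `CANONICAL` and `normalize` are hypothetical, and the mapping holds only the variants from your example.

```python
import re

# Hypothetical mapping: lowercased variant -> canonical form.
# Extend it with your own entities.
CANONICAL = {
    "new york": "New York",
    "newyork": "New York",
}

# One compiled alternation for all variants, longest first so that a
# longer variant is tried before a shorter one sharing the same prefix.
_variants = sorted(CANONICAL, key=len, reverse=True)
_pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(v) for v in _variants) + r")\b",
    flags=re.IGNORECASE,
)


def normalize(text: str) -> str:
    """Replace every known variant in `text` with its canonical form."""
    # Lowercasing the match gives the dictionary key regardless of case.
    return _pattern.sub(lambda m: CANONICAL[m.group(0).lower()], text)
```

With this, each row is scanned once no matter how many variants are in the mapping. If the chat data lives in a pandas DataFrame, something like `df["text"].map(normalize)` should apply it per row, where `text` is an assumed column name. Variants with irregular spacing (for example two spaces between the words) are not covered by this sketch.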

As for docs on Rasa and spaCy, both have good documentation on their own sites and GitHub repositories.

For Rasa, these are good starting points:

  1. https://rasa.com/docs/nlu/
  2. https://medium.com/rasa-blog
  3. https://forum.rasa.com/

For spaCy:

  1. https://spacy.io/usage/
  2. https://explosion.ai/blog/

You can also find more real-world examples in posts on Medium.

Renato Aguiar