My dataset structure:

Text: 'Good service, nice view, location'
Tag: '{SERVICE#GENERAL, positive}, {HOTEL#GENERAL, positive}, {LOCATION#GENERAL, positive}'

My problem is that I don't know how to structure my data frame for this. Any recommendations would be much appreciated. Thank you.

Lắc Lê
  • I am assuming that you have 3 attributes to classify: SERVICE, HOTEL, LOCATION. Is that correct, or are there more options? – Antoan Milkov Jun 22 '19 at 08:49
  • There are also Room, Food&Drink, Facilities, and so on. I don't know exactly how many there are, because there is no information on how they structured their database; I just listed some others I found in the database they supplied. – Lắc Lê Jun 22 '19 at 08:54

1 Answer

Separate the adjectives (good, bad, etc.) from the hotel attributes (service, view, location). You can start by creating a custom dictionary, then automatically detect new words and add them as categories. Named entity recognition could help with this; here are some articles:

https://towardsdatascience.com/named-entity-recognition-with-nltk-and-spacy-8c4a7d88e7da

https://towardsdatascience.com/a-review-of-named-entity-recognition-ner-using-automatic-summarization-of-resumes-5248a75de175

Personally, I have used the Stanford one, and it is pretty cool.
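
To make the dictionary idea concrete, here is a minimal sketch in plain Python, not a definitive implementation. Everything in it is an assumption for illustration: the seed dictionaries `ASPECT_WORDS` and `OPINION_WORDS`, the mapping of "view" to `HOTEL#GENERAL` (guessed from the example in the question), and the heuristic that a clause with no opinion word inherits the previous clause's polarity. `parse_tags` also shows one possible way to flatten the `Tag` string into one (category, polarity) pair per row, which could serve as the data frame layout.

```python
import re

# Hypothetical seed dictionaries; extend them as new words show up in the data.
ASPECT_WORDS = {
    "service": "SERVICE#GENERAL",
    "view": "HOTEL#GENERAL",       # assumption based on the example in the question
    "location": "LOCATION#GENERAL",
}
OPINION_WORDS = {"good": "positive", "nice": "positive", "bad": "negative"}

# Matches one '{CATEGORY, polarity}' group inside the Tag string.
TAG_PATTERN = re.compile(r"\{\s*([^,{}]+?)\s*,\s*([^,{}]+?)\s*\}")


def parse_tags(tag_string):
    """Turn '{A#B, positive}, {C#D, negative}' into [('A#B', 'positive'), ...]."""
    return TAG_PATTERN.findall(tag_string)


def tag_with_dictionary(text):
    """Label each comma-separated clause using the seed dictionaries."""
    results = []
    last_polarity = None
    for clause in text.lower().split(","):
        words = re.findall(r"[a-z]+", clause)
        aspects = [ASPECT_WORDS[w] for w in words if w in ASPECT_WORDS]
        polarities = [OPINION_WORDS[w] for w in words if w in OPINION_WORDS]
        # Heuristic (assumption): reuse the previous polarity when the clause
        # has no opinion word of its own, otherwise fall back to 'neutral'.
        polarity = polarities[0] if polarities else (last_polarity or "neutral")
        last_polarity = polarity
        results.extend((aspect, polarity) for aspect in aspects)
    return results


text = "Good service, nice view, location"
tags = "{SERVICE#GENERAL, positive}, {HOTEL#GENERAL, positive}, {LOCATION#GENERAL, positive}"

# One row per (text, category, polarity): a long format that loads directly
# into a data frame.
rows = [
    {"text": text, "category": category, "polarity": polarity}
    for category, polarity in parse_tags(tags)
]
```

If pandas is available, `pd.DataFrame(rows)` turns `rows` into a data frame with one line per labelled aspect, and `tag_with_dictionary(text)` can be compared against `parse_tags(tags)` to see which reviews the dictionary does not yet cover.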