
I have a collection of tweets that each express a Yes or No opinion about a referendum. I also have two groups of politicians, one supporting Yes and one supporting No. Similarly, I have two sets of words that express Yes and No opinions. These words were extracted using graph analysis.

Now I have to decide whether a particular tweet expresses a Yes opinion or a No opinion. How can I come up with a query?

I have thought of the following to return Yes documents:

(Yes_Politician1 OR Yes_Politician2 OR Yes_Politician3 OR ...) AND (Yes_Word1 OR Yes_Word2 OR Yes_Word3 OR ...)

Do you think the above query would work? I should also tell that some words might belong to both Yes and No word sets.
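For concreteness, here is a minimal Python sketch of that boolean query, run against a tweet that contains a word from both sets. The function name `matches_query`, the placeholder politician and word names, and the lowercase whitespace tokenization are all assumptions made for illustration; they are not part of the actual dataset or of any search engine's syntax.

```python
def matches_query(tweet, politicians, words):
    """True if the tweet mentions at least one politician AND at least one word."""
    # Assumption: naive lowercase whitespace tokenization; real tweets would
    # need handling for punctuation, hashtags and @mentions.
    tokens = set(tweet.lower().split())
    return bool(tokens & politicians) and bool(tokens & words)


# Hypothetical placeholder data, not taken from the real dataset.
yes_politicians = {"yes_politician1", "yes_politician2"}
no_politicians = {"no_politician1", "no_politician2"}
yes_words = {"yes_word1", "shared_word"}  # "shared_word" appears in both sets
no_words = {"no_word1", "shared_word"}

clear_tweet = "Yes_Politician1 yes_word1"
mixed_tweet = "No_Politician1 attacks Yes_Politician2 over shared_word"

for tweet in (clear_tweet, mixed_tweet):
    is_yes = matches_query(tweet, yes_politicians, yes_words)
    is_no = matches_query(tweet, no_politicians, no_words)
    print(f"{tweet!r}: yes={is_yes}, no={is_no}")
```

With this placeholder data, the first tweet satisfies only the Yes query, while the second satisfies both the Yes and the No query, which is the ambiguity the overlapping word sets introduce.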

Siddhant Tandon

1 Answer


" I should also tell that some words might belong to both Yes and No word sets."

Well, then no, the above would not work if, let's say, for "Yes_Politician3" his "No_Word3" overlaps with "Yes_Word1" of the others. Wouldn't your if-statement then put his statement in the wrong group? Or do you mean some other kind of words "belonging to both sets"?

JohanLarsson
  • Yes, you are right. Maybe I can add a heuristic, like counting how many times a Yes_Politician or a Yes_word is mentioned in the tweet? – Siddhant Tandon Aug 10 '18 at 08:37
  • @SiddhantTandon Well, as long as you are sorting on a GROUP of politicians and a GROUP of words, it can't really work if the words can be in both sets. Either assign each word definitively to one set, or sort on ONE politician and HIS words and do that for all of them. I'm not familiar with the specifics of what you are doing, so I might be wrong, but at least to me, that is how I think about similar scenarios. – JohanLarsson Aug 10 '18 at 08:40
  • Unfortunately I can't do that. These words were extracted using graph analysis, so I don't have a list of words for every politician; I just have one big collection of words extracted from the entire tweet dataset. And since the graph analysis can return many words common to both the Yes and No groups, I can't do what you suggest. – Siddhant Tandon Aug 10 '18 at 08:46
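The counting heuristic raised in the comments could be sketched roughly as follows. This is only one possible reading of it: the function name `classify_tweet`, the placeholder terms, the whitespace tokenization, and the decision to ignore words that occur in both sets are all assumptions, not something established in the thread.

```python
def classify_tweet(tweet, yes_politicians, no_politicians, yes_words, no_words):
    """Label a tweet "Yes", "No" or "Undecided" by counting matching terms."""
    # Assumption: naive lowercase whitespace tokenization.
    tokens = tweet.lower().split()

    # Assumption: a word found in both sets carries no signal, so skip it.
    ambiguous = yes_words & no_words
    yes_terms = yes_politicians | (yes_words - ambiguous)
    no_terms = no_politicians | (no_words - ambiguous)

    # Count every occurrence, so repeated mentions weigh more.
    yes_count = sum(1 for token in tokens if token in yes_terms)
    no_count = sum(1 for token in tokens if token in no_terms)

    if yes_count > no_count:
        return "Yes"
    if no_count > yes_count:
        return "No"
    return "Undecided"


# Hypothetical placeholder data, not taken from the real dataset.
yes_politicians = {"yes_politician1"}
no_politicians = {"no_politician1"}
yes_words = {"yes_word1", "shared_word"}
no_words = {"no_word1", "shared_word"}

label = classify_tweet(
    "Yes_Politician1 likes yes_word1 and shared_word",
    yes_politicians, no_politicians, yes_words, no_words,
)
print(label)
```

A tweet that contains only shared words, or equal counts on both sides, comes back as "Undecided" instead of being forced into a group, which keeps the ambiguous cases visible.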