I have collected tweets from the Twitter API. The tweets are not labelled, and I have no clue how to start. All the tutorials use already-labelled data. How do I label the data? Can labelling only be done manually? Any good tutorial answering my queries would be of great help.
Viewed 51 times
-3
-
That is a really broad question. In your case, you may want to use natural language processing approaches to label your tweets, because, depending on the approach you take, you may need thousands of labelled examples. For example, you may try picking the words with the highest TF-IDF in each of your tweets. – RomainL. Jun 12 '19 at 11:11
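A minimal sketch of the TF-IDF idea from the comment above, using only the standard library. The tokenizer, the function name `top_tfidf_terms`, and the sample tweets are hypothetical, chosen for illustration; the weighting is the plain `tf * log(N / df)` variant, which is one of several in common use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Deliberately crude tokenizer: lowercase, keep runs of letters/apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def top_tfidf_terms(tweets, k=2):
    """Return the k highest-scoring TF-IDF terms for each tweet."""
    docs = [tokenize(t) for t in tweets]
    n_docs = len(docs)
    # Document frequency: in how many tweets does each term appear?
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        results.append(ranked[:k])
    return results


# Hypothetical sample tweets, not real data.
sample_tweets = [
    "the phone is great",
    "the phone is terrible",
    "the weather is nice today",
]
top_terms = top_tfidf_terms(sample_tweets, k=1)
```

Terms that appear in every tweet score zero, so the top terms tend to be the words that set a tweet apart. They are candidate hints for a label, not labels in themselves.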
-
I want help in the form of a good tutorial – user1992989 Jun 12 '19 at 12:22
-
Welcome to SO; please do take some time to read [What topics can I ask about here?](https://stackoverflow.com/help/on-topic), and notice that questions asking us to *recommend or find a book, tool, software library, tutorial or other off-site resource* are off-topic for SO. – desertnaut Jun 12 '19 at 15:43
1 Answer
0
I assume that when you extract the data from the Twitter API, it is in JSON format. Use the key-value pairs as your dataframe headings and values. As for the labels, it depends on what you are going to do with the dataset. If you want to do sentiment analysis, then you need to label the dataset manually (or just download a pre-labelled Twitter dataset from the internet).
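As a rough sketch of the first step, assuming each tweet is a JSON object with `id` and `text` keys (the sample below is made up, and real responses carry many more fields), the keys can become column headings, with an empty `label` column added for manual labelling. The helper names `tweets_to_rows` and `rows_to_csv` are hypothetical.

```python
import csv
import io
import json

# Hypothetical sample payload; not real API output.
raw = (
    '[{"id": 1, "text": "I love this phone", "lang": "en"},'
    ' {"id": 2, "text": "Worst service ever", "lang": "en"}]'
)


def tweets_to_rows(raw_json, fields=("id", "text")):
    """Keep the chosen keys as columns and add a blank 'label' column."""
    tweets = json.loads(raw_json)
    rows = []
    for tweet in tweets:
        row = {field: tweet.get(field) for field in fields}
        row["label"] = ""  # left blank, to be filled in by hand
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialise the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = tweets_to_rows(raw)
csv_text = rows_to_csv(rows)
```

The resulting CSV can be opened in a spreadsheet to fill in the `label` column by hand. If pandas is available, `pandas.DataFrame(rows)` should give the same table as a dataframe.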
For reference, here is a great tutorial on how to mine and deal with the raw data, get insights, and apply clustering algorithms. Hope it helps!

desertnaut

BIVEK KUMAR
If the data is in a .csv file, then label it manually, either by loading it into a dataframe or by just looking through the data. By the way, the label doesn't matter while training the model. If you need code for this, please reply, or share the .csv for a better picture. – BIVEK KUMAR Jun 13 '19 at 08:35