Please suggest a downloadable English corpus that contains informal, playful words such as 'gonna', 'LOL' and 'wanna'
Viewed 297 times
0
- Twitter is full of informal language. You can stream as much as you want in real time, but unfortunately you are not allowed to distribute the data you collect. – mbatchkarov Aug 16 '15 at 09:47
- No, what I meant is the availability of a dictionary (corpus/lexicon) of informal, playful words that can be used for research – Aug 16 '15 at 10:02
2 Answers
1
I don't know of such a lexicon, but as an alternative you can try to build one:
- Get the vocabulary V1 of a Twitter corpus or another web or chat corpus.
- Get the vocabulary V2 of a literary corpus.
The lexicon you want might be V1 \ V2, i.e. all the words of V1 that are not in V2.
Using Python, NLTK provides corpora (see nltk.corpus.webtext). Moreover, as @mbatchkarov said in the comments: Twitter is full of informal language.
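The V1 \ V2 idea can be sketched roughly as follows. This is only a minimal illustration, not a tested recipe: the helper names (vocabulary, informal_lexicon, nltk_example) are made up for this example, the choice of nltk.corpus.gutenberg as the "literary" corpus is an assumption, and the min_count threshold is an extra filter I added to drop one-off typos.

```python
from collections import Counter
from typing import Iterable, Set


def vocabulary(tokens: Iterable[str], min_count: int = 1) -> Set[str]:
    """Lowercased alphabetic tokens that occur at least min_count times."""
    counts = Counter(t.lower() for t in tokens if t.isalpha())
    return {word for word, n in counts.items() if n >= min_count}


def informal_lexicon(informal_tokens: Iterable[str],
                     formal_tokens: Iterable[str],
                     min_count: int = 1) -> Set[str]:
    """V1 minus V2: words in the informal corpus that never occur in the formal one."""
    v1 = vocabulary(informal_tokens, min_count)
    v2 = vocabulary(formal_tokens)  # a single occurrence in V2 excludes the word
    return v1 - v2


def nltk_example(min_count: int = 3) -> Set[str]:
    """Hypothetical usage with NLTK; needs NLTK installed and network access."""
    import nltk
    from nltk.corpus import gutenberg, webtext

    # Fetch the two corpora if they are not already available locally.
    nltk.download("webtext", quiet=True)
    nltk.download("gutenberg", quiet=True)

    # Assumption: webtext stands in for V1, gutenberg for V2.
    return informal_lexicon(webtext.words(), gutenberg.words(), min_count)
```

Calling sorted(nltk_example()) gives a candidate list. Expect noise in it (names, misspellings, domain-specific terms), so it would still need manual filtering before being used as a lexicon.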

clemtoy
0
Use 'NetLingo'. They have rich content :)