I am trying to use the Google Cloud Natural Language API to classify tweets so that I can filter out those that are not relevant to my audience (weather related). I understand that classifying such a short piece of text must be tricky for an AI solution, but I would expect it to at least offer a guess for text like this:
Wind chills of zero to -5 degrees are expected in Northwestern Arkansas into North-Central Arkansas extending into portions of northern Oklahoma during the 6-9am window . #arwx #okwx
I have tested several tweets, but only a few get a categorization; the rest get no result (or "No categories found. Try a longer text input." if I try them through the GUI).
Is it pointless to hope for this to work? Or is it possible to lower the threshold for the categorization? An "educated guess" from the NLP solution would be better than no filter at all. Is there an alternative solution (other than training my own NLP model)?
Edit: In order to clarify:
Ultimately, I am using the Google Cloud Platform Natural Language API to classify tweets. To test it I am using the GUI (linked above). Only a few of the tweets I test (in the GUI) get a categorization from GCP NLP; for the rest, the category is empty.
What I want is for GCP NLP to provide a best-guess category for a tweet's text rather than an empty result. I assume the NLP model discards any result with a confidence below X%. It would be interesting to know whether that threshold can be configured.
I assume tweets have been categorized before. Is there any other way to solve this?
Edit 2: classifyTweet code:
// projectId and keyFilename are assumed to be defined in the enclosing scope.
const language = require('@google-cloud/language');

async function classifyTweet(tweetText) {
  const client = new language.LanguageServiceClient({projectId, keyFilename});
  //const tweetText = "Some light snow dusted the ground this morning, adding to the intense snow fall of yesterday. Here at my Warwick station the numbers are in, New Snow 19.5cm and total depth 26.6cm. A very good snow event. Photos to be posted. #ONStorm #CANWarnON4464 #CoCoRaHSON525"
  const document = {
    content: tweetText,
    type: 'PLAIN_TEXT',
  };
  // classifyText resolves to an array; its first element holds the categories.
  const [classification] = await client.classifyText({document});
  console.log('Categories:');
  classification.categories.forEach(category => {
    console.log(`Name: ${category.name}, Confidence: ${category.confidence}`);
  });
  return classification.categories;
}
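For context, this is roughly how I intend to use the result of classifyTweet as a filter. It is only a sketch: isWeatherRelated, minConfidence and prefix are names I made up for illustration, and '/News/Weather' is my assumption about which category a weather tweet would land in. It also shows the problem: an empty result means the tweet is dropped, because there is no guess to work with.

```javascript
// Hypothetical helper (not part of the Natural Language API): decides whether
// a tweet should be kept, given the categories returned by classifyTweet().
// The '/News/Weather' prefix and the 0.5 default are assumptions for illustration.
function isWeatherRelated(categories, minConfidence = 0.5, prefix = '/News/Weather') {
  // An empty or missing result means no guess was offered, so the tweet is dropped.
  if (!Array.isArray(categories) || categories.length === 0) {
    return false;
  }
  // Keep the tweet if any category matches the prefix with enough confidence.
  return categories.some(
    category => category.name.startsWith(prefix) && category.confidence >= minConfidence
  );
}

// Example with hand-written objects shaped like the entries classifyTweet() returns.
const sample = [{name: '/News/Weather', confidence: 0.87}];
console.log(isWeatherRelated(sample)); // true
console.log(isWeatherRelated([]));     // false
```

In the real flow the first argument would be the array returned by `await classifyTweet(tweetText)`.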