My dataset contains a Likert item asking how energetic the participants felt at that moment, rated from 0 to 6, where 0 = not energetic at all and 6 = very energetic. I have to investigate whether these scores actually differ from one another based on the data. If 0 and 1 do not differ from each other, I have to combine these two levels into one, and so on. So in the end I might have 2 or 4 levels instead of the original 7 (0 through 6).

I have tried applying classification algorithms to the data to see whether a model trained to classify '0' would show a high error rate when it encounters '1'. Unfortunately, this did not work as I wanted. Is this approach actually feasible?

My question is whether someone knows how I can best investigate if there is indeed a difference between those 7 levels, or whether I can combine some of them based on the differences (or lack of them) in the data for those levels.

Marly
  • So you have a response that is Likert, and other variables that can predict this response? The machine learning approach is a bit indirect, because you are trying to show that you cannot build a model that distinguishes "0" from "1", which would then suggest that "0" is similar to "1" given your data. A failure could also mean that you did not transform the variables well or that you used the wrong model. You can try unsupervised learning methods, say k-means, to see how observations labeled "0" differ from those labeled "1". – StupidWolf Nov 05 '19 at 14:38
  • @StupidWolf Yes, I have a response variable that is Likert and predictor variables that are numeric. Do you mind explaining how I can use k-means to see if there are differences (or not) between score '0' and score '1'? – Marly Nov 06 '19 at 10:32
  • Sorry, maybe k-means is a bad example. If your predictors are not too quirky, do a PCA on all the entries that are scored 0 or 1. Then plot the first two PCs and color the points according to 0 or 1. If you don't see any separation between 0 and 1 in this plot, most likely there isn't much difference between them. – StupidWolf Nov 06 '19 at 23:21
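
The PCA check suggested in the last comment could be sketched roughly as follows. This is only an illustration under stated assumptions: the predictors are taken to be a numeric array `X` and the Likert scores an integer array `y` (both names are hypothetical), the predictors are standardized before the PCA (a choice not stated in the thread), and the function names are made up for this example. The PCA is done with a plain NumPy SVD rather than any particular library routine.

```python
import numpy as np


def pca_two_components(X):
    """Project the rows of X onto their first two principal components."""
    X = np.asarray(X, dtype=float)
    # Standardize each predictor so that no single scale dominates
    # (an assumption; skip this if the predictors share a common scale).
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    Z = (X - X.mean(axis=0)) / std
    # The rows of Vt are the principal axes, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:2].T


def project_levels(X, y, levels=(0, 1)):
    """Keep only rows whose score is in `levels` and return (PC scores, labels)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mask = np.isin(y, levels)
    return pca_two_components(X[mask]), y[mask]


def plot_projection(scores, labels):
    """Scatter PC1 against PC2, one color per Likert score."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    for level in np.unique(labels):
        pts = scores[labels == level]
        plt.scatter(pts[:, 0], pts[:, 1], label=f"score {level}", alpha=0.6)
    plt.xlabel("PC1")
    plt.ylabel("PC2")
    plt.legend()
    plt.show()
```

Calling `scores, labels = project_levels(X, y, levels=(0, 1))` followed by `plot_projection(scores, labels)` would give the colored scatter plot described above; the same call with `levels=(1, 2)`, `(2, 3)`, and so on would cover the other adjacent pairs. Overlapping clouds in the first two PCs only hint that two levels are hard to tell apart with these predictors; they do not prove the levels are equivalent.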
