
I want to use a Likert-type data set for ML. I recoded the responses from "strongly disagree", ... to a 1 to 5 scale. I want to work with them (for example, checking the correlation matrix and applying feature selection methods). What should the data type be in R for these purposes? I want R to treat the data as labels, not as numbers. What should I do to achieve that? I tried to change it to factor, but the data set has a lot of variables (columns). How should I read the .csv so that the numbers are treated as labels/factors rather than as numbers?

Amirreza
  • `?factor` should give you what you need. Bear in mind that there are other ways (modified ridits, or arbitrary scores for example) of encoding factors other than a simple 1:n scale. The encoding you choose will influence your correlations and other summaries. – Limey May 04 '22 at 07:41
  • I tried to convert it to a factor with `as.factor(mydata)`, but it eliminates the variables. `mydata` has 54 variables as columns. – Amirreza May 04 '22 at 07:44
  • Likert data is ordinary scaled and should not be treated as numeric. `factor(ordered=TRUE)` should be used. – danlooo May 04 '22 at 07:44
  • @danlooo: did you mean "ordinally"? – Limey May 04 '22 at 07:45
  • Sorry, yes, It's a typo – danlooo May 04 '22 at 07:46
  • I read my data with: `mydata_students <- read.csv("C:/Users/Tehran it/Desktop/Students3.1.csv", header = TRUE , stringsAsFactors = TRUE)`. How should I change it, or how should I use `factor(ordered=true)`? – Amirreza May 04 '22 at 07:47
  • The fact that your data is in a CSV file is probably irrelevant. We can't access your C: drive, so please add the output from `dput(mydata_students)` (or `dput(head(mydata_students))` if your data frame is large) to your question so that we can access your data, test our code and provide you with a reliable solution. – Limey May 04 '22 at 07:52
  • @danlooo I read my data with: `mydata_students <- read.csv("C:/Users/Tehran it/Desktop/Students3.1.csv", header = TRUE , stringsAsFactors = TRUE)`. How should I change it, or how should I use `factor(ordered=true)`? – Amirreza May 04 '22 at 07:54
  • @Amirreza Do e.g. `mydata_students <- read.csv("file.csv") |> dplyr::mutate(col1 = factor(col1, ordered = TRUE))` – danlooo May 04 '22 at 07:56
  • @limey Thank you for your note. This is my first question here. The output of `x<-dput(head(mydata_Professors_0))` is: `structure(list(University = structure(c(7L, 2L, 2L, 2L, 2L, 2L ), .Label = c("1", "2", "3", "4", "5", "6", "7"),, .Label = c("1", "2", "3", "4", "5"), class = "factor"), A4 = structure(c(2L, L, 4L, 3L, 5L, 5L, 5L), .Label = c("1", "2", "3", "4", "5"), class = "factor"), B4 = structure(c(4L, 3L, 4L, 2L, 2L, 2L), .Label = c("2", "3", "4", "5"), class = "factor"), B5 = structure(c(4L, 4L, 2L, 3L, 4L, 4L), .Label = c("1", ".....` – Amirreza May 04 '22 at 07:58
  • @Limey I should say that I do not know what the "L" next to each number means! – Amirreza May 04 '22 at 08:00
  • @danlooo I want to convert all my data to factors, not just one column (I have 54 columns). My first variable is A1 and my 54th variable is I2. I tried `dplyr::mutate(A1~I2 = factor(col1, ordered = TRUE))`, but this code doesn't work. – Amirreza May 04 '22 at 08:04
  • `mutate(across(everything(), ~ factor(.x, ordered = TRUE)))`. However, this is dangerous, because the ordering itself is not defined in the csv. For a character (text) column, alphabetical (lexical) sorting will be assumed. – danlooo May 04 '22 at 08:08
  • I ran `dput(mydata)` on my data and saw: `... G7 = structure(c(3L, 3L, 3L, 3L, 3L, 5L), G8 = structure(c(3L, 3L, 2L, 4L, 3L, 4L), .Label = c("1", "2", "3", "4", "5"), class = "factor")`. Are they OK for my purpose? I wanted factors and they are factors. @danlooo @limey – Amirreza May 04 '22 at 08:13
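
The suggestions in the comments above can be pulled together into one rough base R sketch. It is only an illustration under stated assumptions: the inline CSV text and the item names `A1`, `A2`, `B1` are made up to stand in for the real file, and the scale is assumed to be 1 to 5 for every column. Passing `levels = 1:5` explicitly keeps the full scale even when a value never occurs in a column (as with `B4` in the `dput` output above, which has no level "1"). The `L` suffix in `dput` output simply marks an integer literal in R.

```r
# Hypothetical stand-in for the real CSV file: three Likert items scored 1 to 5.
csv_text <- "A1,A2,B1
1,2,5
2,2,4
5,4,1
4,5,2
3,3,3
5,4,1"

# Read the data; the columns arrive as plain integers.
likert <- read.csv(text = csv_text, header = TRUE)

# Convert every column to an ordered factor with the full 1:5 scale as levels
# (assumption: all columns share the same five-point scale).
likert[] <- lapply(likert, factor, levels = 1:5, ordered = TRUE)
str(likert)

# The "L" in dput() output marks integer literals, e.g. 3L is the integer 3.
print(is.integer(3L))

# A rank-based correlation matrix needs numbers again, so take the integer
# level codes of each ordered factor and use Spearman's method.
codes <- data.frame(lapply(likert, as.integer))
spearman <- cor(codes, method = "spearman")
print(round(spearman, 2))
```

With the real data, the same `lapply()` line could be applied to the data frame returned by `read.csv()`; whether Spearman correlation is the right summary depends on the analysis, as noted in the first comment.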
