I am trying to perform a latent class analysis in R, but my indicators are a mix of continuous and categorical variables. In addition, I have 52 states (rows) and I am trying to get 52 latent classes or subgroups. I started writing the code in R but I am getting an error. Here is the error: Error in contrasts<-(*tmp*, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels

Here is my R code

#Getting data into R
library(haven)
Component_3_database_11022018 <- read_sav("C:/Users/gaurelien/WRMA/APS-TARC - Documents/Evaluation/Component 3 Research Study/Data Analysis/SPSS/Source/Component 3 database 11022018.sav")
#Create a subset of the full data set reduced to 52 states
LCA<-subset(Component_3_database_11022018, State52==1)
#Loading packages
library(mclust)
library(poLCA) # only categorical indicators
library(scatterplot3d)
library(MASS)
library(orthopolynom)
library(polynom)
library(nlsem)
library(nnet)
library(Rsolnp)
library(depmixS4)

#Latent class modeling with Component 3 data
#Construction of the dependent mixture model
#To avoid time-consuming mistakes in model specification, the analysis involves two steps:
#construction of a model with the mix function and fitting it with the fit function. The family argument
#of mix specifies the type of each observed variable (continuous, nominal,
#or count) through a list of distribution names, e.g. gaussian() or multinomial().
model_definition <- mix(list(AgencyLocation ~ 1, GeographicStructure ~ 1, EligibilityCode ~ 1,
  Maltreatment_Definitions_group ~ 1, ratio_report_per_investigator ~ 1,
  census_TotalPop ~ 1, percent_belowpovertylevel_12months ~ 1),
  family = list(multinomial(), #One distribution family per indicator,
  multinomial(),               #in the same order as the formulas above,
  multinomial(),
  multinomial(),
  multinomial(),
  multinomial(),
  multinomial()),              #should be given in this list.
  data = LCA,
  nstates = 52,
  initdata = LCA)
fit.mod <- fit(model_definition)
lofihelsinki
G.Aurelien
  • Just yesterday I had the same error (running the `lm()` function). I ran a simpler model first by (temporarily) excluding all factor variables from my dataset. Then I gradually added the removed columns back, one by one, until I saw which one caused the error. – knb Dec 17 '18 at 11:39
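
The one-by-one search described in this comment can also be done in a single pass. A minimal sketch, assuming the error comes from a column that has only one distinct value once the data is subset; the data frame `toy` and its values are made up for illustration, and in the question the data frame would be `LCA`:

```r
# Hypothetical stand-in for the subsetted data frame
toy <- data.frame(
  AgencyLocation  = factor(c("a", "b", "a", "b")),
  EligibilityCode = factor(c("x", "x", "x", "x")),  # only one value in use
  census_TotalPop = c(1200, 5300, 800, 9100)
)

# Count the distinct non-missing values in every column
n_distinct <- sapply(toy, function(col) length(unique(col[!is.na(col)])))

# Columns with fewer than 2 distinct values cannot get contrasts
single_level <- names(n_distinct)[n_distinct < 2]
print(single_level)
```

Any column listed in `single_level` is a candidate for the "2 or more levels" error and can be dropped or recoded before fitting.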

2 Answers

Latent class analysis is, strictly speaking, meant for categorical observed variables, not continuous ones. That is probably why your model is not converging, especially if your continuous variables vary a lot. If you can, try dichotomizing your continuous variables, in other words reduce the variation in them, and then run your model again.
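
As a rough sketch of what dichotomizing could look like in base R, here is a median split and a three-way split with `cut()`. The vector `poverty` and its values are invented for illustration and are not from the original data:

```r
# Made-up continuous indicator (e.g. percent below poverty level)
poverty <- c(8.2, 11.5, 14.9, 17.3, 9.8, 21.0)

# Median split: two categories, "low" and "high"
poverty_2 <- factor(poverty > median(poverty),
                    levels = c(FALSE, TRUE),
                    labels = c("low", "high"))

# Tercile split: three ordered categories
poverty_3 <- cut(poverty,
                 breaks = quantile(poverty, probs = c(0, 1/3, 2/3, 1)),
                 include.lowest = TRUE,
                 labels = c("low", "mid", "high"))

table(poverty_2)
table(poverty_3)
```

Whether two or three categories is the better choice depends on the variable; fewer categories lose more information but give each level more observations.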

Also, try running a model with fewer observed variables (fewer than 10). Once it converges, slowly add more variables and keep an eye on which model gives the minimum BIC.
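
To illustrate keeping an eye on the BIC, here is a small sketch that fits latent class models with different numbers of classes and tabulates the BIC. It assumes the `poLCA` package is installed and uses its bundled `carcinoma` example data (seven dichotomous ratings, A to G) rather than the data from the question:

```r
library(poLCA)

data(carcinoma)  # example data shipped with poLCA
f <- cbind(A, B, C, D, E, F, G) ~ 1  # all categorical indicators, no covariates

set.seed(123)  # poLCA uses random starting values
class_counts <- 2:4

# Fit one model per class count and collect the BIC of each
bic <- sapply(class_counts, function(k) {
  poLCA(f, data = carcinoma, nclass = k, nrep = 3, verbose = FALSE)$bic
})

data.frame(classes = class_counts, BIC = bic)
```

The same loop can be rerun after adding each new indicator to the formula, comparing the lowest BIC before and after.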

If you want to keep your continuous variables as they are, you can try Latent Profile Analysis, which allows for both continuous and categorical variables. Also, although some people might argue with this, I have seen structural equation modelling used with ordinal and continuous variables in the same model, which would let you keep your continuous variables.
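
For what it is worth, the `mix()` function from `depmixS4` that the question already uses can take a different family per indicator, so continuous and categorical indicators can sit in one mixture model. A minimal sketch on simulated data (the data frame `sim`, its columns, and the choice of 2 classes are all assumptions for illustration):

```r
library(depmixS4)

set.seed(42)
n <- 200
grp <- rep(1:2, each = n / 2)  # true (hidden) group of each simulated row

# Simulated data: one continuous and one categorical indicator
sim <- data.frame(
  poverty   = rnorm(n, mean = c(10, 20)[grp], sd = 2),
  structure = factor(ifelse(runif(n) < c(0.2, 0.8)[grp], "county", "state"))
)

# One family per indicator: gaussian for the continuous variable,
# multinomial for the categorical one
mod <- mix(list(poverty ~ 1, structure ~ 1),
           data = sim,
           nstates = 2,
           family = list(gaussian(), multinomial("identity")))

fm <- fit(mod, verbose = FALSE)
BIC(fm)
```

Note that the number of classes here is far smaller than the number of rows; with 52 rows and 52 classes every row would be its own class, which leaves nothing to estimate.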

AnnieG

R is not the best software for latent class analysis. I would recommend using one of the (paid) alternatives: Latent Gold or Mplus.

They both have extensions that let you combine continuous and categorical data in a latent class analysis. I know they can be quite expensive, but they are much faster and much more flexible than any of the R packages at the moment.

MartinBL
  • I was able to run a latent class model in R with both categorical and continuous variables. R has a package called multimix that allows you to do that @MartinBL – G.Aurelien Feb 27 '19 at 21:46