0

This might not be the right place to ask but I'm not sure where else to ask it. I'm trying to use the smbinning package. In particular, I'm trying to bin by multiple predictor variables. The issue is all the examples in the package documentation only deal with one predictor variable. I tried this naively:

result=smbinning(df=training,y="FlagGB",x=".,",p=.05)

which seemed to execute okay, but then if I tried to run result$ivtable I got the error

Error in result$ivtable : $ operator is invalid for atomic vectors

Does anyone know a) how to get smbinning to accept multiple predictors or, if it can't, which other package can; and b) how to resolve the specific error listed above?


4 Answers

1

I have solved the problem. It is probably because training is not a data frame; you have to convert it with as.data.frame(training). If you look at the smbinning code (https://github.com/cran/smbinning/blob/master/R/smbinning.R#L490), there is this block:

i=which(names(df)==y) # Find Column for dependant
j=which(names(df)==x) # Find Column for independant
if (!is.numeric(df[,i])) {
    return("Target (y) not found or it is not numeric")
}

Secondly, the target y (FlagGB) must be numeric. If your y variable is a factor, you have to convert it to numeric with as.numeric(as.character(y)) rather than calling as.numeric() directly, because as.numeric() on a factor returns the internal level codes instead of the original values. The problem is similar to the one in "Target (y) not found or it is not numeric" - Package smbinning - R.
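To make both points concrete, here is a minimal sketch in base R. The toy data are made up for illustration (only the column name FlagGB comes from the question), and the matrix case is just one example of an object that is not a data frame.

```r
# Toy target, made up for illustration: a 0/1 flag stored as a factor.
flag <- factor(c("0", "1", "1", "0"))

# as.numeric() on a factor returns the internal level codes, not the labels.
codes <- as.numeric(flag)

# Going through as.character() first recovers the labels as numbers.
values <- as.numeric(as.character(flag))

# A matrix has no names(), so a lookup by column name finds nothing.
m <- cbind(FlagGB = values, Score = c(10, 20, 30, 40))
idx_matrix <- which(names(m) == "FlagGB")

# After conversion to a data frame the same lookup finds the column.
training <- as.data.frame(m)
idx_df <- which(names(training) == "FlagGB")

print(codes)
print(values)
print(length(idx_matrix))
print(idx_df)
```

Here codes holds the level positions while values holds the original 0/1 flags, which is why the direct conversion can quietly change the target.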

0

Have you looked into the "Information" package? It seems to do the job, but there is no facility to recode the variable. Or if there is one, I haven't been able to find it. Otherwise, it is a really great package for exploring and analysing variables.

Ritesh
0

To answer b): print result at the console. You will most probably see that the function did not in fact execute, and the value returned will tell you the specific reason why.

Indeed, it is a bit confusing that the smbinning package returns its errors silently and within the variable itself.
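One way to surface such a message early is to check the type of the returned object before using $ on it. The sketch below assumes that a failed call comes back as a character string, which is what the "$ operator is invalid for atomic vectors" error suggests. check_binning is a hypothetical helper name, not part of smbinning, and the stand-in values are made up so the example runs without the package.

```r
# Hypothetical helper (not part of smbinning): stop with the returned
# message if the result is a character string instead of a list.
check_binning <- function(result) {
  if (is.character(result)) {
    stop("smbinning did not run: ", result, call. = FALSE)
  }
  result
}

# Stand-in for a failed call: a plain character string.
failed <- "Target (y) not found or it is not numeric"
msg <- tryCatch(check_binning(failed), error = function(e) conditionMessage(e))
print(msg)

# Stand-in for a successful call: a list passes through unchanged.
ok <- check_binning(list(ivtable = data.frame(Cutpoint = "Total")))
print(names(ok))
```

In real use you would wrap the actual call, for example check_binning(smbinning(...)), so the reason shows up as a proper error instead of a confusing failure later on.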

Question a), on the other hand, is hard to answer without looking at the data. You can try to cross/multiply your variables, but that may result in a very large number of factor levels. I would suggest that you apply the smbinning package to each of your characteristics separately, so that each one is grouped into a few bins, and then try to cross the groups.
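Binning each characteristic separately can be done with a loop over the predictor names. What follows is only a sketch: it assumes smbinning is installed, that your data has a numeric 0/1 target column, and that every other numeric column is a candidate predictor. bin_all is a hypothetical helper name, and the bin_fun argument is there only so that another binning function can be swapped in.

```r
# Hypothetical helper: call a binning function once per numeric predictor
# and collect the results in a named list. bin_fun defaults to
# smbinning::smbinning when that package is installed.
bin_all <- function(df, y, p = 0.05, bin_fun = NULL) {
  if (is.null(bin_fun)) {
    if (!requireNamespace("smbinning", quietly = TRUE)) {
      stop("Package 'smbinning' is not installed")
    }
    bin_fun <- smbinning::smbinning
  }
  df <- as.data.frame(df)
  # Candidate predictors: every numeric column except the target.
  xs <- setdiff(names(df)[vapply(df, is.numeric, logical(1))], y)
  results <- lapply(xs, function(x) bin_fun(df = df, y = y, x = x, p = p))
  names(results) <- xs
  results
}

# Usage, assuming a data frame called training with target FlagGB:
# results <- bin_all(training, y = "FlagGB")
# ok <- !vapply(results, is.character, logical(1))  # keep variables that ran
# lapply(results[ok], function(r) r$ivtable)
```

Variables for which the call returns a character message instead of a list can then be inspected one by one.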

Borislav Aymaliev
0

For question a), you can use the smbinning.sumiv function, which calculates the IV for all variables in one step. For example:

sumivt=smbinning.sumiv(chileancredit.train,y="FlagGB")

sumivt # Display table with IV by characteristic

dennis ding