I am using an Anaconda environment on Windows, with PyCaret installed, and PyCharm. I want to run a basic toy example with PyCaret (not using the freely available datasets): a simple y = mx + c, where x is 1-D.

Here is my working code with scikit-learn.

import numpy as np
from sklearn.linear_model import LinearRegression
import pandas as pd
x= np.arange(0,1000,dtype = 'float64')
Y = (x*2) + 1
X = x.reshape(-1,1)
reg = LinearRegression().fit(X, Y)
# scoring on the same data the model was fitted on gives a perfect score
score = reg.score(X,Y)
print('1- RSS/TSS: 1 for perfect regression=' + str(score))
print('coef =' + str(reg.coef_[0]))  # slope
print('intercept =' + str(reg.intercept_))  # intercept

This gives the expected results (the output was posted as a screenshot).

Now I create a DataFrame that I can pass to the PyCaret package.

data1 = np.vstack((x,Y)).transpose()
# create dataframe as required by Pandas
N= data1.shape[0]
# add first row
dat2 = np.array(['','Col1','Col2'])
for i in range(N):
    dat_row = list(data1[i,:].flatten())
    nm = ['row'+ str(i)]
    dat_row = nm + dat_row
    dat2 = np.vstack ((dat2, dat_row) )

df= pd.DataFrame(data=dat2[1:,1:],
                  index=dat2[1:,0],
                  columns=dat2[0,1:])
print(df)
print('***************************')
columns = df.applymap(np.isreal).all()
print(columns)
print('***************************')
# now, using Pycaret
from pycaret.regression import *
exp_reg = setup(df, html= False,target='Col2')
print('********************************')
compare_models()

When I do so, the numeric columns I created (x, Y) are shown as categorical. PyCaret also recognizes them as Categorical (shown in a screenshot in the original post). Why are they categorical? Can I change them to be treated as numeric?

Once I press Enter, PyCaret finally fails with an error (posted as a screenshot in the original post).

Any ideas about this error?


1 Answer

You can force the data types in PyCaret by passing the numeric_features and categorical_features parameters to the setup function.

For example:

clf1 = setup(data, target = 'target', numeric_features = ['X1', 'X2'])
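
A likely contributing cause, as far as can be told without the screenshots, is that the question's DataFrame is built from a NumPy array that mixes row labels (strings) with the numbers. NumPy then stores every value as a string, so pandas does not see numeric columns. The sketch below reproduces that effect in a simplified form and shows two ways to get numeric columns before calling setup. The names df_str, df_fixed, and df_direct are made up for this illustration, and the assumption that PyCaret infers Categorical from non-numeric dtypes is not verified here.

```python
import numpy as np
import pandas as pd

x = np.arange(0, 1000, dtype="float64")
Y = (x * 2) + 1

# Simplified version of the question's construction: each row mixes a
# string label with floats, so NumPy stores the whole array as strings.
rows = [["row" + str(i), x[i], Y[i]] for i in range(len(x))]
as_strings = np.array(rows)
print(as_strings.dtype.kind)  # 'U' means unicode strings

df_str = pd.DataFrame(as_strings[:, 1:],
                      index=as_strings[:, 0],
                      columns=["Col1", "Col2"])
print(df_str.dtypes)  # not a numeric dtype

# Option 1: convert the existing string columns back to numbers.
df_fixed = df_str.apply(pd.to_numeric)
print(df_fixed.dtypes)

# Option 2: build the DataFrame directly from the numeric arrays,
# keeping the labels in the index only.
df_direct = pd.DataFrame({"Col1": x, "Col2": Y},
                         index=["row" + str(i) for i in range(len(x))])
print(df_direct.dtypes)

# Either frame can then be passed to PyCaret as in the question, e.g.
# setup(df_direct, html=False, target='Col2')
```

With numeric dtypes in place, the numeric_features parameter can still be used as an explicit override if the type inference is wrong for some column.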