I'm trying to use BPSO (binary particle swarm optimization) for feature selection, but I get a TypeError:

TypeError: '(slice(None, None, None), array([ True, False, False, True, False, False, False, True, True, False, True, False, False, False, False, True, False, True, True, True]))' is an invalid key.

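import numpy as np
import pyswarms as ps
from sklearn import linear_model

# X (features) and y (labels) come from the cleaned, encoded data set
classifier = linear_model.LogisticRegression()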
def f_per_particle(m, alpha):
    total_features = 20
    if np.count_nonzero(m) == 0:
        X_subset = X
    else:
        X_subset = X[:,m==1]
    classifier.fit(X_subset, y)
    P = (classifier.predict(X_subset) == y).mean()
    # Compute for the objective function
    j = (alpha * (1.0 - P)
        + (1.0 - alpha) * (1 - (X_subset.shape[1] / total_features)))
    return j
def f(x, alpha=0.88):
    n_particles = x.shape[0]
    j = [f_per_particle(x[i], alpha) for i in range(n_particles)]
    return np.array(j)    
options = {'c1': 0.5, 'c2': 0.5, 'w': 0.9, 'k': 30, 'p': 2}
dimensions = 20  # dimensions should be the number of features
# optimizer.reset()  # only valid if an optimizer already exists from a previous run
optimizer = ps.discrete.BinaryPSO(n_particles=30, dimensions=dimensions, options=options)
cost, pos = optimizer.optimize(f, iters=1000)

I used the "bank-additional-full" data set, after cleaning the data and encoding the categorical fields.

r_selcuk_r

1 Answer

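Your X is a pandas DataFrame, and a DataFrame does not accept NumPy-style indexing such as X[:, m==1]; that is what raises the "invalid key" error. Convert the DataFrame to a NumPy array after loading it: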


X = X.values
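
To see why the conversion helps, here is a minimal sketch. It assumes a small made-up DataFrame (X_df, with hypothetical columns a to d) in place of the real bank data, and a made-up particle position (mask). It shows the DataFrame rejecting the NumPy-style key, the converted array accepting it, and .loc as an alternative if you would rather keep the DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x4 frame standing in for the cleaned, encoded X.
X_df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=["a", "b", "c", "d"])

# Hypothetical particle position, as BinaryPSO would pass to f_per_particle.
mask = np.array([1, 0, 0, 1])

# NumPy-style indexing directly on the DataFrame is rejected.
# The exact exception class varies between pandas versions, so catch broadly.
try:
    X_df[:, mask == 1]
    failed = False
except Exception:
    failed = True

# Option 1: convert to a NumPy array first (same idea as X = X.values).
X_arr = X_df.to_numpy()
subset_arr = X_arr[:, mask == 1]

# Option 2: keep the DataFrame and select columns with .loc and a boolean mask.
subset_df = X_df.loc[:, mask == 1]

print(failed)
print(subset_arr.shape)
print(list(subset_df.columns))
```

Either option gives f_per_particle a column subset that classifier.fit can use; converting once up front is the smaller change to the code in the question.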
KoKo