
I'm using the multiclass AdaBoost implementation by Xin Jin (https://github.com/jinxin0924/multi-adaboost/tree/master) and I want to use the SAMME algorithm to solve my problem. Inside the method discrete_boost I want to estimate the error as the absolute difference between y_pred and y, divided by the number of classes minus one, to get better classification on ordinal problems, but I'm doing it wrong and I don't know why. Here is the method and the line I'm changing:

def discrete_boost(self, X, y, sample_weight):
        estimator = deepcopy(self.base_estimator_)
        if self.random_state_:
            estimator.set_params(random_state=1)

        estimator.fit(X, y, sample_weight=sample_weight)

        y_pred = estimator.predict(X)
        incorrect = y_pred != y
        #estimator_error = np.dot(incorrect, sample_weight) / np.sum(sample_weight, axis=0) ORIGINAL
        estimator_error = np.sum(np.abs(y_pred - y))/(self.n_classes_ - 1) # WHAT I WANT TO BE THE ERROR

        # if worse than random guess, stop boosting
        if estimator_error >= 1 - 1 / self.n_classes_:
            return None, None, None

        # update estimator_weight
        estimator_weight = self.learning_rate_ * np.log((1 - estimator_error) / estimator_error) + np.log(
            self.n_classes_ - 1)

        if estimator_weight <= 0:
            return None, None, None

        # update sample weight
        sample_weight *= np.exp(estimator_weight * incorrect)

        sample_weight_sum = np.sum(sample_weight, axis=0)
        if sample_weight_sum <= 0:
            return None, None, None

        # normalize sample weight
        sample_weight /= sample_weight_sum

        # append the estimator
        self.estimators_.append(estimator)

        return sample_weight, estimator_weight, estimator_error

The line I changed is estimator_error = np.sum(np.abs(y_pred - y)) / (self.n_classes_ - 1), but it doesn't work: it fails with the error "axis 1 is out of bounds for array of dimension 1".
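
For clarity, here is a standalone sketch of the kind of error I'm trying to compute: a per-sample absolute class distance, weighted by the boosting sample weights and scaled by the maximum possible distance so it stays in [0, 1] like the original weighted misclassification rate (the function name and exact normalization are mine, not taken from the repo):

import numpy as np

def ordinal_error(y, y_pred, sample_weight, n_classes):
    # flatten to 1-D in case the labels arrive as column vectors
    y = np.asarray(y).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # per-sample error in [0, 1]: absolute class distance over the maximum possible distance
    distance = np.abs(y_pred - y) / (n_classes - 1)
    # weight by the current boosting weights and normalize, mirroring the original
    # np.dot(incorrect, sample_weight) / np.sum(sample_weight) computation
    return np.dot(distance, sample_weight) / np.sum(sample_weight)

With this normalization the error stays comparable to the 1 - 1 / self.n_classes_ random-guess threshold used later in discrete_boost.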

I'm testing it on non-ordinal data just to check that it works:

from sklearn.datasets import make_gaussian_quantiles
from sklearn.model_selection import train_test_split

X, y = make_gaussian_quantiles(n_samples=13000, n_features=10,
                               n_classes=3, random_state=1)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
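
For reference, this is how I fit and score on that split. I'm showing scikit-learn's stock AdaBoostClassifier with algorithm='SAMME' as a stand-in here; as far as I can tell the repo's class exposes a similar fit/predict interface, so the modified class can be swapped in the same way:

from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score

# stand-in booster; replace with the repo's classifier to exercise the modified discrete_boost
clf = AdaBoostClassifier(n_estimators=50, learning_rate=1.0, algorithm='SAMME')
clf.fit(X_train, y_train)
print(accuracy_score(y_test, clf.predict(X_test)))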
