
I have a dataframe df that holds Spotify track features. When I fit a RandomForestClassifier, I get the expected feature importance plot, but when I fit a RandomForestRegressor, the plot shows only a single bar, for popularity. Can someone help?

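from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import train_test_split
from yellowbrick.model_selection import FeatureImportances

# Load the data set (assumes df and the features list already exist)
X = df[features]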
y = df.popularity

train_X, test_X, train_y, test_y = train_test_split(X,y, test_size= 0.1, random_state=38)

model = RandomForestClassifier(n_estimators=10)
# model = RandomForestRegressor(n_estimators=10)

viz = FeatureImportances(model)
viz.fit(X, y)
viz.show()
unaied

1 Answer


I repeated the experiment above with the Spotify dataset and was able to use RandomForestRegressor with Yellowbrick's FeatureImportances visualizer (see image below). I suggest updating Yellowbrick to the latest version, which was released on Feb 9th:

pip install -U yellowbrick
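Before upgrading, it may help to confirm which versions are actually installed in the environment that runs the code. The following is a minimal sketch using only the standard library; the helper name installed_version is made up for illustration and is not part of Yellowbrick or scikit-learn.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent.

    Hypothetical helper for illustration; not part of any library.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Print the versions relevant to this question (None means "not installed").
print("yellowbrick:", installed_version("yellowbrick"))
print("scikit-learn:", installed_version("scikit-learn"))
```

If the printed Yellowbrick version does not change after running the pip command, the upgrade probably went into a different Python environment than the one running the notebook or script.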

from yellowbrick.model_selection import FeatureImportances
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Load spotify Data Set
df = pd.read_csv('data.csv.zip')

df = df[['acousticness', 'danceability', 'duration_ms', 'energy',
         'explicit', 'instrumentalness', 'liveness', 'loudness',
         'popularity','speechiness', 'tempo']]

X = df.drop('popularity', axis=1)
y = df.popularity

train_X, test_X, train_y, test_y = train_test_split(X,y, test_size= 0.1, random_state=38)

#model = RandomForestClassifier(n_estimators=10)
model = RandomForestRegressor(n_estimators=10)

viz = FeatureImportances(model)
viz.fit(X, y)
viz.show()

[Image: feature importance plot from the RandomForestRegressor run]
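To check whether the problem lies in the model or in the plotting layer, one option is to read the importances straight from the fitted regressor, without Yellowbrick. This is a sketch under stated assumptions: the data is synthetic and only borrows three column names from the dataset, so the numbers say nothing about the real Spotify features.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the Spotify features (made-up data, not the real set).
rng = np.random.default_rng(38)
feature_names = ["energy", "loudness", "tempo"]
X = rng.normal(size=(500, 3))
# The target depends strongly on the first column and weakly on the second.
y = 5.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)

model = RandomForestRegressor(n_estimators=10, random_state=38)
model.fit(X, y)

# feature_importances_ holds one score per input column.
importances = dict(zip(feature_names, model.feature_importances_))
for name, score in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

If the regressor reports one score per feature here but the Yellowbrick plot still shows a single bar, that points at the visualizer version rather than at the model.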

Dharman
larrywgray