
I'm working on an anomaly detection system and would like to migrate a matplotlib chart to an hvPlot one, so that I can hover over items and see their label names.

Could someone help me with equivalent hvPlot code?

Below is the chart that I would like to reproduce with hvPlot:

enter image description here

And the code to create it:


import matplotlib
import matplotlib.pyplot as plt
import numpy as np

# X, clf, scores_pred and outliers_fraction are defined earlier

# pca-one - inlier feature 1, pca-two - inlier feature 2
inliers_pca_one = X['pca-one'][X['outlier'] == 0].values.reshape(-1, 1)
inliers_pca_two = X['pca-two'][X['outlier'] == 0].values.reshape(-1, 1)

# pca-one - outlier feature 1, pca-two - outlier feature 2
outliers_pca_one = X['pca-one'][X['outlier'] == 1].values.reshape(-1, 1)
outliers_pca_two = X['pca-two'][X['outlier'] == 1].values.reshape(-1, 1)

plt.figure(figsize=(8, 8))

xx, yy = np.meshgrid(np.linspace(0, 1, 100), np.linspace(0, 1, 100))

# Use threshold value to consider a datapoint inlier or outlier
# threshold = stats.scoreatpercentile(scores_pred, 100 * outliers_fraction)
threshold = np.percentile(scores_pred, 100 * outliers_fraction)

# decision function calculates the raw anomaly score for every point
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()]) * -1
Z = Z.reshape(xx.shape)

# fill blue colormap from minimum anomaly score to threshold value
plt.contourf(xx, yy, Z, levels=np.linspace(Z.min(), threshold, 7), cmap=plt.cm.Blues_r)

# draw red contour line where anomaly score is equal to threshold
a = plt.contour(xx, yy, Z, levels=[threshold], linewidths=2, colors='red')

# fill orange region where anomaly score ranges from threshold to maximum anomaly score
plt.contourf(xx, yy, Z, levels=[threshold, Z.max()], colors='orange')

# inliers in white, outliers in black
b = plt.scatter(inliers_pca_one, inliers_pca_two, c='white', s=20, edgecolor='k')
c = plt.scatter(outliers_pca_one, outliers_pca_two, c='black', s=20, edgecolor='k')
       
plt.axis('tight')
# legend_elements() is used instead of a.collections[0], which newer Matplotlib versions no longer provide
plt.legend([a.legend_elements()[0][0], b, c], ['learned decision function', 'inliers', 'outliers'],
           prop=matplotlib.font_manager.FontProperties(size=20), loc='lower right')

plt.xlim((0, 1))
plt.ylim((0, 1))
plt.title('Cluster-based Local Outlier Factor (CBLOF)')
plt.show()

Kleyson Rios
  • hvPlot implements the same API as pandas' .plot(), which is based on Matplotlib but doesn't use Matplotlib's native API. As a rule of thumb, if you can create a plot using pandas .plot() alone, without any Matplotlib-native API, it should be easy enough to make it using .hvplot() instead. Here it looks like it would be difficult to express this using .plot() alone, so you should expect it to be more difficult than simply using .hvplot(). I think the key here is to see it as an overlay of 3 or 4 plots, then use .hvplot() to get each separate plot p1, p2, p3, etc., then overlay them as p1*p2*p3. – James A. Bednar Feb 09 '21 at 20:55
  • Specifically, use .hvplot.scatter either once (with a `by=` to cover the inlying and outlying points) or twice (once for the inliers and once for the outliers), then overlay that on a .hvplot.contour and overlay the decision boundary on top of all that. – James A. Bednar Feb 09 '21 at 20:57
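
The overlay approach suggested in the comments could look roughly like the sketch below. It is untested against the original data and rests on several assumptions: the score grid is wrapped in an xarray DataArray so that `hvplot.xarray` can draw the contours, the hover label lives in a column called `label` (a hypothetical name, since the question does not show one), and `demo()` uses a stand-in distance function instead of the fitted `clf`. The orange band from the matplotlib version is left out for brevity.

```python
import numpy as np
import pandas as pd


def score_grid(decision_function, n=100):
    """Evaluate the negated decision function on an n x n grid over [0, 1]^2."""
    axis = np.linspace(0, 1, n)
    xx, yy = np.meshgrid(axis, axis)
    # Same construction as Z in the matplotlib code
    Z = decision_function(np.c_[xx.ravel(), yy.ravel()]) * -1
    return axis, Z.reshape(xx.shape)


def build_overlay(X, axis, Z, threshold, label_col="label"):
    """Overlay filled contours, the threshold line and the hoverable points."""
    # Imported here so score_grid stays usable without the plotting stack
    import xarray as xr
    import hvplot.pandas  # noqa: F401  (adds .hvplot to DataFrames)
    import hvplot.xarray  # noqa: F401  (adds .hvplot to DataArrays)

    # Rows of Z follow yy (pca-two), columns follow xx (pca-one)
    scores = xr.DataArray(
        Z,
        coords={"pca-two": axis, "pca-one": axis},
        dims=["pca-two", "pca-one"],
        name="score",
    )
    # p1: blue filled contours from the minimum score up to the threshold
    background = scores.hvplot.contourf(
        x="pca-one", y="pca-two",
        levels=list(np.linspace(Z.min(), threshold, 7)),
        cmap="Blues_r",
    )
    # p2: single red contour line at the threshold
    boundary = scores.hvplot.contour(
        x="pca-one", y="pca-two", levels=[threshold], cmap=["red"],
    ).opts(line_width=2)
    # p3: one scatter, split by the 0/1 outlier flag; hover shows label_col
    points = X.hvplot.scatter(
        x="pca-one", y="pca-two", by="outlier", hover_cols=[label_col],
    )
    return background * boundary * points


def demo():
    """Build the overlay from synthetic data (stand-in for the real X and clf)."""
    def stand_in(p):
        # Assumption: distance from the centre replaces clf.decision_function
        return np.linalg.norm(p - 0.5, axis=1)

    rng = np.random.default_rng(0)
    pts = rng.random((200, 2))
    scores_pred = stand_in(pts) * -1
    threshold = np.percentile(scores_pred, 10)
    X = pd.DataFrame({
        "pca-one": pts[:, 0],
        "pca-two": pts[:, 1],
        "outlier": (scores_pred < threshold).astype(int),
        "label": [f"item-{i}" for i in range(len(pts))],
    })
    axis, Z = score_grid(stand_in)
    return build_overlay(X, axis, Z, threshold)
```

In a notebook, evaluating `demo()` in a cell should display the overlay. With the real data, pass `clf.decision_function` to `score_grid` and the existing `X` and `threshold` to `build_overlay`, setting `label_col` to whichever column holds the names to show on hover.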
