(Moved to https://stats.stackexchange.com/questions/554136/seaborn-lmplot-regplot-no-fit-for-logistic-all-nan-slice-encountered)

I am trying to create a regression plot for some data using Seaborn's lmplot. However, Seaborn does not draw the regression line; only the points are plotted, and I get warnings that I don't understand:

~/.local/lib/python3.9/site-packages/statsmodels/genmod/families/links.py:187: RuntimeWarning: overflow encountered in exp
  t = np.exp(-z)
~/.local/lib/python3.9/site-packages/numpy/lib/nanfunctions.py:1395: RuntimeWarning: All-NaN slice encountered
  result = np.apply_along_axis(_nanquantile_1d, axis, a, q,

Using the 'tips' dataset, I found that this error seems to appear when the two classes are perfectly separated (see image below). (I know that thresholding the same variable I am regressing against makes no sense, but it produces a dataset similar to the one I have.)
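One plausible reading of the overflow warning: with perfectly separated classes, the logistic log-likelihood has no finite maximum, so an iterative fitter keeps steepening the curve until `exp` overflows. The following is a minimal sketch in plain Python, not what statsmodels does internally; `log_likelihood` is a hypothetical helper and the six data points are made up for illustration.

```python
import math

def log_likelihood(xs, ys, slope, threshold):
    """Bernoulli log-likelihood of a logistic curve centred on `threshold`."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = slope * (x - threshold)
        signed = z if y else -z  # positive when the point is on the correct side
        # log(sigmoid(signed)), written so that exp never overflows
        if signed >= 0:
            total += -math.log1p(math.exp(-signed))
        else:
            total += signed - math.log1p(math.exp(signed))
    return total

# Made-up data, separated at 20 just like `total_bill > 20` in the question.
xs = [10, 15, 18, 22, 25, 30]
ys = [x > 20 for x in xs]

# A steeper curve always fits better, so the "best" slope is infinite.
lls = [log_likelihood(xs, ys, slope, threshold=20) for slope in (1, 10, 100)]
print(lls)
```

Each value in `lls` is closer to zero than the previous one, which is why no finite coefficient is ever optimal for data like this.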

Here is some code to reproduce, modified from https://seaborn.pydata.org/generated/seaborn.regplot.html

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

tips = sns.load_dataset("tips")

tips["big_tip"] = tips.total_bill > 20
tips = tips.head(30)

sns.lmplot(x='total_bill', y='big_tip', data=tips, logistic=True)

plt.show()

This is the resulting image, where I expected to see the regression line: [image]

Python 3.9.7, Seaborn 0.11.2, NumPy 1.21.1

  • Logistic regression needs `statsmodels`. Maybe there is a problem with that library? It would also help to show the complete error trace to get a better idea of where the error originates. – JohanC Nov 29 '21 at 15:18
  • The error trace is complete, I only changed my home folder name to ~ – mnzl Nov 29 '21 at 15:23
  • Interestingly, `tips["big_tip1"] = (tips.tip / tips.total_bill) > .175` works and `tips["big_tip2"] = tips.total_bill > 20` does not – Trenton McKinney Nov 29 '21 at 15:34
  • Yes, with the other calculation (tip/bill > .175) the two sets overlap. With the latter (> 20) they are perfectly separated. This seems to be causing the issue – mnzl Nov 29 '21 at 15:45
  • So this seems like a logistic regression question, which might be better suited to https://stats.stackexchange.com/ – Trenton McKinney Nov 29 '21 at 15:48
  • Copied it to SE: https://stats.stackexchange.com/questions/554136/seaborn-lmplot-regplot-no-fit-for-logistic-all-nan-slice-encountered – mnzl Nov 29 '21 at 15:56
  • @JohanC Removing the `head` call does not change anything; it then uses 244 values and still gives the same error and result – mnzl Nov 29 '21 at 15:58
  • See also [PerfectSeparationError: Perfect separation detected, results not available](https://stackoverflow.com/questions/53041669/error-perfectseparationerror-perfect-separation-detected-results-not-availab) and [Getting "Perfect separation detected, results not available" while building the Logistic Regression model](https://stackoverflow.com/questions/61173135/getting-perfect-separation-detected-results-not-available-while-building-the) – JohanC Nov 29 '21 at 23:43
  • Yes, almost certainly some bootstrap samples are perfectly separated. You can turn bootstrapping off (`ci=None`), although then you won't get the error bars on the regression line. – mwaskom Nov 30 '21 at 00:09
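
The point about bootstrap resamples can be checked without Seaborn. Below is a minimal sketch in plain Python, assuming a single numeric predictor and a boolean outcome. `is_perfectly_separated` and `count_separated_resamples` are hypothetical helpers, not Seaborn or statsmodels functions, and the toy data are made up for illustration (two points deliberately overlap).

```python
import random

def is_perfectly_separated(xs, ys):
    """True if one threshold on x splits the two classes with no overlap."""
    neg = [x for x, y in zip(xs, ys) if not y]
    pos = [x for x, y in zip(xs, ys) if y]
    if not neg or not pos:
        # Degenerate case: only one class present; treated as separated here.
        return True
    return max(neg) < min(pos) or max(pos) < min(neg)

def count_separated_resamples(xs, ys, n_boot=1000, seed=0):
    """Draw bootstrap resamples (with replacement) and count the separated ones."""
    rng = random.Random(seed)
    n = len(xs)
    count = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        if is_perfectly_separated([xs[i] for i in idx], [ys[i] for i in idx]):
            count += 1
    return count

# Made-up data: the classes overlap only at x=19 (True) and x=21 (False).
xs = [10, 12, 15, 18, 19, 21, 22, 25, 30, 35]
ys = [False, False, False, False, True, False, True, True, True, True]

print(is_perfectly_separated(xs, ys))     # the full data overlap
print(count_separated_resamples(xs, ys))  # yet many resamples do not
```

A nonzero count means some of the fits behind the confidence band are degenerate even though the full data are not. For the data in the question, where `big_tip` is defined as `total_bill > 20`, the check is true for the full dataset and therefore for every resample.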
