I'm trying to visualize the correlation between two columns in my dataset. I tried plot() and a scatterplot, but the result is not a readable graph. For example, I used this call:

scatter.smooth(x=Lifestyles$SLEEP_HOURS, y=Lifestyles$SUFFICIENT_INCOME, main="sleep hours and Income", xlab = "Sleep hours", ylab = "income, 1,2")

About the dataset: it has about 12,000 observations and 20 columns. Both columns are numeric and hold integer values.

Here, I'm trying to compare the number of sleep hours with the number of tasks completed daily.

Link to my dataset: https://www.kaggle.com/ydalat/lifestyle-and-wellbeing-data

Thank you all in advance!

  • Your data are coded as integer values, so the number of possible values is small and the points plot on top of each other. This may be because the data are encoded into categories. If so, the plot may not be very meaningful. Try looking at `table(Lifestyles$SLEEP_HOURS, Lifestyles$SUFFICIENT_INCOME)`. Or you can use the `jitter()` function to fuzz the data so the points do not overplot. – dcarlson Jun 12 '20 at 04:14
  • Thank you for your time and effort! I will give it a try! – Nilyufar Babajanova Jun 12 '20 at 18:55
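
The two suggestions from the comment (a `table()` cross-tabulation and `jitter()`) could look roughly like the sketch below. It is only an illustration under assumptions: the `Lifestyles` data frame here is simulated, with SLEEP_HOURS drawn from 1 to 10 and SUFFICIENT_INCOME from 1 to 2, so that the snippet runs without the Kaggle file. The value ranges, the semi-transparent point colour and the `counts` name are choices made for this example, not something stated in the question.

```r
# Simulated stand-in for the real data (assumption: the actual columns
# would come from the Kaggle CSV and may have different value ranges).
set.seed(42)
n <- 12000
Lifestyles <- data.frame(
  SLEEP_HOURS       = sample(1:10, n, replace = TRUE),
  SUFFICIENT_INCOME = sample(1:2,  n, replace = TRUE)
)

# Cross-tabulate the two integer-coded columns to see how many
# observations fall into each combination of values.
counts <- table(Lifestyles$SLEEP_HOURS, Lifestyles$SUFFICIENT_INCOME)
print(counts)

# Add a small amount of random noise to both axes so that identical
# integer pairs no longer land on exactly the same pixel.
# The semi-transparent colour makes dense regions appear darker.
plot(
  jitter(Lifestyles$SLEEP_HOURS),
  jitter(Lifestyles$SUFFICIENT_INCOME),
  pch  = 16,
  col  = rgb(0, 0, 0, 0.1),
  main = "Sleep hours and income",
  xlab = "Sleep hours",
  ylab = "Income (1, 2)"
)
```

With only two distinct income values, the table is probably the more informative of the two outputs; the jittered plot mainly shows where the observations pile up.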

0 Answers