Questions tagged [pearson]

In statistics, Pearson's r, the Pearson product-moment correlation coefficient, measures the strength and direction of the linear relationship between two data sets on a scale from -1 to 1.

Overview

The Pearson product-moment correlation coefficient is given by the following equation:

$$\rho_{XY} = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\,\operatorname{Var}(Y)}}$$

where

ρXY = Pearson's correlation coefficient;
Cov(X,Y) = covariance of random variables X and Y;
Var(X) = variance of random variable X;
Var(Y) = variance of random variable Y.
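
As a rough illustration of the definition above, here is a minimal Python sketch that computes r as the covariance divided by the square root of the product of the two variances. The function name `pearson_r` and the sample lists are hypothetical, chosen only for this example; it uses population (divide-by-n) estimates, and the 1/n factors cancel in the ratio.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences (illustrative helper)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population covariance and variances, matching Cov(X,Y), Var(X), Var(Y)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov_xy / math.sqrt(var_x * var_y)


# Made-up sample data with a perfect increasing linear relationship
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]
print(pearson_r(xs, ys))
```

In practice, library routines such as `numpy.corrcoef` or `scipy.stats.pearsonr` are usually preferable; the sketch is only meant to make the formula concrete.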


Tag usage

Questions on this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.

155 questions
0
votes
1 answer

pearson correlation for genes in gene expression data

I have two datasets: one contains actual counts and the other contains predicted counts. I want to compute the Pearson correlation between them. My actual count data look like this: My predicted count data look like this: I want to do Pearson correlation for these…
Rhea Bedi
  • 123
  • 6
0
votes
1 answer

Problem with creating Pearson correlation coefficient in python

Problem: Computing the Pearson correlation coefficient depending on the values of a third column. To start with, I have a dataframe with 3 columns: A, B and C. Columns A and B contain float64 values, whereas C contains objects. I want to get the Pearson…
NND
  • 23
  • 4
0
votes
0 answers

NAs in cor() function

I'm trying to calculate the Pearson correlation for my data. So far I have successfully done it for daily and hourly returns; however, I'm struggling with minute returns. This is what I did: cor_df_min <- merge(usdt_min_return_cor, pax_min_return_cor,…
Nina
  • 11
  • 2
0
votes
1 answer

Adapt method cor.test for each data frame in a list

I want to adapt the method in cor.test in R for each data frame in a list of data frames. data(iris) iris.lst <- split(iris[, 1:2], iris$Species) options(scipen=999) normality1 <- lapply(iris.lst, function(x) shapiro.test(x[,1])) p1 <-…
Nadiine El Nino
  • 339
  • 1
  • 6
0
votes
1 answer

Most efficient way to calculate correlation matrix in python

I need to calculate the sales correlation of 5000 products, which will result in a 5000 by 5000 correlation matrix. I am trying to accomplish this in pandas using df.corr() but it is causing memory issues. Any ideas of more efficient ways to achieve…
Eric
  • 1
  • 2
0
votes
1 answer

"contrib" scale in PCA plot

What does the contrib scale indicate in a PCA plot?
karadeniz
  • 5
  • 2
0
votes
1 answer

Looking for efficient way to get pearsonr between two pandas columns

I am trying to find a way to get the Pearson correlation and p-value between two columns in a dataframe when a third column meets certain conditions. df…
jhaeckl
  • 1
  • 1
0
votes
1 answer

Neo4j Collaborative Filtering (CF) recommendation query using Pearson

Hi everyone at Stack Overflow, I want to understand a query that uses Pearson. What can nom and denom be? What are r1: r1 and r2: r2? And I don't understand what r.r1.rating and r.r2.rating are. This query should be recommending Movies that are rated…
Anna
  • 1
  • 1
  • 4
0
votes
1 answer

Calculate Pearson's Coefficient for Multidimensional features

I have a pandas dataframe where each row corresponds to one sample and each column represents one feature. Now one of my columns is a string column which contains text like "This is a red apple". How can I convert this to a form that pearson's…
newbie
  • 3
  • 1
0
votes
0 answers

Unexpected clusterings using same distance (pearson)

I'm playing with some dummy data to test clustering based on correlation distance (Pearson). I'm calculating the Pearson correlation two ways, one inside the pheatmap function and the other outside using cor(); both ways I would expect to retrieve same…
HeyHoLetsGo
  • 137
  • 1
  • 14
0
votes
1 answer

Pearson correlation and p-value in columns from a data frame

For example, if we calculate Pearson correlation and P-value of first two variables of data set mtcars, results are something like this: Correlation value: mpg disp mpg 1.00 -0.85 disp -0.85 1.00 P-value: mpg disp mpg…
Khashi
  • 47
  • 1
  • 7
0
votes
0 answers

Check origin of correlation between categorical variables

I have two variables, status (tells if the patient is new, recidivist...) and result. There are 5 types of status and 10 types of results. I already did the Chi-squared test and Cramer's V test and the two variables are dependent. However, I want…
Rodf
  • 11
  • 1
0
votes
1 answer

(Pearson's) Correlation loop through the data frame

I have a data frame with 159 obs and 27 variables, and I want to correlate all 159 obs from column 4 (variable 4) with each one of the following columns (variables), that is, correlate column 4 with 5, then column 4 with 6 and so on... I've been…
0
votes
1 answer

Pearson Correlation problem

I'm not sure which figures to use below in a problem I'm trying to solve that involves using the Pearson correlation formula. A B C D E F Bob 4 5 4 2 Fra 2 2 2 3 2 Lee 2 4 3 5 Cha 5 4 4 1 "Describe a…
0
votes
1 answer

Calculate the pearson coefficient between two lists

I have two different lists (a and b) containing 626257 vectors, each vector containing 44 numeric entries. One list contains sample data and the other list serves as a reference. Now I want to calculate the Pearson correlation between all the…
stefx
  • 25
  • 10