Questions tagged [pearson]

In statistics, Pearson's r, the Pearson product-moment correlation coefficient, measures the strength and direction of the linear relationship between two data sets on a scale from -1 to 1.

Overview

The Pearson product-moment correlation coefficient is given by the following equation:

$$\rho_{XY} = \frac{\operatorname{Cov}(X,Y)}{\sqrt{\operatorname{Var}(X)\,\operatorname{Var}(Y)}}$$

where

ρXY = Pearson's correlation coefficient;
Cov(X,Y) = covariance of random variables X and Y;
Var(X) = variance of random variable X;
Var(Y) = variance of random variable Y.
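
As a rough illustration of the definition above, the following minimal Python sketch computes the sample coefficient directly from the covariance and the two variances. The function name `pearson_r` is hypothetical and chosen only for this example; it assumes two equal-length sequences of numbers and uses sample estimates in place of the population quantities.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length numeric sequences.

    Hypothetical helper for illustration; not taken from any library.
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Unnormalized covariance and variances: the 1/(n-1) factors
    # cancel in the ratio Cov(X,Y) / sqrt(Var(X) * Var(Y)).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")

    return cov_xy / sqrt(var_x * var_y)


if __name__ == "__main__":
    print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # perfectly increasing pair
    print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # perfectly decreasing pair
```

Raising an error for a constant sequence is a design choice for this sketch; some libraries return NaN in that case instead.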


Tag usage

Questions with this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning, and data analysis.

155 questions
1
vote
1 answer

How to plot a Pearson correlation given a time series?

I am using the code on this website http://blog.chrislowis.co.uk/2008/11/24/ruby-gsl-pearson.html to implement a Pearson correlation for two time series, like so: require 'gsl' pearson_correlation = GSL::Stats::correlation( …
jdkealy
  • 4,807
  • 6
  • 34
  • 56
0
votes
1 answer

What is the motivation behind Pearson's coefficient in Apache Mahout?

This question is with respect to the Recommendation part of Apache Mahout using Pearson's coefficient for measuring similarity between users. According to my understanding, here is how Pearson's coefficient measures similarity between users. Let's…
London guy
  • 27,522
  • 44
  • 121
  • 179
0
votes
1 answer

Finding dissimilar dimensions in a feature vector in Mahout

If I use a similarity-based algorithm such as the Pearson correlation score to compare two feature vectors, and I want to know which dimensions/feature fields are most dissimilar across the feature set, what algorithm should I use? I…
seahorse
  • 2,420
  • 4
  • 31
  • 40
0
votes
2 answers

Pearson Correlation without using zero element in Matlab

I have 2 example vectors in Matlab: A = [5,3,3,0,4,1,5,0,2,5,5,0,5,3,4,0,1,4,4,0,4,2]; B = [1,0,0,0,1,0,4,0,0,0,0,4,4,0,1,0,0,0,0,0,0,0]; When I calculate the Pearson correlation manually and in Excel, I get the same result…
0
votes
1 answer

Calculating Correlation between genes of different treatments

I have gene expression data in triplicate (four genes, say g1, g2, g3, g4) in two conditions, control and treatment. I would like to calculate the correlation between genes of control and treatment. I have written code in R to calculate the correlation and…
Retsi
  • 45
  • 3
0
votes
0 answers

R k-means cluster with pearson (ClusterR)

I'm trying to replicate a feature in Systat, which is k-means clustering with Pearson correlation. I've attempted to use the package called ClusterR as it allows me to modify some parameters. I have a file with 96 rows and five columns, one from…
Stef
  • 1
  • 2
0
votes
0 answers

Cross-lagged Pearson correlation in R

I have a dataset where I have two recordings (sessions) of two different variables. set.seed(123) data <- data.table( id = rep(1:20, each = 2), session = rep(1:2, times = 20), var1 = sample(1:100, size = 40, replace = TRUE, var2 =…
Inkling
  • 469
  • 1
  • 4
  • 19
0
votes
0 answers

Create a function to generate a set of Pearson Correlations for a data file without use of libraries

I am trying to develop a function that generates a Pearson correlation coefficient for every pair of columns in a CSV data set. The function needs to return: a list of tuples, each tuple containing two column names and then the Pearson…
d_allen
  • 13
  • 1
0
votes
0 answers

Pearson correlation function for multidimensional array

I want to compute the Pearson correlation for an array of signals. So, for instance, if I have a list of 10 signals with 50 points each and I want to compare them to my reference signal, which also has 50 points, how can I do that without using a for loop? It's…
0
votes
0 answers

How to compute the coefficient of determination with respect to the 1:1 line in Python?

I am working with time-series in which peaks can be observed that are (often) approximately Gaussian shaped. To each of these peaks, I fit a Gaussian curve and want to assess how well this Gaussian curve fits the actual data. For this assessment, I…
Misterrik
  • 17
  • 2
0
votes
0 answers

Standardized tails of Pearson curves

I am working on process capability analysis for non-normal data, and I need the standardized tails of Pearson curves, which are used in Clements' method to approximate the upper and lower percentiles based on the values of kurtosis and…
0
votes
0 answers

Different Correlation Coefficient for Different Time Ranges

I built a DataFrame containing the following data: Daily Price of Gas Future of N Day; Daily Price of Petroil Future of N Day; Daily Price of Day-Ahead Electricity Market in Italy. The data cover the 2010 to 2022 time range, so 12…
0
votes
0 answers

Calculate pearson correlation coefficients

• There are 1000 variables in Problem dataSet a) Which two variables have the greatest negative Pearson correlation coefficient? b) Which two variables have the greatest positive Pearson correlation coefficient? c) Which two variables have the…
Anil Khatal
  • 9
  • 1
  • 4
0
votes
0 answers

Memory error calculating Pearson correlation on huge dask.dataframe

I have a huge dataset (9M rows * 125 cols) and I am trying to find a way to calculate the Pearson correlation for a variable vs rest. I receive memory error for this task. I tried dask as a solution as suggested in [enter link description here][1]…
Ali Alami
  • 1
  • 1
0
votes
0 answers

Performance of calculating Pearson coefficient of one vector with n vectors

I have one vector x and n vectors Y (n >= 10,000,000). Each vector is of size 4000. I need to compute corr(x, yi) for every yi, so the result is of size n. If each correlation is calculated one by one, it takes a lot of time and cannot be finished in 1 min.…
whogiawho
  • 466
  • 9
  • 21