Questions tagged [chi-squared]

Anything related to the chi-squared probability distribution or the chi-squared statistical test (typically of distribution, independence, or goodness of fit).

In probability theory and statistics, the chi-squared (χ²) distribution with k degrees of freedom is the distribution of a sum of the squares of k independent standard normal random variables. It is one of the most widely used probability distributions in inferential statistics (for example, in hypothesis testing or in the construction of confidence intervals).
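
As a rough illustration of that definition, the sketch below draws values by summing the squares of k independent standard normal variates and compares the sample mean and variance with the theoretical values k and 2k. The function names, the seed, and the number of draws are illustrative choices, not part of the tag description.

```python
import random


def chi_squared_sample(k, rng):
    """Draw one value: the sum of squares of k independent standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def simulate_mean_and_variance(k, n_draws=20000, seed=0):
    """Return the sample mean and sample variance of n_draws simulated values."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    draws = [chi_squared_sample(k, rng) for _ in range(n_draws)]
    mean = sum(draws) / n_draws
    variance = sum((d - mean) ** 2 for d in draws) / (n_draws - 1)
    return mean, variance


mean, variance = simulate_mean_and_variance(k=3)
# Theory for k = 3: mean 3, variance 6; the simulated values should be close.
print(f"sample mean = {mean:.2f}, sample variance = {variance:.2f}")
```

With more draws the estimates should settle closer to k and 2k.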


Tag usage

Questions on this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.

643 questions
2
votes
1 answer

How to use sklearn (chi-square or ANOVA) to remove redundant features

In the feature selection step we want to identify relevant features and remove redundant features. From my understanding, redundant features are dependent features (so we want to keep only features that are independent of one another). My…
2
votes
1 answer

P-value from Chi sq test using Scipy

I am computing a test statistic that is distributed as a chi square with 1 degree of freedom. I am also computing P-value corresponding to this using two different techniques from scipy.stats. I have observations and expected values as numpy…
Siddharth Satpathy
  • 2,737
  • 4
  • 29
  • 52
2
votes
1 answer

Excel statisticals: How to calculate p-value of a 2x2 contingency table?

Given data such as:

       A           B        C
  1                Group 1  Group 2
  2    Property 1  56       651
  3    Property 2  97       1,380

how can one calculate the p-value (i.e., the "right-tail" probability of the chi-squared…
NewSites
  • 1,402
  • 2
  • 11
  • 26
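
For the 2x2 table in the excerpt above (56, 651 / 97, 1,380), one possible way to get the right-tail probability outside Excel is sketched below in plain Python. It assumes the Pearson chi-squared statistic without a continuity correction, and it uses the fact that for 1 degree of freedom the right-tail probability equals erfc(sqrt(x / 2)). The function name `chi2_2x2` is illustrative and not taken from the question.

```python
import math


def chi2_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 table of counts.

    No continuity (Yates) correction is applied; that is an assumption here.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    # A 2x2 table has 1 degree of freedom, where P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


statistic, p_value = chi2_2x2([[56, 651], [97, 1380]])
print(f"chi-squared = {statistic:.3f}, p-value = {p_value:.3f}")
```

Tools that apply a continuity correction by default to 2x2 tables will report a somewhat different p-value for the same counts.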
2
votes
2 answers

chisquare test in r that keeps row names

I'm building an employee survey with two waves, and I want to make sure that each wave is balanced in terms of some demographic variables, such as ethnicity and gender. Here is a fictitious sample of the data: library(tidyverse) sample_data <-…
J.Sabree
  • 2,280
  • 19
  • 48
2
votes
0 answers

ValueError: Input X must be non-negative in python

I am trying to apply a feature selection method using SelectKBest with chi2 to select the top 15 features from the data, but it fails with the error ValueError: Input X must be non-negative. Can anyone help me with this error? Below is the code that…
Shah
  • 21
  • 1
  • 6
2
votes
0 answers

Calculate χ² (define a function for χ² where the model is a line)

I have to calculate chi-square for given data, which contain three variables: years, fraction lost, and error. I have used this formula, where y = fraction lost, x = years, and sigma is the error; a and b are constants. fun.to.minimize <-…
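
The formula itself is not visible in the excerpt above, so the following is only a sketch of the usual weighted chi-squared for a straight-line model. It assumes the model is y = a + b·x (which constant is the intercept and which the slope is a guess) and that sigma holds the per-point errors. The function name and the data are hypothetical, and the example is in Python rather than the asker's R.

```python
def chi_squared_line(a, b, x, y, sigma):
    """Sum of squared, error-weighted residuals for the assumed model y = a + b * x."""
    return sum(
        ((yi - (a + b * xi)) / si) ** 2
        for xi, yi, si in zip(x, y, sigma)
    )


# Hypothetical data lying exactly on y = 1 + 2x, with an error of 0.5 per point.
years = [0.0, 1.0, 2.0]
fraction_lost = [1.0, 3.0, 5.0]
error = [0.5, 0.5, 0.5]

exact_fit = chi_squared_line(1.0, 2.0, years, fraction_lost, error)
shifted_fit = chi_squared_line(0.0, 2.0, years, fraction_lost, error)
print(exact_fit, shifted_fit)
```

A function of this shape can be passed to a numerical minimizer to search for the a and b that give the smallest chi-squared.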
2
votes
1 answer

SparkException: Chi-square test expect factors

I have a dataset containing 42 features and 1 label. I want to apply the chi-square selector from the Spark ML library before running a decision tree for anomaly detection, but I get this error during the application of chi…
Med Othman
  • 21
  • 2
2
votes
1 answer

Table of differences between observed and expected counts

I have data where I'm modeling a binary dependent variable. There are 5 other categorical predictor variables and I have the chi-square test for independence for each of them, vs. the dependent variable. All came up with very low p-values. Now,…
2
votes
1 answer

How to define a function in dplyr? - Adding the results of a chi-squared test

I am trying to write a function to give me a pivot table for two variables. Expanding my question here, I would like to include the p-value of a chi-square test for the relationship between the predictor and the target as well. How should I change…
Hamideh
  • 665
  • 2
  • 8
  • 20
2
votes
1 answer

Error "all entries of 'x' must be nonnegative and finite"

I'd like to find out if there's a statistically significant difference in the conversion rate between two categories. My data looks like this operating_system converted 1 Mac FALSE 2 Mac FALSE 3 Windows …
Cauder
  • 2,157
  • 4
  • 30
  • 69
2
votes
1 answer

Replication of Excel chisq.test function in Tableau with R

I'm struggling with chisq.test in Tableau via R. I have a model in Excel which I have to replicate, but my results differ. I think the problem is in correct massaging of the data in R code... These are the Excel values: p-value formula in Excel is:…
adim
  • 129
  • 1
  • 9
2
votes
1 answer

Proper proportion table and chisq.test() in R

I am still very new to R (Version 1.1.383 on Mac), so I apologize if this question is basic. I have searched for a solution on the internet and on this page but could not find any. What I want to do is to make a proportion table on which I can use…
2
votes
0 answers

Categorizing the data using pandas

I am trying to run a chi square test on a dataset and for that I need to use pd.cut() to formulate categories in the data set. However, I am getting this error ufunc 'subtract' did not contain a loop with signature…
2
votes
5 answers

R - post hoc chi squared - "fifer" package no longer supported?

I'm trying to use the package fifer with command install.packages("fifer") with R 3.5.0. However, R tells me that it is not available. This webpage tells me that it has been removed from CRAN. Alternatively is there another function / package in R…
ecjb
  • 5,169
  • 12
  • 43
  • 79
2
votes
0 answers

Calculate p-Value of ChiSquare in FileMaker

Can you help me find a formula for the p-value of ChiSquare in FileMaker? I'm not mathematically minded, so if someone could explain step by step what I have to do, I'd really appreciate it. My formula for ChiSquare is: Roudend ( Let ( [ A = E6 ; B = F6 ; C =…
armamichi
  • 21
  • 1
  • 4