Questions tagged [contingency]

A contingency table is a non-negative integer matrix with specified row and column sums, so named by Karl Pearson in developing statistical tests of significance. Observations are counted in a table with appropriate row and column labels, and statistical tests on the entries then assess how likely the observed counts would be if the row and column outcomes were independent.
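As a rough illustration of the independence test described above, the sketch below computes Pearson's chi-square statistic from a table of counts. The function name `chi_square_statistic` is made up for this example, and the code assumes every row and column sum is positive; turning the statistic into a p-value is left out.

```python
def chi_square_statistic(table):
    """Pearson's chi-square statistic for a table given as a list of rows.

    Assumes every row sum and column sum is positive, so that every
    expected count is non-zero.
    """
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column outcomes were independent.
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    print(chi_square_statistic([[10, 20], [20, 40]]))
    print(chi_square_statistic([[10, 0], [0, 10]]))
```

A table whose rows are proportional to each other gives a statistic of zero; larger values indicate a stronger departure from independence.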

Given specified row and column sums, counting the number of possible contingency tables can be a hard problem. Indeed, even the case of $2$ rows and $n$ columns is known to be #P-complete.
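For very small margins the count can still be found by exhaustive enumeration. The following is only a brute-force sketch (the name `count_tables` is hypothetical); it fills the table one column at a time and memoizes on the remaining row sums, so its running time grows quickly with the size of the margins.

```python
from functools import lru_cache


def count_tables(row_sums, col_sums):
    """Count non-negative integer matrices with the given margins.

    Brute force with memoization; only practical for small inputs.
    """
    if sum(row_sums) != sum(col_sums):
        return 0
    cols = tuple(col_sums)

    def splits(amount, remaining):
        # Yield the remaining row sums after distributing `amount`
        # over the rows of one column in every feasible way.
        if not remaining:
            if amount == 0:
                yield ()
            return
        for x in range(min(amount, remaining[0]) + 1):
            for rest in splits(amount - x, remaining[1:]):
                yield (remaining[0] - x,) + rest

    @lru_cache(maxsize=None)
    def fill(col_index, remaining):
        if col_index == len(cols):
            return 1 if all(r == 0 for r in remaining) else 0
        return sum(
            fill(col_index + 1, nxt)
            for nxt in splits(cols[col_index], remaining)
        )

    return fill(0, tuple(row_sums))


if __name__ == "__main__":
    print(count_tables([2, 2], [2, 2]))
```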

However, existence of solutions, unless otherwise constrained, is easy: it is necessary and sufficient that the row sums and the column sums add up to the same total (the balance condition).
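One way to see that the balance condition is sufficient is a greedy fill, often called the northwest corner rule. The sketch below is one possible construction, not the only one; `build_table` is a name chosen for this example, and the inputs are assumed to be non-negative integers.

```python
def build_table(row_sums, col_sums):
    """Build one non-negative integer table with the given margins.

    Greedy "northwest corner" fill; assumes non-negative integer sums.
    """
    if sum(row_sums) != sum(col_sums):
        raise ValueError("row and column sums must have equal totals")
    rows, cols = list(row_sums), list(col_sums)
    table = [[0] * len(cols) for _ in rows]
    i = j = 0
    while i < len(rows) and j < len(cols):
        # Put as much as possible into the current cell.
        x = min(rows[i], cols[j])
        table[i][j] = x
        rows[i] -= x
        cols[j] -= x
        # Move down when the row is used up, otherwise move right.
        if rows[i] == 0:
            i += 1
        else:
            j += 1
    return table


if __name__ == "__main__":
    for row in build_table([3, 2], [1, 4]):
        print(row)
```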

An example of a further constraint is requiring 0/1 entries, which gives binary contingency tables. Necessary and sufficient criteria for these restricted solutions were given by Gale and Ryser (independently) in 1957.
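The Gale-Ryser criterion is commonly stated as follows: with the row sums sorted in non-increasing order as $r_1 \ge \dots \ge r_m$, a 0/1 table exists exactly when the totals balance and $\sum_{i \le k} r_i \le \sum_j \min(c_j, k)$ for every $k$ from 1 to $m$. A minimal check along those lines, assuming non-negative integer inputs and using the made-up name `binary_table_exists`, might look like this:

```python
def binary_table_exists(row_sums, col_sums):
    """Gale-Ryser test for a 0/1 matrix with the given margins.

    Assumes non-negative integer row and column sums.
    """
    if sum(row_sums) != sum(col_sums):
        return False
    rows = sorted(row_sums, reverse=True)
    partial = 0
    for k, r in enumerate(rows, start=1):
        partial += r
        # The k largest rows can draw at most min(c, k) from each column.
        if partial > sum(min(c, k) for c in col_sums):
            return False
    return True


if __name__ == "__main__":
    print(binary_table_exists([2, 1], [2, 1]))
    print(binary_table_exists([2, 0], [2, 0]))
```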

242 questions
2
votes
2 answers

Python/ Pandas: Making a contingency table with multiple variables

My dataframe has 4 columns (one dependent variable and 3 independent). Here's a sample: My desired output is a contingency table, as follows: I can only seem to get a contingency table using one independent variable, using the following code (my…
YoungboyVBA
  • 197
  • 7
2
votes
3 answers

I want to create a 2X2 table for Chisq test from multiple levels categorical dataset

I have a dataset of race and outcome (either Y or N). I want to tabulate a 2X2 table to run a chisq test for each race. Asian 584 24; Black 1721 56; Hispanic 2400 90; White 8164 289. Once I create a table 2X2 so the first row will…
CodeRCodeP
  • 73
  • 5
2
votes
1 answer

Make contingency table from rows in R

I have a covid data frame with 376 columns and 7 rows, holding covid infection numbers for 376 different days in 7 countries. I've matched them to different severity categories and now I'm trying to make a contingency table containing the severity categories…
codingbudgie
  • 155
  • 1
  • 7
2
votes
2 answers

How to get pandas crosstab to sum up values for multiple columns?

Let's assume we have a table like: id chr val1 val2 ... A 2 10 ... B 4 20 ... A 3 30 ...and we'd like to have a contingency table like this (grouped by chr, thus using 'A' and 'B' as the…
daniel451
  • 10,626
  • 19
  • 67
  • 125
2
votes
4 answers

Table in r to be weighted

I'm trying to run a crosstab/contingency table, but need it weighted by a weighting variable. Here is some sample data. set.seed(123) sex <- sample(c("Male", "Female"), 100, replace = TRUE) age <- sample(c("0-15", "16-29", "30-44", "45+"), 100,…
H.Cheung
  • 855
  • 5
  • 12
2
votes
3 answers

Sampling from a contingency table

I've managed as far as the code below in writing a function to sample from a contingency table - proportional to the frequencies in the cells. It uses expand.grid and then table to get back to the original size table. Which works fine as long as…
maja zaloznik
  • 660
  • 9
  • 24
2
votes
2 answers

how to create a contingency table for each row of a data frame

I have a large data frame with rows as species and counts from 2 years as columns. I want to create a contingency table for each row in order to test if there was a significant change (decrease) from the first to the second year. Here is similar…
KNN
  • 459
  • 4
  • 19
2
votes
2 answers

Flip order Columns / Rows in a table

I'm using the epiR package as it does nice 2 by 2 contingency tables with odds ratios and population attributable fractions. As is common, my data is coded 0 = No, 1 = Yes. So when I do table(var_1,var_2) the output comes out as a table aligned…
mmarks
  • 1,119
  • 7
  • 14
2
votes
1 answer

Frequency table by row in R

This may be a beginner question but I can't find an answer on the webs... maybe because I'm not good at describing the problem. I want to create a frequency table of nominal data separate by rows. For example, in this matrix: x <- matrix(nrow = 3,…
randmlaber
  • 109
  • 1
  • 1
  • 6
2
votes
2 answers

Combining crosstab-pivot-groupby for Pandas dataframe

I think this is a pretty simple question but I can't find another entry in which a similar case is solved. I have a Pandas dataframe that looks like this: group1 group2 meandiff lower upper reject 0 bacc …
2
votes
0 answers

Contingency table error

I'm trying to make a contingency table in R using the Melanoma data set from the MASS package using Rcmdr. local({ .Table <- xtabs(~sex+status+ulcer, data=Melanoma) cat("\nFrequency table:\n") print(.Table) }) Every time I try to make a…
Giuseppe Minardi
  • 411
  • 4
  • 16
2
votes
3 answers

Summing Contingency Tables in R with first column as character

My sales dataset includes 3 columns: Countries, Sales Type/Method, Total Quarterly Revenue. Here's a display of the first few rows for a better idea: Retailer.country Order.method.type Qtr.Rev 1 …
RVD
  • 100
  • 1
  • 9
2
votes
3 answers

How to convert data frame to contingency table in R?

I have a simple question. How to convert a data frame into a contingency table for Fisher's Exact Test? I have data having about 19000 rows: head(data) R_T1 R_T2 NR_T1 NR_T2 GMNN 14 60 70 157 GORASP2 7 67 …
kin182
  • 393
  • 6
  • 13
2
votes
2 answers

How to write a function that transform a dataframe to another dataframe?

Suppose I have a data frame in the following form: N1 N2 N3 N4 N5 N6 1 0 0 1 0 0 0 1 0 1 0 1 1 1 1 0 0 1 0 0 0 1 1 0 1 1 0 0 0 1 I would like to write a function…
mackbox
  • 199
  • 5
2
votes
2 answers

r2dtable contingency tables are too concentrated

I am using R's r2dtable function to generate contingency tables with given marginals. However, when inspecting the resulting tables, the values look somewhat too concentrated around the midpoints. Example: set.seed(1) matrices <- r2dtable(1e4, c(100, 100),…
paljenczy
  • 4,779
  • 8
  • 33
  • 46