Questions tagged [crosstab]

A cross tab, also known as a "pivot table", is a method of summarizing data in tabular form. Typically, these have a category column in the left hand side, and multiple summary columns to the right. Each different category will have its own row. Be sure to also include the relevant language tag (python, R, sql-server etc.) on your question.

Resources:

1549 questions
3
votes
2 answers

Create a matrix with lowest cell count for every pair of binary variables

I have a dataset with several binary variables (x1-x5, values: 1, 2, NA). My goal is to identify whether pairs of binary variables have zero or very low cell counts in the cross-tab table (after ignoring the missing values). So, I would like to…
Michael Matta
  • 394
  • 2
  • 16
3
votes
2 answers

How to crosstab a pandas dataframe when one variable (column) is a list of varying length

How can I generate a crossed table from the following dataframe: import pandas as pd dat = pd.read_csv('data.txt', sep=',') dat.head(6) Factor1 Factor2 0 A X 1 B X 2 A X|Y 3 B X|Y 4 A X|Y|Z 5 …
striatum
  • 1,428
  • 3
  • 14
  • 31
3
votes
2 answers

Clean alternative to pandas' broken crosstab for NA values

I'm trying to get a function that works similarly to R's table function, for which the parameter useNA allows me to include NA values in the cross table. Here's a small example: df = pd.DataFrame({"a": [0, 1, pd.NA, pd.NA], "b":[2, pd.NA, 3,…
Uretki
  • 197
  • 9
3
votes
2 answers

Is there a simple way to convert a pandas series to a crosstab of ratios for values in the series?

Input name score bob 2 fred 4 jim 1 anne 5 Desired Output (ratio of scores: e.g. bob*fred in row 1 = 2/4, etc.) name bob fred jim anne bob 1 0.5 2 0.4 fred 2 1 4 …
cgoldie
  • 123
  • 8
3
votes
1 answer

Unable to include column if all rows are zero in pandas crosstab

In pandas crosstab, now I am getting the output as below if the other col contains all zero value: 0 0 5 1 2 But I need to get an output for the other column even if it contains all zero. 0 1 0 5 0 1 2 0 I am using below code to create cross…
DOT
  • 309
  • 2
  • 11
3
votes
4 answers

How to summarize values across multiple columns?

I have a dataframe that looks like this: 1 2 3 4 5 A B A B A C B B B B A C A B B And I would like to summarize my data in a frequency table like this: A B C 1 2 0 1 2 0 2 1 3 2 1 0 4 0 3 0 5 1 2 0 How would…
Dieu94
  • 371
  • 1
  • 11
3
votes
1 answer

Centering data in crosstab column

This picture shows a sample of what my columns look like in my crosstab (I have two column groups - in this case a date and status). The date should be centered across the 5 columns and I have set the horizontal to be centered. However, it's only…
nevets1219
  • 7,692
  • 4
  • 32
  • 47
3
votes
2 answers

Collecting series from Pandas groupby object

I am working with a dataframe of Chess results like the following Opponent Date Time Result 0 Hikaru 2020.03.02 01:22:54 1 1 Hikaru 2020.03.02 01:22:58 0.5 2 Hikaru 2020.03.03 01:18:17 1 3 Hikaru 2020.03.03 …
Mobeus Zoom
  • 598
  • 5
  • 19
3
votes
1 answer

Set up loop to run several crosstabs simultaneously in R

I am trying to find the weighted cross tabulations of several columns. My actual data has over 500 columns so I am trying to automate the process as much as possible. This is a snippet of my data: data_in <- read_table2("Q50_1 Q50_2 Q38 Q90 pov…
NewBee
  • 990
  • 1
  • 7
  • 26
3
votes
1 answer

Display missing values with Crosstable()

I'm trying to display a frequency table using CrossTable(). I want it to display the same results as: table(mydata$union, exclude=NULL) I can't find any documentation on this, but all I have is: CrossTable(mydata$union,digits=4)
Samantha V
  • 31
  • 2
3
votes
0 answers

pyspark pivot alternative with better performance

Below is my input dataset : df = spark.createDataFrame([ \ ("0","CattyCat","B2K","B"), \ ("0","CattyCat","B3L","I"), \ ("0","CattyCat","B3U","I"), \ ("0","CattyCat","D3J","C"), \ ("0","CattyCat","J1N","H"), \ …
gbzygil
  • 141
  • 4
  • 16
3
votes
3 answers

Pivot table without crosstab/tablefunc

I have a table like this INPUT id author size file_ext -------------------------------- 1 a 13661 python 1 a 13513 cpp 1 a 1211 non-code 2 b 1019 python 2 b 6881 cpp 2…
3
votes
1 answer

Sorting and deleting values while using crosstab

So I have this assignment for a pandas course and I cannot wrap my head around how one would go about doing this correctly. The assignment gives me this huge data file with all kinds of columns displaying data, it looks like this: Document Year…
Yoeril
  • 83
  • 6
3
votes
1 answer

MySQL Pivot Table

Over the past few days, I have been trying to find an answer to this problem. While I have not found an answer, this site keep appearing in my search results so I thought I would give it a try. Awesome formatting options for posts BTW. I have a…
3
votes
1 answer

Transpose single row with multiple columns into multiple rows of two columns

I have a SELECT query that works perfectly fine and it returns a single row with multiple named columns: | registered | downloaded | subscribed | requested_invoice | paid | |------------|------------|------------|-------------------|------| | 9000 …
prm
  • 120
  • 1
  • 8