Questions tagged [cumulative-frequency]

90 questions
0
votes
1 answer

Plotting a cumulative histogram with exported data in Python

I am trying to plot a cumulative histogram similar to the one shown below. It shows the number of occurrences (y-axis) of the French pronoun “vous” in a text corpus (x-axis) represented from word 0 to 92,633. It’s been created using a corpus…
Clemclem
  • 81
  • 5
0
votes
1 answer

Combine ECDF plot with histogram in secondary axis ggplot

I have one income variable. I want to make a combination plot of a histogram and cumulative distribution in one plot with two y-axes. I got this code, income<- bi_tr%>% ggplot(aes(x=`12 Income`,na.rm = TRUE))+ #this fill comment goes to define…
0
votes
1 answer

How to create a grouped cumulative frequency graph with ggplot2

I'm working with a dataset of elemental concentrations, and I want to compare the cumulative frequency graphs of elemental concentrations in two places, like I did using plot() in this image, but with ggplot. Here is a dummy…
Antón
  • 112
  • 1
  • 8
0
votes
2 answers

How to generate empirical c.d.f from a set of observations?

Assume that I have a vector x = c(1, 1, 3, 0, 4, 5, 4). I would like to ask if there is a function to generate the cdf of this data. My desired result is x c.d.f 1 0 1/7 2 1 3/7 3 3 …
Akira
  • 2,594
  • 3
  • 20
  • 45
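
As a rough illustration of what the question above asks for, here is a minimal Python sketch (the question itself uses R) that builds an empirical c.d.f. from a list of observations. The function name `empirical_cdf` is hypothetical, and exact fractions are used only so the values read like the 1/7, 3/7 shown in the excerpt.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate


def empirical_cdf(observations):
    """Return (value, cumulative proportion) pairs for each distinct value."""
    counts = Counter(observations)
    values = sorted(counts)
    # Running count of observations less than or equal to each distinct value.
    running = accumulate(counts[v] for v in values)
    n = len(observations)
    return [(v, Fraction(c, n)) for v, c in zip(values, running)]


# The vector from the question excerpt.
x = [1, 1, 3, 0, 4, 5, 4]
ecdf = empirical_cdf(x)
for value, proportion in ecdf:
    print(value, proportion)
```

The first two rows printed are `0 1/7` and `1 3/7`, which matches the part of the desired result visible in the excerpt.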
0
votes
1 answer

Cumulative histogram with bins in frequency python

I am looking for a Python function to get a cumulative frequency curve sampled at regularly spaced frequencies (y axis) rather than regularly spaced values (x axis). In this image, the dots are regularly spaced along the x axis; I would like them to be regular for y…
0
votes
1 answer

PowerBI - Running Total on Time-Independent Data Column

I was attempting to employ the formulas here to calculate a running total of a column in PowerBI. However, my data is time-independent. In addition, every other running total calculation I've seen for PowerBI has been in reference to a date field. …
Michael James
  • 492
  • 1
  • 6
  • 19
0
votes
2 answers

Cumulative count for calculating daily frequency using SQL query (in Amazon Redshift)

I have a dataset that contains 'UI' (unique id), time, and frequency (the frequency of the given value in the UI column), as shown here: I would like to add a new column named 'daily_frequency', which simply counts each unique value in the UI column for a given…
0
votes
1 answer

Finding cumulative features in dataframe?

I have a dataframe with around 200 features and 3000 rows. The samples are logged at different times, roughly one per month, as shown in “col101” in the example below: 0 col1 (id) col2. col3 …. col100 col101 (date) … …
0
votes
1 answer

Visualisation of missing-data occurrence frequency by using seaborn

I'd like to create a 24x20 matrix (8 sections, each with 60 cells, or 6x10) to visualize how often missing data occurs across cycles (each of 480 values) in a dataset held in a pandas dataframe, and plot it for each of the columns 'A', 'B', 'C'. So far I…
Mario
  • 1,631
  • 2
  • 21
  • 51
0
votes
1 answer

Finding the midpoint of three years

I have a dataset which represents the volume of sales over three years: data test; input one two three average; datalines; 10 20 30 . 20 30 40 . 10 30 50 . 10 10 10 . ; run; I'm looking for a way to find the middle point of the three years, the…
78282219
  • 593
  • 5
  • 21
0
votes
1 answer

Resetting Cumulative Figures when a new set of data appears

I have this table (minus the cuml column): ¦ Name ¦¦ website ¦¦ page ¦¦fruit type¦¦year week¦¦platform¦¦totalviews¦¦cuml¦ ¦avocado ¦¦avocado.com¦¦aboutpage¦¦ sugar ¦¦ 2001-08 ¦¦ mobile ¦¦ 18 ¦¦ 18 ¦ ¦avocado ¦¦avocado.com¦¦homepage…
VS1SQL
  • 135
  • 2
  • 13
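
A running total that restarts when a new set of rows begins, as described in the question above, might be sketched in plain Python as follows. The sample rows beyond the first are made up, and the choice of (name, page) as the key that triggers a reset is an assumption, since the excerpt is cut off before it states the grouping.

```python
from itertools import accumulate, groupby

# Hypothetical rows: (name, page, year_week, totalviews), already sorted
# so that rows belonging to the same set are adjacent.
rows = [
    ("avocado", "aboutpage", "2001-08", 18),
    ("avocado", "aboutpage", "2001-09", 7),
    ("avocado", "homepage", "2001-08", 5),
    ("avocado", "homepage", "2001-09", 10),
]


def cumulative_with_reset(rows, key):
    """Append a running total of the last field, restarting per key."""
    out = []
    for _, group in groupby(rows, key=key):
        group = list(group)
        totals = accumulate(r[-1] for r in group)
        out.extend(r + (t,) for r, t in zip(group, totals))
    return out


# Assumption: the total resets whenever (name, page) changes.
result = cumulative_with_reset(rows, key=lambda r: (r[0], r[1]))
for row in result:
    print(row)
```

`itertools.groupby` only groups consecutive rows, so the input has to be ordered by the reset key first.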
0
votes
1 answer

ggplot cumulative frequency with groups

I would like to construct cumulative count for two groups and reweight it to level 1. I know how to plot density in this case: my_df = data.frame(col_1 = sample(c(0,1), 1000, replace = TRUE), col_2 = sample(seq(1,100,by=1), 1000,…
user1700890
  • 7,144
  • 18
  • 87
  • 183
0
votes
0 answers

Visualizing difference between two distributions

I am trying to visualize the differences between two distributions (preferably using Python). I've plotted the cumulative frequency distributions as well as kernel density estimates: kde, cumulative frequency. But my audience is not used to looking…
0
votes
2 answers

Cumulative frequency for string occurrence

To start off, a little about my problem. I have a data frame of winners of the Champions League cup indexed by year. It looks like this (note that team names are strings): year team need this year team wins to date 1 team1 …
Frank Lee
  • 23
  • 3
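
The "wins to date" column described above is a running count per team. A minimal Python sketch, using a hypothetical `wins_to_date` helper and made-up winners rather than the asker's data:

```python
from collections import Counter


def wins_to_date(winners):
    """For each (year, team) row, add how many times the team has won so far."""
    seen = Counter()
    out = []
    for year, team in winners:
        seen[team] += 1
        out.append((year, team, seen[team]))
    return out


# Hypothetical sample, ordered by year.
winners = [(1, "team1"), (2, "team2"), (3, "team1"), (4, "team1")]
table = wins_to_date(winners)
for row in table:
    print(row)
```

The rows need to be in chronological order for the count to mean "to date".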
0
votes
1 answer

How to find cumulative frequency without group by in pyspark dataframe

I have a count column in a PySpark dataframe as: id Count Percent a 3 50 b 3 50 I want a result dataframe as: id Count Percent CCount CPercent a 3 50 3 50 b 3 50 6 …
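
The CCount and CPercent columns in the question above look like running totals over the whole table, with no grouping. The following is a plain Python sketch of that idea, not PySpark code; the helper name `add_running_totals` is hypothetical, and the row order is assumed to be the order in which the totals should accumulate.

```python
from itertools import accumulate

# Rows from the question excerpt: (id, Count, Percent).
rows = [("a", 3, 50), ("b", 3, 50)]


def add_running_totals(rows):
    """Append cumulative Count and cumulative Percent to every row."""
    ccount = accumulate(r[1] for r in rows)
    cpercent = accumulate(r[2] for r in rows)
    return [r + (c, p) for r, c, p in zip(rows, ccount, cpercent)]


result = add_running_totals(rows)
for row in result:
    print(row)
```

In a distributed dataframe the ordering has to be made explicit, because rows have no inherent position.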