Questions tagged [bins]

235 questions
0
votes
0 answers

Adjusting histogram in matplotlib

I am making a histogram using matplotlib. I am using integer data and I want them to represent 1 bin. The following is the sample code which I made. import matplotlib.pyplot as…
Kyle
  • 80
  • 4
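One common approach to the integer-data question above, assuming the goal is one bin per integer value, is to put the bin edges on half-integers so that each integer sits in the centre of its own bin. The sketch below is not the asker's code; the helper name `integer_bin_edges` and the sample data are made up for illustration.

```python
from collections import Counter


def integer_bin_edges(values):
    """Return edges on half-integers so every integer gets its own bin.

    Hypothetical helper, not part of matplotlib.
    """
    lo, hi = min(values), max(values)
    # Edges run from lo - 0.5 up to hi + 0.5 in steps of 1.
    return [k - 0.5 for k in range(lo, hi + 2)]


data = [1, 2, 2, 3, 3, 3, 5]  # made-up sample of integer data
edges = integer_bin_edges(data)

# Bar heights these edges would produce: one count per integer, gaps included.
counts = Counter(data)
heights = [counts.get(k, 0) for k in range(min(data), max(data) + 1)]
```

The resulting list can be passed to matplotlib as `plt.hist(data, bins=edges)`, since `bins` accepts a sequence of edges.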
0
votes
1 answer

`ggplot2::geom_histogram`: Using the Sturges method without error

Base R hist() function uses the Sturges method to calculate the optimal number of bins, unlike ggplot2::geom_histogram. There is a short tutorial showing how to replicate the Sturges method using…
rempsyc
  • 785
  • 5
  • 24
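For reference, the Sturges rule mentioned in the question above picks the number of bins as ceil(log2(n)) + 1 for n observations. A minimal sketch follows, written in Python rather than R; the function name `sturges_bins` is an assumption for illustration only.

```python
import math


def sturges_bins(n):
    """Number of bins suggested by the Sturges rule for n observations.

    Hypothetical helper name; the formula is ceil(log2(n)) + 1.
    """
    if n < 1:
        raise ValueError("need at least one observation")
    return math.ceil(math.log2(n)) + 1


k_100 = sturges_bins(100)  # log2(100) is about 6.64, so 7 + 1
k_32 = sturges_bins(32)    # log2(32) is exactly 5, so 5 + 1
```

Note that base R's `hist()` treats this count as a suggestion and then chooses "pretty" break points, so the number of bins actually drawn can differ from the formula's value. A count like this could be handed to `geom_histogram(bins = ...)`, with that caveat in mind.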
0
votes
1 answer

how to bin multiple variables for scatterplot

It's hard to determine the relationship between these variables, so I'd like to bin them. I've found advice explaining how to bin two variables, but not seven. I'm also not sure how to tailor it to my dataset. Is there a way to alter this to bin…
bandcar
  • 649
  • 4
  • 11
0
votes
1 answer

Is there a way to put running cumulative data into discrete time bins by group?

Beginning with something like this:

SUBJECT  TIME   C_INTAKE
C02      0.00   0.0
C02      48.00  0.2
C02      49.31  0.4
C02      51.20  0.4
C02      52.61  0.4
C02      55.61  0.6
C02      82.77  0.9
C02      84.97  1.1
C02      86.39  …
yt3
  • 13
  • 2
0
votes
1 answer

How to do a histogram from 2 datasets (Bin problem)

I am trying to do a histogram like the one below but I am struggling with the bins. This is my code: plt.subplots(figsize=(2, 1), dpi=400) width = 0.005 plt.xticks(((density_1.index.unique()) | set(density_2.index.unique())), rotation=90,…
Ignacio Such
  • 129
  • 1
  • 8
0
votes
1 answer

Pandas Cut Categorical Treating Nan as Additional Max Bin

I have a Pandas dataframe where I'm running the max across two binned columns. I want max to treat NaN (which I'm replacing with 'NA') as the highest possible bin. When re-categorizing the dataframe and adding this additional bin, max isn't…
Shaun
  • 81
  • 5
0
votes
2 answers

How to define a function that will check any data frame for Age column and return bins?

I am trying to define a function that will take any dataframe with an 'Age' column, bin the ages, and return how many Xs are in each age category. Consider the following: def age_range(): x = input("Enter Dataframe Name: ") df = x …
Noob3000
  • 53
  • 5
0
votes
1 answer

What is an efficient way to calculate the mean of values in the bin with maximum frequency for large number of numpy arrays?

I am looking for an efficient way to do the following calculations on millions of arrays. For the values in each array, I want to calculate the mean of the values in the bin with most frequency as demonstrated below. Some of the arrays might contain…
Omid
  • 51
  • 3
0
votes
0 answers

How to use qcut in a dataframe with conditions value from columns

I have the following scenario in a sales dataframe, each row being a distinct sale: Category Product Purchase Value | new_column_A new_column_B A C 30 | B B 50 | C A 100 …
0
votes
0 answers

Difference in bins distribution between Matplotlib & Holoviews

ALL software version info Python 3.7.4; On iMac (21.5-inch, 2017); Using IDLE. Description of expected behavior and the observed behavior Problem is: Different bins distribution between Matplotlib & Holoviews is obtained. Complete, minimal,…
mmolet
  • 1
0
votes
1 answer

Get bins range from temporary table SQL

I have a question related to my previous one. What I have is a database that looks like:

category  price  date
-------------------------
Cat1      37     2019-03
Cat2      65     2019-03
Cat3      34     2019-03
Cat1      45     2019-03
Cat2      …
0
votes
2 answers

Creating a histogram in R knowing bin heights

I'm trying to make a histogram in R in a sort of backwards manner, where I already know how many bins I want, and how many observations are in each bin. My data looks like this Interval 0-2 2-4 4-6 6-10 10-15 15-25 >25 Number of…
0
votes
1 answer

Bins for fixed interval in Longitudinal data and plotting it over the period of time by categories

This is longitudinal data: the values repeat 4 times per ID in every tick of 20 steps, and then the experiment repeats. For the dataframe below, I want bins for every tick time step for the categories of land, based on the values of X. Bins can…
Sadaf
  • 163
  • 7
0
votes
1 answer

Split range of values into bins of the same number

I have values ranging from 0.105 to 15.589 representing fold change in gene expression. I've tried splitting these into bins using df$bin <- cut(df$FC, breaks=c(seq(min(df$FC),max(df$FC),length.out = 50))) giving me 50 bins containing different…
sian
  • 77
  • 7
0
votes
1 answer

Binning pandas/numpy array in unequal sizes with approx equal computational cost

I have a problem where data must be processed across multiple cores. Let df be a Pandas DataFrameGroupBy (size()) object. Each value represents the computational "cost" each GroupBy has for the cores. How can I divide df into n bins of unequal sizes…