Questions tagged [histogram]

In statistics, a histogram is a graphical representation, showing a visual impression of the distribution of data.

A histogram is an estimate of the probability distribution of a continuous variable and was first introduced by Karl Pearson. It consists of tabular frequencies, shown as adjacent rectangles erected over discrete intervals (bins), each with an area equal to the frequency of the observations in the interval. The height of each rectangle equals the frequency density of the interval, i.e., the frequency divided by the width of the interval. The total area of the histogram is equal to the number of data points. A histogram may also be normalized to display relative frequencies. It then shows the proportion of cases that fall into each of several categories, with the total area equaling 1. The categories are usually specified as consecutive, non-overlapping intervals of a variable. The categories (intervals) must be adjacent, and are often chosen to be of the same size.
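The relationship between frequency, bin width and rectangle height can be checked with a small sketch. The helper name `frequency_density`, the sample data and the bin edges are all made up for illustration; the binning convention (half-open bins, with the last bin closed on the right) is an assumption, not something the description above prescribes.

```python
from bisect import bisect_right


def frequency_density(data, edges):
    """Return (counts, densities) for the bins defined by sorted `edges`.

    Hypothetical helper: bins are half-open [lo, hi), except the last
    bin, which also includes its right edge.
    """
    counts = [0] * (len(edges) - 1)
    for x in data:
        if x < edges[0] or x > edges[-1]:
            continue  # ignore values outside the binned range
        i = min(bisect_right(edges, x) - 1, len(counts) - 1)
        counts[i] += 1
    widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
    # Height of each rectangle = frequency / bin width
    densities = [c / w for c, w in zip(counts, widths)]
    return counts, densities


data = [1, 2, 2, 3, 5, 6, 7, 7]   # illustrative values
edges = [0, 2, 4, 8]              # unequal bin widths: 2, 2, 4
counts, densities = frequency_density(data, edges)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]

# Area of the frequency-density histogram equals the number of data points
total_area = sum(d * w for d, w in zip(densities, widths))

# Dividing the heights by the number of points normalizes the area to 1
normalized = [d / len(data) for d in densities]
normalized_area = sum(d * w for d, w in zip(normalized, widths))
```

With unequal bin widths the heights differ from the raw counts, which is why the area, not the height, carries the frequency.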

Histograms are used to plot the density of data, and often for density estimation: estimating the probability density function of the underlying variable. The total area of a histogram used for probability density is always normalized to 1. If the intervals on the x-axis all have length 1, then a histogram is identical to a relative frequency plot.

In scientific software for statistical computing and graphics, the function hist generates a histogram. It can also optionally scale the histogram so that its total area is 1. This puts it on the right scale if one wants to overlay a probability density curve.
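As a rough illustration of the area-1 scaling, the sketch below uses NumPy's `numpy.histogram` with `density=True`; the choice of NumPy and Matplotlib is an assumption, since the description above does not name a specific package. The sample size, seed and bin count are arbitrary, and `plot_overlay` is a hypothetical helper that is only needed if a plot is wanted.

```python
import math

import numpy as np

# Illustrative data: draws from a standard normal distribution
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=10_000)

# density=True scales the bar heights so that the total area is 1
heights, edges = np.histogram(samples, bins=30, density=True)
widths = np.diff(edges)
total_area = float(np.sum(heights * widths))

# Standard normal density evaluated at the bin centers, for comparison
centers = 0.5 * (edges[:-1] + edges[1:])
pdf = np.exp(-0.5 * centers**2) / math.sqrt(2.0 * math.pi)


def plot_overlay():
    """Draw the scaled histogram with the density curve on top."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it

    plt.hist(samples, bins=edges, density=True, alpha=0.5)
    plt.plot(centers, pdf)
    plt.show()
```

Because the histogram and the curve are both on a density scale, they can share the same y-axis; with raw counts the curve would appear nearly flat at the bottom of the plot.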

More about it here: histogram wiki

6663 questions
22
votes
9 answers

Matplotlib.pyplot.hist() very slow

I'm plotting about 10,000 items in an array. They have around 1,000 unique values. The plotting has been running for half an hour now. I made sure the rest of the code works. Is it that slow? This is my first time plotting histograms with pyplot.
Fenwick
  • 1,061
  • 2
  • 15
  • 28
22
votes
3 answers

Histogram from data which is already binned, I have bins and frequency values

All the matplotlib examples with hist() generate a data set, provide the data set to the hist function with some bins (possibly non-uniformly spaced) and the function automatically calculates and then plots the histogram. I already have histogram…
Daniel Farrell
  • 9,316
  • 8
  • 39
  • 62
21
votes
1 answer

R - Customizing X Axis Values in Histogram

I want to change the values on the x axis in my histogram in R. The computer currently has it set as 0, 20, 40, 60, 80, 100. I want the x axis to go by 10 as in: 0,10,20,30,40,50,60,70,80,90,100. I know to get rid of the current axis I have to…
user1094628
  • 239
  • 2
  • 4
  • 5
21
votes
1 answer

Log x-axis for histogram

I'm trying to plot a histogram with a logarithmic x axis. The code I'm currently using is as follows plt.hist(data, bins=10 ** np.linspace(0, 1, 2, 3), normed=1) plt.xscale("log") However, the x axis doesn't actually plot correctly! It just goes…
student1818
  • 221
  • 1
  • 2
  • 3
21
votes
2 answers

Set y axis limit in Pandas histogram

I am using Pandas histogram. I would like to set the y-axis range of the plot. Here is the context: import matplotlib.pyplot as plt %matplotlib inline interesting_columns = ['Level', 'Group'] for column in interesting_columns: …
mrmagicfluffyman
  • 365
  • 1
  • 2
  • 7
21
votes
3 answers

Histogram matching of two images in Python 2.x?

I'm trying to match the histograms of two images (in MATLAB this could be done using imhistmatch). Is there an equivalent function available from a standard Python library? I've looked at OpenCV, scipy, and numpy but don't see any similar…
anon01
  • 10,618
  • 8
  • 35
  • 58
21
votes
3 answers

Get data points from a histogram in Python

I made a histogram of the 'cdf' (cumulative distribution) of a function. The histogram is basically No. of counts vs. luminosity. Now, how do I extract data points from a histogram? I need actual values of Luminosities. I am using Matplotlib in…
user3014593
  • 231
  • 1
  • 2
  • 5
21
votes
1 answer

pylab histogram get rid of nan

I have a problem with making a histogram when some of my data contains "not a number" values. I can get rid of the error by using nan_to_num from numpy, but then I get a lot of zero values which mess up the histogram as…
usethedeathstar
  • 2,219
  • 1
  • 19
  • 30
21
votes
4 answers

Different breaks per facet in ggplot2 histogram

A ggplot2-challenged latticist needs help: What's the syntax to request variable per-facet breaks in a histogram? library(ggplot2) d = data.frame(x=c(rnorm(100,10,0.1),rnorm(100,20,0.1)),par=rep(letters[1:2],each=100)) # Note: breaks have different…
Dieter Menne
  • 10,076
  • 44
  • 67
21
votes
2 answers

How to force a y axis to minimum and maximum range in R?

If you look at the graph below (y axis), you will notice that the scale is from 0 to 0.20. I have other histograms where the range is from 0 to 0.4. I want to make all of them consistent from 0 to 1 and display the y axis from 0 to 1. conne <-…
Barry
  • 739
  • 1
  • 8
  • 29
20
votes
1 answer

python matplotlib imshow() custom tickmarks

I'm trying to set custom tick marks on my imshow() output, but haven't found the right combination. The script below summarizes my attempts. In this script, I'm trying to make the tickmarks at all even numbers on each axis instead of the default…
zje
  • 3,824
  • 4
  • 25
  • 31
20
votes
1 answer

Plot Histogram in Python

I have two lists, x and y. x contains the alphabet A-Z and y contains the frequency of each letter in a file. I've tried researching how to plot these values in a histogram but have had no success with understanding how to plot it. n, bins, patches =…
PythonAlex
  • 241
  • 2
  • 4
  • 8
20
votes
1 answer

Get a histogram plot of factor frequencies (summary)

I've got a factor with many different values. If you execute summary(factor) the output is a list of the different values and their frequency. Like so: A B C D 3 3 1 5 I'd like to make a histogram of the frequency values, i.e. X-axis contains the…
wds
  • 31,873
  • 11
  • 59
  • 84
20
votes
4 answers

matplotlib hist function argument density not working

plt.hist's density argument does not work. I tried to use the density argument in the plt.hist function to normalize stock returns in my plot, but it didn't work. The following code worked fine for me and gave me the probability density function…
riversxiao
  • 369
  • 1
  • 2
  • 11
20
votes
4 answers

Understanding histogram_quantile based on rate in Prometheus

According to the Prometheus documentation, in order to get the 95th percentile using a histogram metric I can use the following query: histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le)) Source:…
evgeniy44
  • 2,862
  • 7
  • 28
  • 51