Questions tagged [histogram]

In statistics, a histogram is a graphical representation of the distribution of data.

A histogram is an estimate of the probability distribution of a continuous variable and was first introduced by Karl Pearson. It consists of tabular frequencies, shown as adjacent rectangles erected over discrete intervals (bins), each with an area equal to the frequency of the observations in the interval. The height of a rectangle is therefore equal to the frequency density of the interval, i.e., the frequency divided by the width of the interval. The total area of the histogram is equal to the number of data points. A histogram may also be normalized to display relative frequencies. It then shows the proportion of cases that fall into each of several categories, with the total area equaling 1. The categories are usually specified as consecutive, non-overlapping intervals of a variable. The categories (intervals) must be adjacent, and are often chosen to be of the same size.
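As a minimal sketch of the definitions above, the following pure-Python function computes frequency densities (count divided by bin width) for bins of unequal width. The function name `histogram`, the `normalize` flag, and the sample data are illustrative assumptions, not part of any particular library.

```python
from bisect import bisect_right


def histogram(data, edges, normalize=False):
    """Return bar heights for the bins defined by the sorted list `edges`.

    Bins are half-open [edges[i], edges[i+1]); the last bin also includes
    its right edge. Heights are frequency densities (count / bin width).
    With normalize=True they are also divided by the number of binned
    observations, so that the total area is 1.
    """
    counts = [0] * (len(edges) - 1)
    for x in data:
        if x < edges[0] or x > edges[-1]:
            continue  # values outside the bin range are ignored
        i = min(bisect_right(edges, x) - 1, len(counts) - 1)
        counts[i] += 1
    scale = sum(counts) if normalize else 1
    if scale == 0:
        scale = 1  # avoid dividing by zero when nothing was binned
    widths = [b - a for a, b in zip(edges, edges[1:])]
    return [c / (w * scale) for c, w in zip(counts, widths)]


# Hypothetical sample data and deliberately unequal bins.
data = [1, 2, 2, 3, 5, 8, 9, 9.5]
edges = [0, 2, 4, 10]
widths = [b - a for a, b in zip(edges, edges[1:])]

heights = histogram(data, edges)
area = sum(h * w for h, w in zip(heights, widths))

norm_heights = histogram(data, edges, normalize=True)
norm_area = sum(h * w for h, w in zip(norm_heights, widths))

print(heights)    # frequency densities per bin
print(area)       # equals the number of observations, up to rounding
print(norm_area)  # equals 1, up to rounding
```

Because the heights are densities rather than raw counts, the wide last bin does not look taller simply because it covers more of the axis.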

Histograms are used to plot the density of data, and often for density estimation, that is, estimating the probability density function of the underlying variable. The total area of a histogram used for probability density is always normalized to 1. If the intervals on the x-axis all have length 1, then such a histogram is identical to a relative frequency plot.
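The last point can be checked with a small sketch: when every bin has width 1, the density heights coincide with the relative frequencies, and for any other width they do not. The helper names and the sample values are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import floor


def relative_frequencies(data):
    """Proportion of observations falling in each unit interval [k, k+1)."""
    counts = Counter(floor(x) for x in data)
    n = len(data)
    return {k: c / n for k, c in sorted(counts.items())}


def density_heights(data, width):
    """Density-normalized bar heights for bins [k*width, (k+1)*width).

    Keys are the left edges of the non-empty bins.
    """
    counts = Counter(floor(x / width) for x in data)
    n = len(data)
    return {k * width: c / (n * width) for k, c in sorted(counts.items())}


data = [0.2, 0.7, 1.1, 1.5, 1.9, 3.4]  # hypothetical sample

unit = density_heights(data, 1)
wide = density_heights(data, 2)

# With unit-width bins the two plots have identical bar heights.
print(unit == relative_frequencies(data))

# With width 2 the heights are halved proportions, yet the area is still 1.
print(wide, sum(h * 2 for h in wide.values()))
```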

In scientific software for statistical computing and graphics, the function hist generates a histogram. It can optionally scale the histogram so that its total area is 1, which puts it on the right scale if one wants to overlay a probability density curve.
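To illustrate why the scaling matters for an overlay, here is a rough sketch that uses only the Python standard library rather than any particular hist function. It bins simulated standard normal samples and compares the area-normalized bar heights with the normal density at the bin centres. The seed, sample size, and bin layout are arbitrary assumptions.

```python
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the sketch is reproducible
samples = [random.gauss(0.0, 1.0) for _ in range(10_000)]

lo, hi, n_bins = -3.0, 3.0, 12
width = (hi - lo) / n_bins

# Raw counts per bin; samples outside [lo, hi) are ignored.
counts = [0] * n_bins
for x in samples:
    if lo <= x < hi:
        i = min(int((x - lo) / width), n_bins - 1)
        counts[i] += 1

# Scale so that the total area of the bars is 1.
binned = sum(counts)
density = [c / (binned * width) for c in counts]

# Density curve evaluated at the bin centres, for comparison.
centres = [lo + (i + 0.5) * width for i in range(n_bins)]
pdf = [NormalDist(0.0, 1.0).pdf(c) for c in centres]

max_gap = max(abs(d - p) for d, p in zip(density, pdf))
print(max(counts))  # raw counts live on a scale of hundreds or thousands
print(max_gap)      # scaled heights stay close to the density curve
```

Raw counts depend on the sample size and bin width, so a density curve drawn over them would appear flat near zero; after scaling, both share the same vertical axis.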

More about it here: histogram wiki

6663 questions
20
votes
4 answers

Pandas histogram df.hist() group by

How to plot a histogram with pandas DataFrame.hist() using group by? I have a data frame with 5 columns: "A", "B", "C", "D" and "Group". There are two Group classes: "yes" and "no". Using df.hist(), I get the histogram for each of the 4 columns. Now I…
Hangon
20
votes
4 answers

"dvipng: not found" when creating matplotlib figure

I am trying to plot a frequency histogram with matplotlib, but it doesn't work and I don't know where the problem is... import matplotlib.pyplot as plt import matplotlib.ticker as ticker import numpy as np data = np.array([58.35, 71.83, 49.25, 38.89,…
user3601754
20
votes
2 answers

How to get data in a histogram bin

I want to get a list of the data contained in a histogram bin. I am using NumPy and Matplotlib. I know how to traverse the data and check the bin edges. However, I want to do this for a 2D histogram, and the code to do this is rather ugly. Does…
Ben
20
votes
2 answers

Gnuplot Histogram Cluster (Bar Chart) with One Line per Category

Histogram Cluster / Bar Chart I'm trying to generate the following histogram cluster out of this data file with gnuplot, where each category is represented in a separate line per year in the data file: # datafile year category …
fiedl
20
votes
2 answers

Side by Side histograms in the Same Graph in R?

This should actually be really simple but I'm having a really hard time finding a solution to this problem. I have two very simple numeric vectors in R. I am simply trying to plot a histogram with them. However I would like them to be on the same…
user2331197
20
votes
1 answer

How can I plot a histogram in pandas using nominal values?

Given: ser = Series(['one', 'two', 'three', 'two', 'two']) How do I plot a basic histogram of these values? Here is an ASCII version of what I'd want to see in matplotlib: X X X X ------------- one two three I'm tired of…
Tim Stewart
19
votes
4 answers

How do I add the mean value to a histogram in R?

I would like to plot a histogram with the mean (average) value marked on it (e.g. with a bold blue line). I tried to do it using the plot command, but even when I set the parameter add=TRUE it didn't work.
Mateusz Kędzior
19
votes
5 answers

Meaning of Histogram on Tensorboard

I am working with Google TensorBoard, and I'm confused about the meaning of the Histogram plot. I read the tutorial, but it is still unclear to me. I would really appreciate it if anyone could help me figure out the meaning of each axis for Tensorboard…
Ruofan Kong
19
votes
4 answers

Histogram with Boxplot above in Python

Hi, I want to draw a histogram with a boxplot appearing above the histogram, showing Q1, Q2 and Q3 as well as the outliers. Example phone is below. (I am using Python and Pandas.) I have checked several examples using matplotlib.pyplot but…
Isura Nirmal
19
votes
2 answers

weights option for seaborn distplot?

I'd like to have a weights option in seaborn distplot, similar to that in numpy histogram. Without this option, the only alternative would be to apply the weighting to the input array, which could result in an impractical size (and time).
nbecker
19
votes
4 answers

Python: Creating a 2D histogram from a numpy matrix

I'm new to Python. I have a NumPy matrix of dimensions 42x42, with values in the range 0-996. I want to create a 2D histogram using this data. I've been looking at tutorials, but they all seem to show how to create 2D histograms from random data…
Kestrel
19
votes
4 answers

Numpy histogram of large arrays

I have a bunch of CSV datasets, about 10 GB in size each. I'd like to generate histograms from their columns. But it seems like the only way to do this in numpy is to first load the entire column into a numpy array and then call numpy.histogram on…
pseudosudo
19
votes
2 answers

matplotlib hist() autocropping range

I am trying to make a histogram over a specific range, but the matplotlib.pyplot.hist() function keeps cropping the range to the bins with entries in them. A toy example: import numpy as np import matplotlib.pyplot as plt x =…
Keith
19
votes
5 answers

Python Matplotlib rectangular binning

I've got a series of (x, y) values that I want to plot as a 2D histogram using Python's matplotlib. Using hexbin, I get something like this: But I'm looking for something like this: Example Code: from matplotlib import pyplot as plt import…
job
19
votes
3 answers

Create a histogram for weighted values

If I have a vector (e.g., v<-runif(1000)), I can plot its histogram (which will look, more or less, as a horizontal line because v is a sample from the uniform distribution). However, suppose I have a vector and its associated weights (e.g.,…
sds