Questions tagged [histogram]

In statistics, a histogram is a graphical representation, showing a visual impression of the distribution of data.

A histogram is an estimate of the probability distribution of a continuous variable and was first introduced by Karl Pearson. It consists of tabular frequencies, shown as adjacent rectangles erected over discrete intervals (bins), each with an area equal to the frequency of the observations in the interval. The height of a rectangle is therefore equal to the frequency density of the interval, i.e., the frequency divided by the width of the interval. The total area of the histogram is equal to the number of data points. A histogram may also be normalized to display relative frequencies. It then shows the proportion of cases that fall into each of several categories, with the total area equaling 1. The categories are usually specified as consecutive, non-overlapping intervals of a variable. The categories (intervals) must be adjacent and are often chosen to be of the same size.
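
The relationship between frequency, bin width, and bar height can be checked numerically. The following is a minimal sketch, assuming Python with NumPy; the sample values and the unequal bin edges are made up purely for illustration.

```python
import numpy as np

# Hypothetical sample and unequal-width bin edges, chosen only for illustration.
data = np.array([0.5, 1.2, 1.7, 2.1, 2.4, 2.8, 3.5, 4.1, 4.6, 7.9])
edges = np.array([0.0, 2.0, 3.0, 5.0, 8.0])

# Frequency (count of observations) in each bin.
counts, _ = np.histogram(data, bins=edges)

# Bar height is the frequency density: frequency divided by bin width.
widths = np.diff(edges)
heights = counts / widths

# Each bar's area (height * width) recovers the bin frequency,
# so the total area equals the number of observations.
total_area = float(np.sum(heights * widths))
```

With unequal bins the bar heights differ from the raw counts, but the summed area still matches the sample size.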

Histograms are used to plot the density of data, and often for density estimation, that is, estimating the probability density function of the underlying variable. The total area of a histogram used for probability density is always normalized to 1. If the widths of the intervals on the x-axis are all 1, then the histogram is identical to a relative frequency plot.

In scientific software for statistical computing and graphics, the function hist generates a histogram. It can optionally scale the histogram so that its total area is 1, which puts it on the right scale if one wants to overlay a probability density curve.
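
As a rough illustration of the area-1 scaling, here is a sketch that uses NumPy's np.histogram with density=True as a stand-in for the hist function mentioned above (that substitution, the random sample, and the bin choices are all assumptions made for this example).

```python
import numpy as np

# Hypothetical non-standardized sample (mean 10, standard deviation 2).
rng = np.random.default_rng(0)
sample = rng.normal(loc=10.0, scale=2.0, size=1000)

# density=True rescales bar heights so that the total area is 1.
density, edges = np.histogram(sample, bins=20, density=True)
widths = np.diff(edges)
area = float(np.sum(density * widths))

# Relative frequency per bin is height * width; these proportions sum to 1.
rel_freq = density * widths

# With unit-width bins, the density heights coincide with relative frequencies.
unit_edges = np.arange(np.floor(sample.min()), np.ceil(sample.max()) + 1)
density_unit, _ = np.histogram(sample, bins=unit_edges, density=True)
counts_unit, _ = np.histogram(sample, bins=unit_edges)
rel_freq_unit = counts_unit / counts_unit.sum()
```

Because the heights are on the density scale, a probability density curve evaluated over the same x-range can be drawn over the bars without further rescaling.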

More about it here: histogram wiki

6663 questions
14 votes, 1 answer

ggplot2 histogram with density curve that sums to 1

Plotting a histogram with a density curve that sums to 1 for non-standardized data is ridiculously difficult. There are many questions already about this, but none of their solutions work for my data. There needs to be a simple solution that just…
CoderGuy123
14 votes, 1 answer

Histogram with weights in R

I need to plot a weighted histogram of density rather than frequency. I know that freq = FALSE is available in hist() but you can't specify weights. In ggplot2 I can do this: library(ggplot2) w <- seq(1,1000) w <-w/sum(w) v <- sort(runif(1000)) foo…
heinheo
14 votes, 3 answers

R histogram with multiple populations

I'm interested in creating a histogram in R that will contain two (or more) populations on top of each other, meaning I don't want two histograms sharing the same graph but a bar containing two or more colors. Found the image below - this is what…
Adi
14 votes, 4 answers

Error using cv2.equalizeHist

I am trying to equalize the histogram of a gray level image using the following code: import cv2 im = cv2.imread("myimage.png") eq = cv2.equalizeHist(im) The following exception is raised: error: (-215) CV_ARE_SIZES_EQ(src, dst) &&…
Shan
14 votes, 2 answers

How to make variable bar widths in ggplot2 not overlap or gap

geom_bar seems to work best when it has fixed width bars - even the spaces between bars seem to be determined by width, according to the documentation. When you have variable widths, however, it does not respond as I would expect, leading to…
RobinLovelace
14 votes, 4 answers

Any way to create histogram with matplotlib.pyplot without plotting the histogram?

I am using matplotlib.pyplot to create histograms. I'm not actually interested in the plots of these histograms, but interested in the frequencies and bins (I know I can write my own code to do this, but would prefer to use this package). I know I…
user1175720
14 votes, 3 answers

get bins coordinates with hexbin in matplotlib

I use matplotlib's method hexbin to compute 2d histograms on my data. But I would like to get the coordinates of the centers of the hexagons in order to further process the results. I got the values using get_array() method on the result, but I…
user1151446
13 votes, 2 answers

Multiple histograms in ggplot2

Here is a short part of my data: dat <-structure(list(sex = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("male", "female"), class = "factor"), A = c(1, 2, 0, 2, 1, 2, 2, 0, 2, 0, 1, 2, 2,…
Sacha Epskamp
13 votes, 2 answers

Percentage histogram with facet_wrap

I am trying to combine a percentage histogram with facet_wrap, but the percentages are calculated from all the data rather than per group. I would like each histogram to show the distribution within a group, not relative to the whole population. I know it is possible to do…
AAAA
13 votes, 3 answers

Smoothed 2D histogram using matplotlib and imshow

I try to do a 2D histogram plot and to obtain a "smooth" picture by a sort of interpolation. Thus I do the following combining plt.hist2d and plt.imshow import matplotlib.pyplot as plt import numpy as np data = np.loadtxt("parametre_optMC.dat",…
Ger
13 votes, 2 answers

Looking for a Histogram Binning algorithm for decimal data

I need to generate bins for the purposes of calculating a histogram. Language is C#. Basically I need to take in an array of decimal numbers and generate a histogram plot out of those. Haven't been able to find a decent library to do this…
Jay Stevens
13 votes, 1 answer

Fill histograms (array reduction) in parallel with OpenMP without using a critical section

I would like to fill histograms in parallel using OpenMP. I have come up with two different methods of doing this with OpenMP in C/C++. The first method proccess_data_v1 makes a private histogram variable hist_private for each thread, fills them in…
user2088790
13 votes, 3 answers

How to compute "EMD" for 2 numpy arrays i.e "histogram" using opencv?

Since I'm new to opencv, I don't know how to use the cv.CalcEMD2 function with numpy arrays. I have two arrays: a=[1,2,3,4,5] b=[1,2,3,4] How can I transfer numpy array to CVhistogram and from Cvhistogram to the function parameter signature? I…
Someone Someoneelse
13 votes, 2 answers

Adding key legend to multi-histogram plot in R

How do I add a key legend to the plot below? I wish to have a key legend somewhere in the upper right corner with two short horizontal color bars, where the red one should say "Plastic surgery gone wrong" and the blue one should say "Germany". I…
TMOTTM
13 votes, 2 answers

add horizontal line histogram gnuplot

I would like to add a horizontal line in my histogram in gnuplot, is that possible? My histogram has on the x axis: alea1 alea 2 alea3 nalea1 nalea 2 nalea 3 and the y axis goes from 0 to 25. At 22, I want to add a horizontal line that goes all the…