Histogram function

Nikolai Shokhirev

July 15, 2012


Histogram functions [1] are widely used in probability density function (PDF) estimation [2].


A histogram is a piecewise constant function: \begin{equation} f\left(x,\vec{p}\right)=\sum_{m=0}^{M-1}p_{m}\Pi_{m}(x)\,,\: a\leq x\leq b\label{eq:hist_def} \end{equation} Here \begin{equation} \Pi_{m}(x)=\Pi^{(h)}(x-a-mh)\label{eq:Pi-func} \end{equation} where \begin{equation} \Pi^{(h)}(x)=\begin{cases} 0 , & x \lt 0 \\ 1 , & 0 \le x \lt h \\ 0 , & h \le x \end{cases} \end{equation} is a rectangular function of width (bin size) $h$ and \begin{equation} h=\frac{b-a}{M} \end{equation} so that the $m$-th bin is the interval $[a+mh,\,a+(m+1)h)$. The normalization condition \begin{equation} \mathcal{N}(\vec{p})=\intop_{a}^{b}f\left(x,\vec{p}\right)dx=1\label{eq:norm-cond} \end{equation} reduces to \begin{equation} \mathcal{N}(\vec{p})=h\sum_{m=0}^{M-1}p_{m}=1\label{eq:norm} \end{equation} It is also required that $p_{m}\geq0$ for all $m$ if (\ref{eq:hist_def}) represents a probability density function.


Note that (\ref{eq:hist_def}) is linear, and therefore smooth, in the parameters $p_{m}$, with derivatives \begin{equation} \frac{\partial}{\partial p_{m}}f\left(x,\vec{p}\right)=\Pi_{m}(x)\label{eq:hist_deriv} \end{equation} This property is used in PDF fitting. Integrating $\Pi_{m}$ against any integrable function $y(x)$ gives $h$ times the average value of $y$ over the $m$-th interval: \begin{equation} \intop_{a}^{b}\Pi_{m}(x)y(x)dx=\intop_{a+mh}^{a+(m+1)h}y(x)dx=h\left\langle \, y\,\right\rangle _{m}\label{eq:avg} \end{equation} The $\Pi$ functions are orthogonal: \begin{equation} \Pi_{m}(x)\Pi_{k}(x)=\delta_{m,k}\Pi_{m}(x)\label{eq:prod} \end{equation} and \begin{equation} \intop_{a}^{b}\Pi_{m}(x)\Pi_{k}(x)dx=h\delta_{m,k}\label{eq:ort} \end{equation}
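
The definitions above can be illustrated with a minimal Python sketch. It assumes that bin $m$ covers the half-open interval $[a+mh,\,a+(m+1)h)$; the function names (bin_index, hist_value, norm, normalize, bin_integral) and the midpoint-rule integration are illustrative choices, not part of the original text.

```python
def bin_index(x, a, b, M):
    """Index m of the bin containing x, or None if x lies outside [a, b)."""
    if not (a <= x < b):
        return None
    h = (b - a) / M
    # min() guards against rounding pushing the index up to M
    return min(int((x - a) / h), M - 1)


def hist_value(x, p, a, b):
    """f(x, p) = sum_m p[m] * Pi_m(x); at most one term is nonzero."""
    m = bin_index(x, a, b, len(p))
    return 0.0 if m is None else p[m]


def norm(p, a, b):
    """N(p) = h * sum_m p[m] for equal-width bins."""
    h = (b - a) / len(p)
    return h * sum(p)


def normalize(p, a, b):
    """Rescale the heights p so that N(p) = 1."""
    n = norm(p, a, b)
    return [pm / n for pm in p]


def bin_integral(y, m, a, b, M, steps=1000):
    """Midpoint-rule estimate of the integral of Pi_m(x) * y(x) over [a, b],
    which equals h times the average of y over bin m."""
    h = (b - a) / M
    lo = a + m * h
    dx = h / steps
    return sum(y(lo + (k + 0.5) * dx) for k in range(steps)) * dx


# Example: M = 4 bins on [1, 3], so h = 0.5
a, b = 1.0, 3.0
p = normalize([1.0, 3.0, 4.0, 2.0], a, b)
total = norm(p, a, b)                                  # close to 1
value = hist_value(1.6, p, a, b)                       # height of bin 1
avg_part = bin_integral(lambda x: x, 2, a, b, len(p))  # h * <x> over bin 2
```

Since only one $\Pi_{m}$ is nonzero at any $x$, evaluating the histogram reduces to a single index lookup rather than a sum over all $M$ terms.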


In Eq. (\ref{eq:hist_def}) we can relax the requirement that all $\Pi$-functions have equal width. The uniform-width definition of $\Pi_{m}$ is replaced with \begin{equation} \Pi_{m}(x)=\begin{cases} 0, & x \lt x_{m}\\ 1, & x_{m}\leq x \lt x_{m+1}\\ 0, & x_{m+1}\leq x \end{cases} \end{equation} Here \begin{equation} a=x_{0}\lt x_{1} \lt \cdots \lt x_{M-1}\lt x_{M}=b \end{equation} and the widths are \begin{equation} h_{m}=x_{m+1}-x_{m} \end{equation} Equations (\ref{eq:hist_def}), (\ref{eq:hist_deriv}) and (\ref{eq:prod}) remain unchanged. Eq. (\ref{eq:norm}) becomes \begin{equation} \mathcal{N}(\vec{p})=\sum_{m=0}^{M-1}h_{m}\,p_{m}=1\label{eq:norm-1} \end{equation} In Eqs. (\ref{eq:avg}) and (\ref{eq:ort}) $h$ should be replaced with $h_{m}$.
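
For variable-width bins, the lookup of the active bin can be sketched with a binary search over the edges $x_{0},\ldots,x_{M}$. This is a minimal illustration in Python; the names hist_value_edges and norm_edges and the example numbers are hypothetical, and the edges are assumed to be strictly increasing.

```python
from bisect import bisect_right


def hist_value_edges(x, p, edges):
    """f(x, p) for bins [edges[m], edges[m+1]); len(edges) == len(p) + 1."""
    if not (edges[0] <= x < edges[-1]):
        return 0.0
    # bisect_right returns the number of edges <= x, so subtracting 1
    # gives the index m with edges[m] <= x < edges[m+1]
    m = bisect_right(edges, x) - 1
    return p[m]


def norm_edges(p, edges):
    """N(p) = sum_m h_m * p[m] with h_m = edges[m+1] - edges[m]."""
    return sum((edges[m + 1] - edges[m]) * p[m] for m in range(len(p)))


# Example: three bins of widths 1.0, 0.5 and 2.5 on [0, 4]
edges = [0.0, 1.0, 1.5, 4.0]
p = [0.2, 0.4, 0.24]
total = norm_edges(p, edges)               # close to 1
left = hist_value_edges(0.99, p, edges)    # falls in bin 0
at_edge = hist_value_edges(1.0, p, edges)  # left edge belongs to bin 1
```

The binary search keeps the evaluation cost at $O(\log M)$ per point, whereas the equal-width case needs only a single division.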

Histogram Bin-width Optimization

See the links below.


  1. Histogram, Wikipedia. Also covers bin size selection.
  2. Density estimation.
  3. Histogram Bin-width Optimization.
  4. A Method for Selecting the Bin Size of a Time Histogram.
  5. A recipe for optimizing a time-histogram.
  6. Data-Based Choice of Histogram Bin Width.
  7. Selecting the Number of Bins in a Histogram: A Decision Theoretic Approach.
  8. Multiresolution Histograms and their Use for Texture Classification.
  9. A Fast Implementation of Adaptive Histogram Equalization.
  10. On Selecting The Number Of Bins For A Histogram.
  11. Dynamic Histograms: Capturing Evolving Data Sets.
  12. REHIST: Relative Error Histogram Construction Algorithms.
  13. The problem with Sturges' rule for constructing histograms.
  14. Maximizing the entropy of histogram bar heights to explore neural activity: a simulation study on auditory and tactile fibers.

© Nikolai Shokhirev, 2012-2017

email: nikolai(dot)shokhirev(at)gmail(dot)com