Histogram

class histpy.Histogram(edges, contents=None, sumw2=None, labels=None, axis_scale=None, sparse=None, unit=None)[source]

Bases: object

This is a wrapper of a numpy array with axes and a fill method. Sparse arrays from pydata’s sparse package are also supported.

Like an array, the histogram can have an arbitrary number of dimensions.

Standard numpy array indexing is supported to get the contents (e.g. h[:], h[4], h[[1,3,4]], h[:,5:50:2], etc.). However, the meaning of the -1 index is different: instead of counting from the end, -1 corresponds to the underflow bin. Similarly, an index equal to the number of bins corresponds to the overflow bin.

You can, however, give positions relative to h.end, e.g. h[0:h.end] results in all regular bins, h[-1:h.end+1] also includes the underflow/overflow bins, and h[h.end] gives you the contents of the overflow bin. The convenient aliases h.uf = -1, h.of = h.end and h.all = slice(-1,h.end+1) are provided.

You can also use an Ellipsis object (...) at the end to specify that the contents from the rest of the dimensions are to have the under and overflow bins included, e.g. for a 3D histogram h[1,-1:h.end+1,-1:h.end+1] is equivalent to h[1,...]. h[:] returns all contents without under/overflow bins and h[...] returns everything, including those special bins.
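The index convention above can be pictured as a plain array padded with one extra slot at each end. The following is a minimal numpy sketch of that idea, not histpy's implementation; the helper to_padded and the variable end (standing in for h.end) are hypothetical names introduced only for illustration.

```python
import numpy as np

nbins = 5
# Assumed padded layout: [underflow, bin 0, ..., bin 4, overflow]
full = np.array([7, 10, 11, 12, 13, 14, 9])

def to_padded(index, nbins):
    """Translate a histogram-style index or slice into an index of the padded array."""
    if isinstance(index, slice):
        start = 0 if index.start is None else index.start
        stop = nbins if index.stop is None else index.stop
        return slice(start + 1, stop + 1, index.step)
    return index + 1

end = nbins  # plays the role of h.end in the text

underflow = full[to_padded(-1, nbins)]                   # like h[-1] or h[h.uf]
overflow = full[to_padded(end, nbins)]                   # like h[h.end] or h[h.of]
regular = full[to_padded(slice(None), nbins)]            # like h[:]
everything = full[to_padded(slice(-1, end + 1), nbins)]  # like h[-1:h.end+1]
```

Shifting every index by one is what lets -1 address the underflow slot rather than the last element.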

If sumw2 is not None, then the histogram will keep track of the sum of the weights squared (i.e. you should use this if you are using weighted data and are concerned about error bars). You can access these with h.sumw2[item], where item is interpreted the same way as in h[item]. h.bin_error[item] returns sqrt(sumw2) (or sqrt(contents) if sumw2 was not specified).
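To make the role of sumw2 concrete, here is a small sketch that computes the same quantities with numpy.histogram instead of histpy. The edges, values and weights are made-up sample data, and under/overflow bins are ignored for brevity.

```python
import numpy as np

edges = np.array([0.0, 1.0, 2.0, 3.0])
values = np.array([0.5, 0.7, 1.5, 2.2, 2.9])
weights = np.array([2.0, 1.0, 0.5, 3.0, 4.0])

# Bin contents are the sum of weights falling in each bin
contents, _ = np.histogram(values, bins=edges, weights=weights)

# sumw2 is accumulated the same way, but with the squared weights
sumw2, _ = np.histogram(values, bins=edges, weights=weights**2)

# Error per bin: sqrt of the sum of squared weights
bin_error = np.sqrt(sumw2)

# Without sumw2 the only available estimate is sqrt(contents),
# which is only appropriate for unweighted counts
poisson_error = np.sqrt(contents)
```

For the first bin the two estimates differ (sqrt(5) versus sqrt(3)), which is why tracking sumw2 matters for weighted data.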

The operators +, -, * and / are available. The other operand can be a histogram, a scalar or an array of appropriate size. Note that h += h0 is more efficient than h = h + h0 since the latter involves the instantiation of a new histogram.
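The efficiency remark follows the usual Python distinction between in-place and binary operators. The sketch below shows it with plain numpy arrays standing in for histograms; it illustrates the general mechanism and is an analogy, not a measurement of histpy itself.

```python
import numpy as np

h = np.array([1.0, 2.0, 3.0])
h0 = np.array([10.0, 20.0, 30.0])

original = h
h += h0                         # updates the existing object
in_place_reuses = h is original

h = h + h0                      # builds a new object and rebinds the name
binary_reuses = h is original
```

Here in_place_reuses is True and binary_reuses is False: the second form allocates a fresh result each time.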

Parameters:
  • edges (Axes or array) – Definition of bin edges. Anything that can be processed by Axes. The lower edge value is included in the bin; the upper edge value is excluded.

  • contents (array or SparseArray) – Initialization of histogram contents. May or may not include under/overflow bins. Initialized to 0 by default.

  • sumw2 (None, bool or array) – If True, it will keep track of the sum of the weights squared. You can also initialize them with an array.

  • labels (array of str) – Optionally label the axes for easier indexing

  • axis_scale (str or array) – Bin center mode e.g. “linear” or “log”. See Axis.axis_scale.

  • sparse (bool) – Initialize an empty sparse histogram. Only relevant if no contents are provided.

to(unit, equivalencies=[], update=True, copy=True)[source]

Convert to other units.

Parameters:
  • unit (unit-like) – Unit to convert to.

  • equivalencies (list or tuple) – A list of equivalence pairs to try if the units are not directly convertible.

  • update (bool) – If update is False, only the units will be changed without updating the contents accordingly

  • copy (bool) – If True (default), then the value is copied. Otherwise, a copy will only be made if necessary.

property is_sparse

Return True if the underlying histogram contents are held in a sparse array, False if they are held in a dense array.

to_dense()[source]

Return a dense representation of a sparse histogram

to_sparse()[source]

Return a sparse representation of a histogram.

todense()

Return a dense representation of a sparse histogram

classmethod concatenate(edges, histograms, label=None)[source]

Generate a Histogram from a list of histograms. The axes of all input histograms must be equal, and the new histogram will have one more dimension than the input. The new axis has index 0.

Parameters:
  • edges (Axes or array) – Definition of bin edges of the new dimension

  • histograms (list of Histogram) – List of histograms used to fill the contents. May or may not include under/overflow bins.

  • label (str) – Label of the new dimension

Returns:

Histogram
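As a rough picture of what concatenation does to the contents, the sketch below stacks two 1D content arrays along a new leading axis with numpy.stack. It is an analogy using plain arrays, not histpy's implementation, and the sample values and new_edges are invented.

```python
import numpy as np

# Contents of two 1D histograms that share the same axis
contents_a = np.array([1, 2, 3])
contents_b = np.array([4, 5, 6])

# One bin of the new dimension per input histogram, so N inputs need N+1 edges
new_edges = np.array([0.0, 10.0, 20.0])

# The new dimension is placed first (index 0)
stacked = np.stack([contents_a, contents_b], axis=0)
```

The result has shape (2, 3): one more dimension than the inputs, with the new axis at index 0.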

clear()[source]

Set all counts to 0

property contents

Equivalent to h[:]. Does not include under and overflow bins.

property full_contents

Equivalent to h[...]. Includes all under and overflow bins.

property axes

Underlying axes object

property axis

Equivalent to self.axes[0], but fails if ndim > 1

interp(*values)[source]

Get a linearly interpolated content for a given set of values along each axis. The bin contents are assigned to the center of the bin.

Parameters:

values (float or array) – Coordinates within the axes to interpolate. Must have the same size as ndim. Input values as (1,2,3) or ([1,2,3])

Returns:

float
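For a 1D histogram with linear bin centers, the interpolation described above can be sketched with numpy.interp. This illustrates the idea only: the sample data is invented, and the handling of values outside the first and last bin centers (numpy.interp clamps them) is not necessarily what histpy does.

```python
import numpy as np

edges = np.array([0.0, 1.0, 2.0, 3.0])
contents = np.array([10.0, 20.0, 40.0])

# Contents are attached to the bin centers: 0.5, 1.5, 2.5
centers = 0.5 * (edges[:-1] + edges[1:])

at_center = np.interp(1.5, centers, contents)  # exactly the content of bin 1
between = np.interp(1.0, centers, contents)    # halfway between bins 0 and 1
```

Here at_center is 20.0 and between is 15.0.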

fill(*values, weight=None)[source]

Add an entry to the histogram. Can be weighted.

Follows the same convention as find_bin()

Parameters:
  • values (float or array) – Value of entry

  • weight (float) – Value weight in histogram. Defaults to 1 in whatever units the histogram has

Note

Note that weight needs to be specified explicitly by keyword; otherwise it will be considered a value and an IndexError will be raised.
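The binning convention (lower edge included, upper edge excluded, out-of-range values sent to the under/overflow bins) can be sketched for one dimension with numpy.searchsorted. The functions find_bin and fill below are simplified stand-ins written for this example, not histpy's own code, and the padded layout is an assumption.

```python
import numpy as np

edges = np.array([0.0, 1.0, 2.0, 3.0])
nbins = len(edges) - 1
full = np.zeros(nbins + 2)  # assumed layout: [underflow, regular bins..., overflow]

def find_bin(value):
    """Return -1 for underflow, 0..nbins-1 for regular bins, nbins for overflow."""
    return int(np.searchsorted(edges, value, side="right")) - 1

def fill(value, weight=1.0):
    """Add weight to the bin containing value (shifted by one for the underflow slot)."""
    full[find_bin(value) + 1] += weight

fill(0.0)              # lower edge is inclusive: lands in bin 0
fill(1.0, weight=2.5)  # shared edge belongs to the upper bin: lands in bin 1
fill(-4.0)             # below the first edge: underflow
fill(3.0)              # upper edge is exclusive: overflow
```

After these calls the padded array holds [1.0, 1.0, 2.5, 0.0, 1.0].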

project(*axis)[source]

Return a histogram consisting of a projection of the current one

Parameters:

axis (int or str or list) – Axis or axes onto which the histogram will be projected, i.e. the contents are summed over the other dimensions. The axes of the new histogram follow the order in which they are given, so this can also be used to transpose axes.

Returns:

Projected histogram

Return type:

Histogram
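In terms of the contents array, a projection amounts to summing over the axes that are not kept and then ordering the remaining axes as requested. The sketch below shows this with numpy on made-up data; the function project is a simplified stand-in that only accepts integer axes and ignores under/overflow bins and labels.

```python
import numpy as np

contents = np.arange(24).reshape(2, 3, 4)  # a 3D histogram's regular bins

def project(contents, *axes):
    """Sum over every axis not listed, then order the kept axes as requested."""
    drop = tuple(a for a in range(contents.ndim) if a not in axes)
    summed = contents.sum(axis=drop)
    kept = [a for a in range(contents.ndim) if a in axes]
    order = [kept.index(a) for a in axes]
    return summed.transpose(order)

onto_2 = project(contents, 2)       # 1D result, shape (4,)
onto_2_0 = project(contents, 2, 0)  # 2D result with axes swapped, shape (4, 2)
```

Requesting (2, 0) rather than (0, 2) yields the transposed result, matching the note about axis order.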

clear_overflow(axes=None)[source]

Set all overflow bins to 0, including sumw2

Parameters:

axes (None or array) – Axes number or labels. All by default

clear_underflow(axes=None)[source]

Set all underflow bins to 0, including sumw2

Parameters:

axes (None or array) – Axes number or labels. All by default

clear_underflow_and_overflow(*args, **kwargs)[source]

Set all underflow and overflow bins to 0, including sumw2

Parameters:

axes (None or array) – Axes number or labels. All by default

expand_dims(*args, **kwargs)[source]

Same as h.axes.expand_dims().

broadcast(*args, **kwargs)[source]

Same as h.axes.broadcast().

expand_dict(*args, **kwargs)[source]

Same as h.axes.expand_dict().

rebin(*ngroup)[source]

Rebin the histogram by grouping multiple bins into a single one.

Parameters:

ngroup (int or array) – Number of bins that will be combined for each bin of the output. Can be a single number or a different number per axis. A number <0 indicates that the bins will start to be combined starting from the last one.

Returns:

Histogram
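For a single axis and a positive ngroup, rebinning can be sketched as summing consecutive groups of bins and keeping every ngroup-th edge. The function below is a simplified numpy stand-in with invented sample data. It assumes the number of bins is a multiple of ngroup and does not cover negative ngroup or leftover bins.

```python
import numpy as np

edges = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
contents = np.array([1, 2, 3, 4, 5, 6])

def rebin(contents, edges, ngroup):
    """Merge every ngroup consecutive bins into one (evenly divisible case only)."""
    if len(contents) % ngroup != 0:
        raise ValueError("this sketch only handles evenly divisible bin counts")
    merged = contents.reshape(-1, ngroup).sum(axis=1)
    return merged, edges[::ngroup]

new_contents, new_edges = rebin(contents, edges, 2)
```

With ngroup=2, the six bins collapse into three with contents [3, 7, 11] and edges [0, 2, 4, 6].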

plot(ax=None, errorbars=None, colorbar=True, label_axes=True, **kwargs)[source]

Quick plot of the histogram contents.

Under/overflow bins are not included. Only 1D and 2D histograms are supported.

Histograms with a HealpixAxis will automatically be plotted as a map, passing all kwargs to mhealpy’s HealpixMap.plot()

Parameters:
  • ax (matplotlib.axes) – Axes on which to draw the histogram. A new one will be created by default.

  • errorbars (bool or None) – Include error bars for 1D histograms. The default is to plot them if sumw2 is available

  • colorbar (bool) – Draw colorbar in 2D plots

  • label_axes (bool) – Label plot axes. Histogram axes must be labeled.

  • **kwargs – Passed to matplotlib.errorbar() (1D) or matplotlib.pcolormesh (2D)

draw(ax=None, errorbars=None, colorbar=True, label_axes=True, **kwargs)

Quick plot of the histogram contents.

Under/overflow bins are not included. Only 1D and 2D histograms are supported.

Histograms with a HealpixAxis will automatically be plotted as a map, passing all kwargs to mhealpy’s HealpixMap.plot()

Parameters:
  • ax (matplotlib.axes) – Axes on which to draw the histogram. A new one will be created by default.

  • errorbars (bool or None) – Include error bars for 1D histograms. The default is to plot them if sumw2 is available

  • colorbar (bool) – Draw colorbar in 2D plots

  • label_axes (bool) – Label plot axes. Histogram axes must be labeled.

  • **kwargs – Passed to matplotlib.errorbar() (1D) or matplotlib.pcolormesh (2D)

fit(f, lo_lim=None, hi_lim=None, **kwargs)[source]

Fit histogram data using least squares.

This is a convenience call to scipy.optimize.curve_fit. Sigma corresponds to the output of h.bin_error. Empty bins (i.e. bins whose error equals 0) are ignored.

Parameters:
  • f (callable) – Function f(x, ...) that takes the independent variable x as first argument, followed by the parameters to be fitted. For a k-dimensional histogram it should handle arrays of shape (k,) or (k,N).

  • lo_lim (float or array) – Low axis limit to fit. One value per axis.

  • hi_lim (float or array) – High axis limit to fit. One value per axis.

  • **kwargs – Passed to scipy.optimize.curve_fit
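The description above suggests roughly the following steps for a 1D histogram: take the bin errors as sigma, drop bins with zero error, and call scipy.optimize.curve_fit. This is a sketch of that recipe with invented data. It assumes the function is evaluated at the bin centers and that sqrt(contents) is the error, and it is not histpy's actual code.

```python
import numpy as np
from scipy.optimize import curve_fit

edges = np.linspace(0.0, 5.0, 6)
centers = 0.5 * (edges[:-1] + edges[1:])         # assumed evaluation points
contents = np.array([3.0, 5.0, 0.0, 9.0, 11.0])  # the third bin is empty
errors = np.sqrt(contents)                       # what bin_error gives without sumw2

def line(x, a, b):
    """Model: independent variable first, then the parameters to fit."""
    return a * x + b

# Bins with zero error are excluded; they would otherwise get infinite weight
mask = errors > 0
popt, pcov = curve_fit(line, centers[mask], contents[mask], sigma=errors[mask])
```

The four remaining points lie exactly on y = 2x + 2, so popt comes out close to (2, 2).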

write(filename, name='hist', overwrite=False)[source]

Write histogram to disk.

It will be saved as a group in an HDF5 file. If the file already exists, the group is appended to it.

Parameters:
  • filename (str) – Path to file

  • name (str) – Name of group to save histogram (can be any HDF5 path)

  • overwrite (bool) – Delete and overwrite the group if it already exists.

classmethod open(filename, name='hist')[source]

Read histogram from disk.

Parameters:
  • filename (str) – Path to file

  • name (str) – Name of group where the histogram was saved.