Histogram
- class histpy.Histogram(edges, contents=None, sumw2=None, labels=None, axis_scale=None, sparse=None, unit=None, track_overflow=None, dtype=None, copy_contents=True)[source]
Bases: object
This is a wrapper of a numpy array with axes and a fill method. Sparse arrays from pydata’s sparse package are also supported.
Like an array, the histogram can have an arbitrary number of dimensions.
Standard numpy array indexing is supported to get the contents, i.e. h[:], h[4], h[[1,3,4]], h[:,5:50:2], etc. However, the meaning of the -1 index is different. Instead of counting from the end, -1 corresponds to the underflow bin. Similarly, an index equal to the number of bins corresponds to the overflow bin.
You can however give a relative position with respect to h.end, e.g. h[0:h.end] results in all regular bins, h[-1:h.end+1] also includes the underflow/overflow bins, and h[h.end] gives you the contents of the overflow bin. The convenient aliases h.uf = -1, h.of = h.end and h.all = slice(-1,h.end+1) (or slice(0,h.end) if the under/overflow are not tracked) are provided.
You can also use an Ellipsis object (...) at the end to specify that the contents from the rest of the dimensions are to have the under and overflow bins included, e.g. for a 3D histogram h[1,-1:h.end+1,-1:h.end+1] = h[1,...]. h[:] returns all contents without under/overflow bins and h[...] returns everything, including those special bins.
If no initial contents are provided, all axes track under/overflow by default. If contents are provided, an axis tracks under/overflow by default if the provided contents have two more bins in that dimension than axis.nbins. You can specify that certain axes will/will not track under/overflow with the track_overflow keyword. Note that attempting to access an underflow/overflow bin on an axis that is not tracked will result in an IndexError.
If sumw2 is not None, then the histogram will keep track of the sum of the weights squared. You should use this feature if you are using weighted data and are concerned about error propagation. You can access the sum of squared weights with h.sumw2[item], where item is interpreted the same way as in h[item]. h.bin_error[item] returns sqrt(sumw2) (or sqrt(contents) if sumw2 was not specified).
The binary operators +, -, * and / are supported for Histograms and correctly propagate error if sumw2 is present. The other operand can be a Histogram, a scalar or an array of appropriate size. Note that h += h0 is more efficient than h = h + h0 since the latter involves the instantiation of a new histogram. Unary negation of a Histogram is supported, as is dividing a scalar by a Histogram.
- Parameters:
edges (Axes or array) – Definition of bin edges, as anything that can be processed by Axes. The lower edge value is included in the bin; the upper edge value is excluded.
contents (array-like or SparseArray) – Initialization of histogram contents. May include overflow/underflow bins if overflow is being tracked; if tracking is enabled and contents does not have these bins, they will be initialized to zeros. If omitted, creates an array of zeros.
sumw2 (None, bool or array) – If not None, the histogram will maintain squared weights associated with the elements of contents. These weights are initially zero if sumw2 = True but may instead be initialized explicitly with an array. Arithmetic between two histograms with squared weights propagates these weights to the result according to propagation-of-error theory.
labels (array of str) – Optionally label the axes for easier indexing
axis_scale (str or array) – Bin center mode e.g. “linear” or “log”. See Axis.axis_scale.
sparse (bool) – indicate if contents and sumw2 should be maintained as dense or sparse arrays. If specified, contents and sumw2 will be converted to the specified sparsity if needed (but attempting to densify a sparse matrix will fail to avoid unexpected memory blowups). If not specified, the Histogram’s sparsity follows that of the provided contents, or is dense if no contents are provided.
unit (Unit-like) – unit of contents; if not specified, inferred from contents if u.Quantity or None otherwise
track_overflow (bool, array-like, or dict) –
Whether to allocate space to track the underflow and overflow bins. Acceptable forms include
a single boolean value (applies to all axes)
a 1-D array-like with a boolean value per axis
a dictionary specifying boolean values for a set of named/numbered axes. For axes not in the dict, the default is True.
If this parameter is not provided, the default behavior depends on the value of the contents argument.
if contents is not provided, overflow is tracked on all axes.
if contents is provided, each axis tracks overflow if contents includes overflow/underflow bins for that axis, i.e., if its size along that axis is two more than the axis’ number of bins.
dtype (numpy dtype or None) – type of the contents array; if None, use the type of the provided contents, or the default (float64) if none are provided.
copy_contents (bool) – if True (default), the arrays passed as contents and sumw2 are copied. If False, numpy arrays or Quantity arrays passed as contents and sumw2 will not be copied unless necessary; hence, the Histogram’s memory may alias these values.
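The index convention described above (-1 for underflow, h.end for overflow) can be pictured as an offset into an array that stores nbins + 2 entries per tracked axis. The following is a minimal sketch in plain Python, not histpy's implementation; the helper name to_storage_index and the storage layout are assumptions made for illustration.

```python
def to_storage_index(i, nbins, track_overflow=True):
    """Map a histogram-style bin index to a position in the stored list.

    Hypothetical helper: -1 is the underflow bin, 0..nbins-1 are the
    regular bins, and nbins (the value of h.end) is the overflow bin.
    """
    if track_overflow:
        if not -1 <= i <= nbins:
            raise IndexError(f"bin index {i} out of range")
        # Assumed layout: [underflow, bin 0, ..., bin nbins-1, overflow]
        return i + 1
    if not 0 <= i < nbins:
        # Under/overflow bins do not exist on an untracked axis
        raise IndexError(f"bin index {i} requires overflow tracking")
    return i


nbins = 4
stored = [10, 1, 2, 3, 4, 20]  # underflow, four regular bins, overflow

underflow = stored[to_storage_index(-1, nbins)]
overflow = stored[to_storage_index(nbins, nbins)]
regular = [stored[to_storage_index(i, nbins)] for i in range(nbins)]
```

In this picture, h[:] corresponds to the regular list and h[...] to the whole stored list.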
- set_sumw2(sumw2, copy=True)[source]
Set the sumw2 array of a Histogram, which tracks the sum of squared weights. If not None/False, sumw2 must be either an array-like with the same shape as contents or a Histogram with the same axes as contents. It will be coerced to have the same units, sparsity, dtype, and overflow tracking as the base Histogram.
- Parameters:
sumw2 – values for weights: True for all zeros, or a Histogram, or an array-like; None or False to remove any existing sumw2
copy (bool) – if True (default), copy the object passed as sumw2; otherwise, create a view into the object if possible.
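To see why sumw2 matters, here is a small plain-Python sketch of the bookkeeping, assuming the standard rule that the variances of independent quantities add under addition. The function names add_bins and bin_error_sketch are hypothetical stand-ins, not histpy functions.

```python
import math


def add_bins(contents_a, sumw2_a, contents_b, sumw2_b):
    """Add two sets of bins; for independent inputs the sumw2 values add too."""
    contents = [a + b for a, b in zip(contents_a, contents_b)]
    sumw2 = [a + b for a, b in zip(sumw2_a, sumw2_b)]
    return contents, sumw2


def bin_error_sketch(contents, sumw2=None):
    """sqrt(sumw2) when available, otherwise sqrt(contents)."""
    source = contents if sumw2 is None else sumw2
    return [math.sqrt(v) for v in source]


# Two unweighted histograms: for unit weights, sumw2 equals the contents
contents, sumw2 = add_bins([4.0, 9.0], [4.0, 9.0], [12.0, 16.0], [12.0, 16.0])
errors = bin_error_sketch(contents, sumw2)
unweighted_errors = bin_error_sketch([4.0, 9.0])
```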
- copy()[source]
Make a deep copy of a Histogram. The copy shares no writable members with the original; the only shared members are those that will never be mutated.
This function preserves subclass types if called from a derived class. Subclasses with additional data members may override this function; if they do not, their data members will be deepcopied.
- astype(dtype, copy=True)[source]
Cast the contents and, if present, the sumw2 of a Histogram to a different data type. If the new type differs from the old type, we always return a copy; otherwise, we return a copy if copy=True or the original if copy=False.
- track_overflow(track_overflow=None)[source]
Obtain an array specifying whether each axis is tracking underflows and overflows. If input is not None, adjust the track_overflow settings to those provided.
- Parameters:
track_overflow (bool, array-like, or dict) – Optional. New overflow tracking settings
- Returns:
np.ndarray with copy of current overflow tracking settings
We return a copy to external callers because it is unsafe to modify a live track_overflow array in place; any updates must be fed back to track_overflow (internally, to _update_track_overflow) to take effect.
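The default described for the constructor (an axis tracks overflow when the provided contents have two more entries than the axis has bins) can be sketched as a shape comparison. This is an illustrative plain-Python version; infer_track_overflow is a hypothetical name, and raising ValueError on any other size is an assumption of this sketch.

```python
def infer_track_overflow(contents_shape, nbins_per_axis):
    """Return one boolean per axis: True if that axis has under/overflow slots."""
    tracked = []
    for size, nbins in zip(contents_shape, nbins_per_axis):
        if size == nbins + 2:
            tracked.append(True)   # room for underflow and overflow
        elif size == nbins:
            tracked.append(False)  # regular bins only
        else:
            # Assumed behavior for this sketch: reject inconsistent shapes
            raise ValueError(f"size {size} does not match {nbins} bins")
    return tracked


# Axis 0 has 10 bins and 12 stored entries; axis 1 has 5 bins and 5 entries
flags = infer_track_overflow((12, 5), (10, 5))
```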
- to(unit, equivalencies=[], update=True, copy=True)[source]
Convert a Histogram to a different unit.
- Parameters:
unit (unit-like) – Unit to convert to.
equivalencies (list or tuple) – A list of equivalence pairs to try if the units are not directly convertible.
update (bool) – If update is False, only the units will be changed without updating the contents accordingly.
copy (bool) – If True (default), then the value is copied. Otherwise, a copy will only be made if necessary.
- property is_sparse
Return True if the underlying histogram contents array is sparse, or False if dense.
- todense()
Return a dense copy of a histogram.
- tosparse()
Return a sparse copy of a histogram.
- property contents
Equivalent to h[:]. Does not include under and overflow bins.
- property full_contents
Equivalent to h[...]. The size of each axis can be nbins or nbins+2, depending on the track_overflow parameters.
- property shape
Tuple with length of each axis
- property axes
Underlying axes object
- property axis
Equivalent to self.axes[0], but fails if ndim > 1.
- interp(*values, kind='linear')[source]
Perform multilinear interpolation of one or more values relative to the contents of this Histogram. The center of histogram bin (i1,…,in) is assumed to have the value h[i1,…,in] for purposes of interpolation.
If all axes of the histogram have log scale, multilinear interpolation is performed in the log domain. Hence, for example, interpolating a value halfway between two bin centers along a log-scale axis returns the geometric mean of the values in those bins. If log-domain interpolation is requested, the histogram’s contents should all be > 0 to avoid warnings or errors.
Interpolation will raise an error if called on a histogram containing both log- and linear/symmetric-scale axes, as the result of multilinear interpolation is ill-defined in this case.
- Parameters:
values (scalar or array-like) – value(s) to interpolate. If a single value, may be ndim coordinates as separate arguments or a single array-like. If multiple values, may be ndim array-likes of coordinates as separate arguments or a single array-like containing same.
kind (string) – “linear” (default) if multilinear interpolation is to be done using this Histogram’s contents, or “log” if it is to be done on the log of the contents and then converted back to the linear domain.
- Returns:
interpolated values (scalar or array of same shape as values)
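The geometric-mean statement above can be checked with a one-dimensional sketch. This is not histpy's interp(); it assumes that "halfway" is measured in the log of the coordinate and that the interpolation takes the log of both the coordinate and the bin values. The names lerp and log_domain_interp are hypothetical.

```python
import math


def lerp(x, x0, x1, y0, y1):
    """Ordinary linear interpolation between (x0, y0) and (x1, y1)."""
    t = (x - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)


def log_domain_interp(x, x0, x1, y0, y1):
    """Interpolate log(y) against log(x), then convert back to linear."""
    log_y = lerp(math.log(x), math.log(x0), math.log(x1),
                 math.log(y0), math.log(y1))
    return math.exp(log_y)


# Bin centers at 1 and 100 with values 4 and 9
halfway = math.sqrt(1.0 * 100.0)  # midpoint in the log of the coordinate
log_value = log_domain_interp(halfway, 1.0, 100.0, 4.0, 9.0)
linear_value = lerp(50.5, 1.0, 100.0, 4.0, 9.0)
```

Here log_value comes out at the geometric mean of 4 and 9, while linear_value is their arithmetic mean. The log path requires strictly positive values, which matches the warning about contents > 0.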
- fill(*values, weight=None, warn_overflow=True)[source]
Add an entry to the histogram. Can be weighted.
Follows the same convention as find_bin().
- Parameters:
values (float or array) – Value of entry
weight (float) – Value weight in histogram. Defaults to 1 in whatever units the histogram has
warn_overflow (bool) – Enable/disable warnings when an underflow or overflow occurs –i.e. when one or more of the input values falls beyond the range of the corresponding axis.
Note
Note that weight needs to be specified explicitly by key; otherwise it will be considered a value, and an IndexError will be thrown.
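As a rough picture of what a weighted fill does on one axis, the sketch below locates the bin with the lower edge included and the upper edge excluded, then adds the weight to the contents and its square to sumw2. It is a plain-Python illustration under an assumed storage layout, not the library's code; find_bin_sketch and fill_sketch are hypothetical names.

```python
from bisect import bisect_right


def find_bin_sketch(edges, value):
    """Return -1 for underflow, nbins for overflow, else the regular bin index."""
    return bisect_right(edges, value) - 1


def fill_sketch(contents, sumw2, edges, value, weight=1.0):
    """Add one weighted entry; assumed storage is [underflow, bins..., overflow]."""
    i = find_bin_sketch(edges, value)
    contents[i + 1] += weight
    sumw2[i + 1] += weight ** 2


edges = [0.0, 1.0, 2.0, 3.0]          # three regular bins
contents = [0.0] * (len(edges) + 1)   # nbins + 2 slots
sumw2 = [0.0] * (len(edges) + 1)

fill_sketch(contents, sumw2, edges, 0.5)              # bin 0
fill_sketch(contents, sumw2, edges, 1.0, weight=2.0)  # lower edge belongs to bin 1
fill_sketch(contents, sumw2, edges, 3.0)              # last upper edge is overflow
fill_sketch(contents, sumw2, edges, -1.0)             # underflow
```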
- project(*axis)[source]
Return a histogram containing a projection of the current one.
- Parameters:
axis (axis index/label or array-like of same) – axis or axes onto which the histogram will be projected. Omitted axes will be summed over. The axes of the projected histogram will have the order specified by this argument, so project() can be used to permute a Histogram’s axes (whether or not some are projected away).
- Returns:
Projected histogram (a new object, not a view)
- Return type:
Histogram
- project_out(*axis)[source]
Return a histogram containing a projection that sums over the specified axes of the current one, leaving the rest intact.
- Parameters:
axis (axis index/label or array-like of same) – axis or axes that will be projected out of the histogram. Omitted axes will be retained in their current order.
- Returns:
Projected histogram (a new object, not a view)
- Return type:
Histogram
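The relationship between project() and project_out() can be illustrated on a 2D table of contents stored as nested lists. This is a simplified sketch that ignores under/overflow bins and sumw2; project_2d and project_out_2d are hypothetical helpers, not histpy calls.

```python
def project_2d(contents, keep_axis):
    """Sum a 2D nested list over the axis that is not kept."""
    if keep_axis == 0:
        return [sum(row) for row in contents]
    if keep_axis == 1:
        return [sum(column) for column in zip(*contents)]
    raise ValueError("keep_axis must be 0 or 1")


def project_out_2d(contents, drop_axis):
    """Summing over one axis of a 2D table keeps the other one."""
    return project_2d(contents, 1 - drop_axis)


contents = [[1, 2, 3],
            [4, 5, 6]]

onto_axis0 = project_2d(contents, 0)
onto_axis1 = project_2d(contents, 1)
without_axis1 = project_out_2d(contents, 1)
```

For a 2D histogram, projecting onto axis 0 and projecting out axis 1 give the same result; the two methods differ only in whether you name the axes to keep or the axes to sum over.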
- static concatenate(edges, histograms, label=None, track_overflow=None)[source]
Generate a Histogram H from a list of histograms h_1 … h_n. We create a new first axis (index 0) of length equal to the list and set H[i] = h_i, so the new histogram has one more dimension than the inputs.
For this operation to be well-defined, the axes of all input histograms must be equal, and they must all have the same sparsity; if any input has a unit, all must have compatible units. If any input is a subclass of Histogram, all must have the same subclass type.
If all inputs have sumw2, the output will as well; otherwise, all sumw2 values are discarded.
- Parameters:
edges (Axes or array) – Definition of bin edges of the new dimension
histograms (list of Histogram) – List of histograms used to fill the contents. Might or might not include under/overflow bins.
label (str) – Label for the new dimension
track_overflow (bool) – Track underflow and overflow on the newly created axis. Defaults to True if number of histograms is 2 + # bins on new axis, or False otherwise.
- Returns:
new object of the same type as histograms[0] (Histogram or subclass)
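The stacking and the default for track_overflow on the new axis can be sketched as follows. This plain-Python version treats each input as a flat list of contents and ignores units, sparsity, and sumw2; concatenate_sketch is a hypothetical name, and the length check stands in for the requirement that all input axes be equal.

```python
def concatenate_sketch(new_axis_nbins, histograms, track_overflow=None):
    """Stack equal-length lists along a new first axis."""
    lengths = {len(h) for h in histograms}
    if len(lengths) != 1:
        raise ValueError("all input histograms must have the same axes")
    if track_overflow is None:
        # Default rule from the parameter description: track only when the
        # inputs supply two extra entries beyond the bins of the new axis
        track_overflow = len(histograms) == new_axis_nbins + 2
    stacked = [list(h) for h in histograms]  # stacked[i] plays the role of H[i]
    return stacked, track_overflow


stacked, tracked = concatenate_sketch(2, [[1, 2, 3], [4, 5, 6]])
_, tracked_with_extras = concatenate_sketch(2, [[0, 0, 0]] * 4)
```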
- rebin(*ngroup)[source]
Rebin a histogram by grouping adjacent bins into one on each axis
If an axis does not have overflow tracking enabled, any partial group along that axis will be discarded. If it does have overflow tracking enabled, any partial group’s sum will be added to the axis’ underflow bin if it is on the left, or to the overflow bin if it is on the right.
For histograms with multiple axes, the result of rebinning is equivalent to rebinning the input on the first axis, then rebinning the result on the second axis, and so forth for all axes.
- Parameters:
ngroup (int or array-like) – number of adjacent bins to combine for each axis. If this value is > 0 for an axis, binning starts from left side of contents, so the last partial group (if any) is on the right; if < 0, binning starts from right side, so last partial group (if any) is on the left.
- Returns:
a new, rebinned Histogram
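The handling of partial groups is easiest to follow on a single axis. The sketch below is a plain-Python illustration of the rules described above, not histpy's implementation; rebin_1d is a hypothetical helper that receives the regular bins and the under/overflow values separately.

```python
def rebin_1d(regular, ngroup, underflow=0, overflow=0, track_overflow=True):
    """Group adjacent bins; the sign of ngroup selects the starting side."""
    size = abs(ngroup)
    if size == 0:
        raise ValueError("ngroup must be nonzero")
    n_full, n_left = divmod(len(regular), size)
    if ngroup > 0:
        # Start from the left: the partial group (if any) is on the right
        body, partial = regular[:n_full * size], regular[n_full * size:]
    else:
        # Start from the right: the partial group (if any) is on the left
        partial, body = regular[:n_left], regular[n_left:]
    grouped = [sum(body[i:i + size]) for i in range(0, len(body), size)]
    if not track_overflow:
        return grouped, None, None  # the partial group is discarded
    if ngroup > 0:
        overflow += sum(partial)
    else:
        underflow += sum(partial)
    return grouped, underflow, overflow


from_left = rebin_1d([1, 2, 3, 4, 5], 2)
from_right = rebin_1d([1, 2, 3, 4, 5], -2)
untracked = rebin_1d([1, 2, 3, 4, 5], 2, track_overflow=False)
```

With five bins grouped in pairs, the leftover bin moves into the overflow when grouping starts from the left and into the underflow when it starts from the right.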
- plot(ax=None, errorbars=None, colorbar=True, label_axes=True, **kwargs)[source]
Quick plot of the histogram contents.
Under/overflow bins are not included. Only 1D and 2D histograms are supported.
Histogram with a HealpixAxis will automatically be plotted as a map, passing all kwargs to mhealpy’s HealpixMap.plot()
- Parameters:
ax (matplotlib.axes) – Axes on which to draw the histogram. A new one will be created by default.
errorbars (bool or None) – Include errorbars for 1D histograms. The default is to plot them if sumw2 is available
colorbar (bool) – Draw colorbar in 2D plots
label_axes (bool) – Label the plot axes. Histogram axes must be labeled.
**kwargs – Passed to matplotlib.errorbar() (1D) or matplotlib.pcolormesh (2D)
- draw(ax=None, errorbars=None, colorbar=True, label_axes=True, **kwargs)
Quick plot of the histogram contents.
Under/overflow bins are not included. Only 1D and 2D histograms are supported.
Histogram with a HealpixAxis will automatically be plotted as a map, passing all kwargs to mhealpy’s HealpixMap.plot()
- Parameters:
ax (matplotlib.axes) – Axes on which to draw the histogram. A new one will be created by default.
errorbars (bool or None) – Include errorbars for 1D histograms. The default is to plot them if sumw2 is available
colorbar (bool) – Draw colorbar in 2D plots
label_axes (bool) – Label the plot axes. Histogram axes must be labeled.
**kwargs – Passed to matplotlib.errorbar() (1D) or matplotlib.pcolormesh (2D)
- fit(f, lo_lim=None, hi_lim=None, **kwargs)[source]
Fit histogram data using least squares.
This is a convenience call to scipy.optimize.curve_fit. Sigma corresponds to the output of h.bin_error. Empty bins (i.e. bins whose error equals 0) are ignored.
- Parameters:
f (callable) – Function f(x, …) that takes the independent variable x as its first argument, followed by the parameters to be fitted. For a k-dimensional histogram it should handle arrays of shape (k,) or (k,N). The inputs and outputs must be unitless.
lo_lim (float or array) – Low axis limit to fit. One value per axis.
hi_lim (float or array) – High axis limit to fit. One value per axis.
**kwargs – Passed to scipy.optimize.curve_fit
- write(filename, name='hist', overwrite=False)[source]
Write histogram to a group in an HDF5 file. Appended if the file already exists.
- Parameters:
filename (str) – Path to file
name (str) – Name of group to save histogram (can be any HDF5 path)
overwrite (bool) – Delete and overwrite the group if it already exists.