Histogram.py - Various functions to deal with histograms

Author:
Release: $Id$
Date: December 09, 2013
Tags: Python

Histograms can be calculated from a list/tuple/array of values. The histogram returned is then a list of tuples of the format [(bin1,value1), (bin2,value2), ...].

Histogram.CalculateFromTable(dbhandle, field_name, from_statement, num_bins=None, min_value=None, max_value=None, intervals=None, increment=None)

get a histogram using an SQL statement. Intervals can either be supplied directly or be built from the data by providing the number of bins and, optionally, a minimum or maximum value.

If no number of bins is provided, the bin size is 1.

This command uses the INTERVAL function from MySQL, i.e. a bin value determines the upper boundary of a bin.
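The binning rule of MySQL's INTERVAL(N, N1, N2, ...) can be mimicked in plain Python: it returns 0 if N < N1, 1 if N < N2, and so on. The following is a minimal sketch, assuming sorted boundaries; `interval_index` is a hypothetical helper for illustration and is not part of Histogram.py.

```python
from bisect import bisect_right


def interval_index(value, boundaries):
    """Mimic MySQL INTERVAL(value, b1, b2, ...).

    Returns the number of boundaries that are <= value.
    `boundaries` must be sorted in increasing order.
    """
    return bisect_right(boundaries, value)


boundaries = [10, 20, 30]
print([interval_index(v, boundaries) for v in (5, 10, 19, 30, 99)])
# [0, 1, 1, 3, 3]
```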

Histogram.CalculateConst(values, num_bins=None, min_value=None, max_value=None, intervals=None, increment=None, combine=None)

calculate a histogram based on a list or tuple of values.

Histogram.Calculate(values, num_bins=None, min_value=None, max_value=None, intervals=None, increment=None, combine=None, no_empty_bins=0, dynamic_bins=False, ignore_out_of_range=True)

calculate a histogram based on a list or tuple of values.

uses scipy for the calculation.

Histogram.Scale(h, scale=1.0)

rescale bins in histogram.

Histogram.Convert(h, i, no_empty_bins=0)

add bins to histogram.

Histogram.Combine(source_histograms, missing_value=0)

combine a list of histograms. Each histogram is a sorted list of bins and counts. The counts can be tuples.
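One way to picture combining is as a merge over the union of all bins, with missing_value filling the gaps. The sketch below is illustrative only: the function name `combine_sketch` and the output layout (one list of values per bin) are assumptions, not the documented behaviour of Histogram.Combine.

```python
def combine_sketch(histograms, missing_value=0):
    """Merge histograms on the union of their bins (illustrative only)."""
    # Turn each [(bin, value), ...] list into a lookup table.
    lookups = [dict(h) for h in histograms]
    all_bins = sorted(set(b for lookup in lookups for b in lookup))
    # One row per bin, one value per input histogram.
    return [(b, [lookup.get(b, missing_value) for lookup in lookups])
            for b in all_bins]


h1 = [(1, 5), (2, 3)]
h2 = [(2, 7), (3, 1)]
print(combine_sketch([h1, h2]))
# [(1, [5, 0]), (2, [3, 7]), (3, [0, 1])]
```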

Histogram.Print(h, intervalls=None, format=0, nonull=None, format_value=None, format_bin=None)

print a histogram.

A histogram can either be a list/tuple of values or a list/tuple of lists/tuples where the first value contains the bin and second contains the values (which can again be a list/tuple).

Parameters: format – output format. 0 = print histogram in several lines, 1 = print histogram on single line

Histogram.Write(outfile, h, intervalls=None, format=0, nonull=None, format_value=None, format_bin=None)

write a histogram to outfile.

A histogram can either be a list/tuple of values or a list/tuple of lists/tuples where the first value contains the bin and second contains the values (which can again be a list/tuple).

Parameters: format – output format. 0 = print histogram in several lines, 1 = print histogram on single line

Histogram.Fill(h)

fill every empty value in histogram with previous value.
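A forward fill of this kind can be sketched as follows. This assumes that an "empty" value is 0 and that the histogram is a list of (bin, value) tuples; `fill_forward` is a hypothetical name and the real function may treat emptiness differently.

```python
def fill_forward(h, empty=0):
    """Replace each empty value with the most recent non-empty value."""
    filled = []
    previous = empty
    for bin_, value in h:
        if value == empty:
            value = previous  # carry the last seen value forward
        filled.append((bin_, value))
        previous = value
    return filled


print(fill_forward([(0, 4), (1, 0), (2, 0), (3, 7)]))
# [(0, 4), (1, 4), (2, 4), (3, 7)]
```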

Histogram.Add(h1, h2)

adds values of histogram h1 and h2 and returns a new histogram
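As a rough illustration, adding two histograms can be done bin by bin. The sketch assumes (bin, value) tuples and keeps bins that occur in only one input; whether Histogram.Add handles non-matching bins this way is an assumption, and `add_histograms` is a hypothetical name.

```python
def add_histograms(h1, h2):
    """Return a new histogram with the per-bin sums of h1 and h2."""
    totals = dict(h1)
    for bin_, value in h2:
        totals[bin_] = totals.get(bin_, 0) + value
    return sorted(totals.items())


print(add_histograms([(1, 2), (2, 3)], [(2, 4), (3, 1)]))
# [(1, 2), (2, 7), (3, 1)]
```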

Histogram.SmoothWrap(histogram, window_size)

smooth histogram using a sliding-window method in which the window wraps around the borders. The sum of all values in the window is entered at the center of the window.
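The wrap-around window can be expressed with modular indexing. This is a minimal sketch that assumes the histogram is a plain list of values and that window_size is odd; `smooth_wrap` is a hypothetical stand-in, not the module's code.

```python
def smooth_wrap(values, window_size):
    """Sum the values in a centred window that wraps around both ends."""
    n = len(values)
    half = window_size // 2
    return [
        sum(values[(i + offset) % n] for offset in range(-half, half + 1))
        for i in range(n)
    ]


print(smooth_wrap([1, 2, 3, 4], 3))
# [7, 6, 9, 8]
```

The first entry is 4 + 1 + 2 = 7 because the window centred on index 0 wraps around to the last element.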

Histogram.PrintAscii(histogram, step_size=1)

print histogram ascii-style.

Histogram.Count(data)

count categorized data. Returns a list of tuples with (count, token).
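A comparable result can be produced with the standard library. In this sketch the sort order of the output is an assumption, and `count_tokens` is a hypothetical name.

```python
from collections import Counter


def count_tokens(data):
    """Return (count, token) tuples, sorted by increasing count."""
    return sorted((count, token) for token, count in Counter(data).items())


print(count_tokens(["a", "b", "a", "c", "a", "b"]))
# [(1, 'c'), (2, 'b'), (3, 'a')]
```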

Histogram.Accumulate(h, num_bins=2, direction=1)

add successive counts in histogram. Bins are labelled by group average.
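Grouping successive bins might look like the sketch below. It assumes (bin, count) tuples and numeric bins, and it ignores the direction parameter; `accumulate_sketch` is a hypothetical name.

```python
def accumulate_sketch(h, num_bins=2):
    """Sum counts over groups of num_bins consecutive bins.

    Each output bin is labelled with the average of the grouped bins.
    """
    result = []
    for start in range(0, len(h), num_bins):
        group = h[start:start + num_bins]
        bins = [b for b, _ in group]
        label = sum(bins) / len(bins)
        result.append((label, sum(count for _, count in group)))
    return result


print(accumulate_sketch([(0, 1), (1, 2), (2, 3), (3, 4)], num_bins=2))
# [(0.5, 3), (2.5, 7)]
```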

Histogram.Cumulate(h, direction=1)

calculate cumulative distribution.
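A cumulative distribution over (bin, value) tuples can be sketched with itertools.accumulate. The meaning given to direction here (1 = accumulate from the first bin, anything else = accumulate from the last bin) is an assumption, and `cumulate_sketch` is a hypothetical name.

```python
from itertools import accumulate


def cumulate_sketch(h, direction=1):
    """Return running totals, from the first bin or from the last bin."""
    bins = [b for b, _ in h]
    values = [v for _, v in h]
    if direction == 1:
        totals = list(accumulate(values))
    else:
        # Accumulate from the last bin backwards, then restore bin order.
        totals = list(accumulate(reversed(values)))[::-1]
    return list(zip(bins, totals))


h = [(0, 1), (1, 2), (2, 3)]
print(cumulate_sketch(h))
# [(0, 1), (1, 3), (2, 6)]
print(cumulate_sketch(h, direction=0))
# [(0, 6), (1, 5), (2, 3)]
```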

Histogram.AddRelativeAndCumulativeDistributions(h)

adds relative and cumulative percents to a histogram.
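Relative and cumulative percentages follow directly from the total count. The sketch assumes (bin, count) tuples with a non-zero total; the output layout (count, relative percent, cumulative percent) and the name `add_percentages` are assumptions for illustration.

```python
def add_percentages(h):
    """Attach relative and cumulative percentages to each bin."""
    total = float(sum(count for _, count in h))
    rows = []
    running = 0.0
    for bin_, count in h:
        relative = 100.0 * count / total
        running += relative
        rows.append((bin_, (count, relative, running)))
    return rows


print(add_percentages([(0, 1), (1, 1), (2, 2)]))
# [(0, (1, 25.0, 25.0)), (1, (1, 25.0, 50.0)), (2, (2, 50.0, 100.0))]
```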

Histogram.histogram(values, mode=0, bin_function=None)

Return a list of (value, count) pairs, summarizing the input values. Sorted by increasing value, or if mode=1, by decreasing count. If bin_function is given, map it over values first.

Ex:
    vals = [100, 110, 160, 200, 160, 110, 200, 200, 220]
    histogram(vals) ==> [(100, 1), (110, 2), (160, 2), (200, 3), (220, 1)]
    histogram(vals, 1) ==> [(200, 3), (160, 2), (110, 2), (100, 1), (220, 1)]
    histogram(vals, 1, lambda v: round(v, -2)) ==> [(200.0, 6), (100.0, 3)]
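The same summary can be sketched in Python 3 with collections.Counter. This is not the module's implementation: `histogram_sketch` is a hypothetical name, the order of bins with equal counts under mode=1 is left to the stable sort, and in Python 3 round() on integers returns integers rather than floats.

```python
from collections import Counter


def histogram_sketch(values, mode=0, bin_function=None):
    """Return (value, count) pairs, optionally binned and sorted by count."""
    if bin_function is not None:
        values = map(bin_function, values)
    counts = Counter(values)
    if mode:
        # Decreasing count; ties keep their first-seen order.
        return sorted(counts.items(), key=lambda item: -item[1])
    return sorted(counts.items())


vals = [100, 110, 160, 200, 160, 110, 200, 200, 220]
print(histogram_sketch(vals))
# [(100, 1), (110, 2), (160, 2), (200, 3), (220, 1)]
print(histogram_sketch(vals, 1, lambda v: round(v, -2)))
# [(200, 6), (100, 3)]
```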

Histogram.cumulate(histogram)

cumulate histogram in place.

histogram is list of (bin, value) or (bin, (values,) )

Histogram.normalize(histogram)

normalize histogram in place.

histogram is list of (bin, value) or (bin, (values,) )

Histogram.fill(iterator, bins)

fill a histogram from bins.

The values are given by an iterator so that the histogram can be built on the fly.

Description:

Count the number of times values from the iterator fall into the numerical ranges defined by bins. Range x is given by bins[x] <= range_x < bins[x+1] for x = 0, ..., N-1, where N is the index of the last entry in the bins array (len(bins) - 1). The last range is given by bins[N] <= range_N < infinity. Values less than bins[0] are not included in the histogram.

Arguments:
iterator – The iterator.
bins – 1D array. Defines the ranges of values to use during histogramming.

Returns: 1D array. Each value represents the occurrences for a given bin (range) of values.
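The range rule described above (lower bound inclusive, last bin open-ended, values below bins[0] dropped) can be sketched with the bisect module. The sketch assumes sorted bins and returns a plain list instead of an array; `fill_counts` is a hypothetical name, not the module's function.

```python
from bisect import bisect_right


def fill_counts(iterator, bins):
    """Count values per range; bins[x] <= value < bins[x+1]."""
    counts = [0] * len(bins)
    for value in iterator:
        index = bisect_right(bins, value) - 1
        if index >= 0:  # values below bins[0] are not counted
            counts[index] += 1
    return counts


print(fill_counts(iter([0.5, 1, 1.5, 2, 9, 3]), [1, 2, 3]))
# [2, 1, 2]
```

Because the values are consumed one at a time, the input can be a generator and never needs to be held in memory as a whole.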

Histogram.fillHistograms(infile, columns, bins)

fill several histograms from several columns in a file.

The histograms are built on the fly.

Description:

Count the number of times values from each selected column fall into the numerical ranges defined by the corresponding bins. Range x is given by bins[x] <= range_x < bins[x+1] for x = 0, ..., N-1, where N is the index of the last entry in the bins array (len(bins) - 1). The last range is given by bins[N] <= range_N < infinity. Values less than bins[0] are not included in the histogram.

Arguments:
infile – The input file.
columns – The columns to use.
bins – a list of 1D arrays. Defines the ranges of values to use during histogramming.

Returns: a list of 1D arrays. Each value represents the occurrences for a given bin (range) of values.

WARNING: missing values in columns are ignored.
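Filling several histograms in one pass over a file could look like the sketch below. The tab-separated layout, the 0-based column indices, the skipping of lines starting with "#", and the name `fill_histograms_sketch` are all assumptions made for illustration; only the range rule and the skipping of missing values come from the description above.

```python
import io
from bisect import bisect_right


def fill_histograms_sketch(infile, columns, bins):
    """Build one histogram per column while reading infile line by line."""
    counts = [[0] * len(edges) for edges in bins]
    for line in infile:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        for counter, column, edges in zip(counts, columns, bins):
            try:
                value = float(fields[column])
            except (IndexError, ValueError):
                continue  # missing or non-numeric value: ignored
            index = bisect_right(edges, value) - 1
            if index >= 0:  # values below the first bin are dropped
                counter[index] += 1
    return counts


data = io.StringIO("1\t10\n2\tna\n3\t25\n")
print(fill_histograms_sketch(data, columns=[0, 1], bins=[[1, 2], [0, 20]]))
# [[1, 2], [1, 1]]
```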
