py4sci

Table Of Contents

This Page

vtools.functions package

Submodules

vtools.functions.api module

vtools.functions.diff module

Calculates left side derative of time series of input order.

time_diff(ts, order=1)

Generate left side derative of a ts of input orders.

Parameters:

ts : TimeSeries

Series to interpolate. Must has data of one dimension, and regular.

order : int

Order of derative.

Returns:

result : vtools.data.time_series.TimeSeries

A regular series with derative.

vtools.functions.diffnorm module

uility to compute ts norm (L_1,L_2,L_inf).

norm_diff_l1(ts1, ts2, time_window=None)

compute L1 difference of two input time series.

Parameters:

ts1,ts2 : TimeSeries

Must has data of one dimension, and regular.

time_window : time_window ,optional

The default implies comparing the full length of two time series.

Returns:

result : float

a total L1-difference between two input series .

norm_diff_l2(ts1, ts2, time_window=None)

compute L2 difference of two input time series.

Parameters:

ts1,ts2 : TimeSeries

Must has data of one dimension, and regular.

time_window : time_window,optional

The default implies comparing the full length of two time series.

Returns:

result : float

A total L2-difference between two input series .

norm_diff_linf(ts1, ts2, time_window=None)

compute L_inf difference of two input time series.

Parameters:

ts1,ts2: :class:`~vtools.data.timeseries.TimeSeries` :

Must has data of one dimension, and regular.

time_window : time_window,optional

The default implies comparing the full length of two time series.

Returns:

result : float

A total L_infinite difference between two input series .

ts_equal(ts1, ts2, time_window=None, tol=0.0)

compare two ts by absolute difference.

Parameters:

ts1.ts2 : TimeSeries

Must has data of one dimension, and regular.

time_window : time_window,optional

The default implies comparing the full length of two time series.

tol :float,optional :

Tolerance allowed for two equal time sereies.

Returns:

result : bool

if equal within allowed tolerance, return True .

ts_almost_equal(ts1, ts2, time_window=None, tol=1e-08)

compare two ts by absolute difference.

Parameters:

ts1,ts2 : TimeSeries

Must has data of one dimension, and regular.

time_window : time_window,optional

The default implies comparing the full length of two time series.

tol :float,optional :

Tolerance allowed for two almost equal time sereies.

Returns:

result : bool

If equal within allowed tolerance, return True .

vtools.functions.filter module

Module contains filter used in tidal time series analysis.

boxcar(ts, before, after)

low-pass butterworth filter using moving average.

Parameters:

ts1 : TimeSeries

Must has data of one dimension, and regular.

before : time_interval, int

Length of averging interval before the sampling points or number of points.

after : time_interval, int

Length of averging interval after the sampling points or number of points..

Returns:

result : TimeSeries

A new regular time series with the same interval of ts.

butterworth(ts, order=4, cutoff_period=None, cutoff_frequency=None)

low-pass butterworth-squared filter on a regular time series.

Parameters:

ts : TimeSeries

Must has data of one dimension, and regular.

order: int ,optional :

The default is 4.

cutoff_frequency: float,optional :

Cutoff frequency expressed as a ratio of a Nyquist frequency, should within the range (0,1). For example, if the sampling frequency is 1 hour, the Nyquist frequency is 1 sample/2 hours. If we want a 36 hour cutoff period, the frequency is 1/36 or 0.0278 cycles per hour. Hence the cutoff frequency argument used here would be 0.0278/0.5 = 0.056.

cutoff_period : string or time_interval

Period of cutting off frequency. If input as a string, it must be convertible to Time interval. cutoff_frequency and cutoff_period can’t be specified at the same time.

Returns:

result : TimeSeries

A new regular time series with the same interval of ts.

daily_average(ts)

Classic moving average over 24 hours.

Parameters:

ts : TimeSeries

Must has data of one dimension, and regular.

Returns:

result : TimeSeries

A new regular time series with the same interval of ts.

godin(ts)

Godin filtering method on a regular time series.

Parameters:

ts : TimeSeries

Must has data of one dimension, and regular.

Returns:

result : TimeSeries

A new regular time series with the same interval of ts.

cosine_lanczos(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)

squared low-pass cosine lanczos filter on a regular time series.

Parameters:

ts : TimeSeries

Must has data of one dimension, and regular.

filter_len : int, time_interval

Size of lanczos window, default is to number of interval within filter_period*1.25.

cutoff_frequency: float,optional :

Cutoff frequency expressed as a ratio of a Nyquist frequency, should within the range (0,1). For example, if the sampling frequency is 1 hour, the Nyquist frequency is 1 sample/2 hours. If we want a 36 hour cutoff period, the frequency is 1/36 or 0.0278 cycles per hour. Hence the cutoff frequency argument used here would be 0.0278/0.5 = 0.056.

cutoff_period : string or time_interval

Period of cutting off frequency. If input as a string, it must be convertible to Time interval. cutoff_frequency and cutoff_period can’t be specified at the same time.

padtype : str or None, optional

Must be ‘odd’, ‘even’, ‘constant’, or None. This determines the type of extension to use for the padded signal to which the filter is applied. If padtype is None, no padding is used. The default is None.

padlen : int or None, optional

The number of elements by which to extend x at both ends of axis before applying the filter. This value must be less than x.shape[axis]-1. padlen=0 implies no padding. If padtye is not None and padlen is not given, padlen is be set to 6*m.

fill_edge_nan: bool,optional

If pading is not used and fill_edge_nan is true, resulting data on the both ends are filled with nan to account for edge effect. This is 2*m on the either end of the result. Default is true.

Returns:

result : TimeSeries

A new regular time series with the same interval of ts. If no pading is used the beigning and ending 4*m resulting data will be set to nan to remove edge effect.

lowpass_cosine_lanczos_filter_coef(cf, m, normalize=True)

For test purpose only

ts_gaussian_filter(ts, sigma, order=0, mode='reflect', cval=0.0)

wrapper of scipy gaussian_filter.

Parameters:

ts: :class:`~vtools.data.timeseries.TimeSeries` :

Must has data of one dimension, and regular.

sigma: int or :ref:`time_interval<time_intervals>` :

Standard deviation for Gaussian kernel presented as number of samples, or time interval.

order: int,optional :

Order of the gaussian filter. Must be one of 0,1,2,3. Default 0.

mode : {‘reflect’, ‘constant’, ‘nearest’, ‘mirror’, ‘wrap’}, optional

This input determines how the array borders are handled, where cval is the value when mode is ‘constant’. Default is ‘reflect’

cval : scalar, optional

Value to fill past edges of input if mode is ‘constant’. Default is 0.0

Returns:

result : TimeSeries

A new regular time series with the same interval of ts.

vtools.functions.interpolate module

Some data interpolating over gaps within a time series.

linear(ts, times, filter_nan=True)

Interpolate a time series by linear interpolation.

Parameters:

ts : TimeSeries

Series to be interpolated

times : time_interval or time_sequence

The new times to which the series will be interpolated. Can also be a string that can be parsed into a time interval.

filter_nan : boolean, optional

Should nan points should be omitted or not.

Returns:

result : TimeSeries

A regular time series if times is a time_interval, or irregular time series if times is a time_sequence.

spline(ts, times, filter_nan=True, **dic)
Interpolating gaps using cubic spline.
The spline does not need to pass data point. The closeness can be controlled by parameter s.
Parameters:

ts : TimeSeries

Series to be interpolated

times : time_interval or time_sequence

The new times to which the series will be interpolated. Can also be a string that can be parsed into a time interval.

filter_nan : boolean,optional

True if nan points should be omitted or not.

s : float, optional

Smoothing parameter. s=1.0e-3 by default. The lower s is, the closer the fitted curve will be to the data points, and the less smoother the filter will be. This is different from monotonic spline, currently a bit slower and perhaps more robust.

Returns:

result : TimeSeries

A regular time series if second input is time interval, or irregular time series otherwise.

monotonic_spline(ts, times, filter_nan=True)

Interpolation by monotonicity (shape) preserving spline. The spline will fit the data points exactly, derivatives are subject to slope limiters. The name ‘monotonicity-preserving’ can be a little deceptive – this is an excellent all around choice for interpolating continuous data.

Parameters:

ts : TimeSeries

Series to be interpolated

times : time_interval or time_sequence

The new times to which the series will be interpolated. Can also be a string that can be parsed into a time interval.

filter_nan: if nan points should be omitted or not. :

Returns:

result : TimeSeries

A regular time series if second input is time interval, or irregular time series otherwise.

nearest_neighbor(ts, times, filter_nan=True, **dic)
Interpolating series using nearest valid neighbor.
Parameters:

ts : TimeSeries

Series to be interpolated

times : time_interval or time_sequence

The new times to which the series will be interpolated. Can also be a string that can be parsed into a time interval.

filter_nan : boolean, optional

True if NaN point should be ommitted or not.

Returns:

result : TimeSeries

A regular time series if second input is time interval, :

or irregular time series otherwise.

gap_size(x)
previous_neighbor(ts, times, filter_nan=True, **dic)

Interpolate at time series using value at the previous valid neighbor

Parameters:

ts : TimeSeries

Series to be interpolated

times : time_interval or time_sequence

The new times to which the series will be interpolated. Can also be a string that can be parsed into a time interval.

filter_nan : boolean,optional

True if nan points should be omitted or not.

Returns:

result : TimeSeries

A regular time series if second input is time interval, :

or irregular time series otherwise.

next_neighbor(ts, times, filter_nan=True, **dic)

Interpolate by next (forward in time) valid neighbors.

Parameters:

ts : TimeSeries

Series to be interpolated

times : time_interval or time_sequence

The new times to which the series will be interpolated. Can also be a string that can be parsed into a time interval.

filter_nan : boolean,optional

True if nan points should be omitted or not.

Returns:

result : TimeSeries

A regular time series if times is time interval or irregular time series otherwise.

interpolate_ts_nan(ts, method='linear', max_gap=0, **args)

Estimate missing values within a time series by interpolation.

Parameters:

ts : TimeSeries

The time series to be interpolated. Must be univariate.

method : {LINEAR,NEAREST,PREVIOUS,NEXT,SPLINE,MSPLINE,RHIST}

You can use our constants or the equivalent string as indicated below.

LINEAR or ‘linear’

Linear interpolation

NEAREST or ‘nearest’

Nearest value in time

PREVIOUS or ‘previous’

Previous step with data

NEXT or ‘next’

Next step with data

SPLINE or ‘spline’

numpy 1D spline

MSPLINE or ‘mspline’

Monotonicity (shape) preserving spline

RHIST or ‘rhist’

Rational histospline that conserves area under the curve.

max_gap: int,optional :

The maximum number of continous nan data to allow interpolation. By default 0, which means doing interpolation no matter how big is the continous nan block.

**args : dictonary

Extra keyword parameters required by the method.

Returns:

filled : TimeSeries

New series with missing values filled using method

interpolate_ts(ts, times, method='spline', filter_nan=True, **dic)

Interpolate a time series to a new time sequence.

Parameters:

ts : TimeSeries

Series to interpolate. Must has data of one dimension, regular or irregular.

times : time_interval or time_sequence

The new times to which the series will be interpolated. Can also be a string that can be parsed into a time interval.

method : string, optional

See interpolate_ts_nan() for a list of methods

filter_nan : boolean, optional

True if nan should be omitted or not. If retained, nan values in the source series will be used in interpolation algorithm, which may cause new nan points in resulting ts.

**dic : dictonary

Extra parameters.

Returns:

result : TimeSeries

A regular or irregular series with times based on times and values interpolated from ts.

See also

interpolate_ts_nan
Fill internal nan values using interpolation
rhistinterp(ts, interval, **dic)

Interpolating a regular time series (rts) to a finer rts by rational histospline.

The rational histospline preserves area under the curve. This is a good choice of spline for period averaged data where an interpolant is desired that is ‘conservative’. Note that it is the underlying continuous interpolant that will be ‘conservative though, not the returned discrete time series which merely samples the underlying interpolant.

Parameters:

ts : TimeSeries

Series to be interpolated, typically period averaged

interval : time_interval

p : float, optional

Spline tension, usually between 0 and 20. Must >-1. For a ‘sufficiently large’ value of p, the interpolant will be monotonicity-preserving and will maintain strict positivity (always being strictly > lowbound.

lowbound : float, optional

Lower bound of interpolated values.

tolbound : float, optional

Tolerance for determining if an input is on the bound.

Returns:

result : TimeSeries

A regular time series with instantaneous values

vtools.functions.merge module

Merge multiple time series ts1,ts2 ..., a new time series with combination of ts1,ts2 data will be returned.

merge(ts0, ts1, *lts)

merge a number of timeseries together and return a new ts.

Parameters:

ts0 : TimeSeries

Highest priority time series.

ts1 : TimeSeries

Next highest priority time series, used to replace nans in ts0.

*other_ts : TimeSeries

Additional regular time series arguments (number of args is variable) to merge with ts0 to cover nan data. Each must have with the same interval and should be listed in decreasing priority

Returns:

merged : TimeSeries

A new time series with time interval same as the inputs, time extent the union of the inputs, and filled first with ts1, then with remaining gaps filled with ts2, then ts3....

vtools.functions.period_op module

Period operation over a regular time series, the returned new time series are aligned with operation interval( e.g. monthly data will start on the first day of the month).

period_op(ts, interval, op, method=None)

Apply a period operation on a time series.

Note

Only valid data of input ‘ts’ will be considered. The ‘method’ input is only meaningful for averaging operation, if not given, function will select one method based on the timestamp property of time seris.

Parameters:

ts : TimeSeries

A regular timeseries to be operated.

interval : time_interval

Interval of operation.

op: string

three optiona are avaible, MAX,MIN and MEAN (in vtools.data.constants )

method : string, optional

the averaging method going to be used, Two option avaiable, TRAPEZOID and RECT

Returns:

Result : TimeSeries

a new time series with period interval that is the result of applying op consecutively over calendar periods

Raises:

error : ValueError

If input time series is not regular, or regular interval is calendar dependent, or input downsampling interval is narrower than original one.

period_min(ts, interval)

Find a period minimum values on a time series.

Parameters:

ts : TimeSeries

A regular timeseries to be operated.

interval : time_interval

Interval of operation.

Returns:

Result : TimeSeries

a new time series with period interval that is the result of applying min operation consecutively over calendar periods

Raises:

error : ValueError

If input time series is not regular, or regular interval is calendar dependent, or input downsampling interval is narrower than original one.

period_max(ts, interval)

Find a period maximum on a time series.

Parameters:

ts : TimeSeries

A regular timeseries to be operated.

interval : time_interval

Interval of operation.

Returns:

Result : TimeSeries

a new time series with period interval that is the result of applying min operation consecutively over calendar periods

Raises:

error : ValueError

If input time series is not regular, or regular interval is calendar dependent, or input downsampling interval is narrower than original one.

period_ave(ts, interval, method=None)

Find a period mean on a time series.

Parameters:

ts : TimeSeries

A regular timeseries to be operated.

interval : time_interval

Interval of operation.

Returns:

Result : TimeSeries

a new time series with period interval that is the result of applying average operation consecutively over calendar periods

Raises:

error : ValueError

If input time series is not regular, or regular interval is calendar dependent, or input downsampling interval is narrower than original one.

period_sum(ts, interval)

Find a period sum on a time series.

Parameters:

ts : TimeSeries

A regular timeseries to be operated.

interval : time_interval

Interval of operation.

Returns:

Result : TimeSeries

a new time series with period interval that is the result of applying sum operation consecutively over calendar periods

Raises:

error : ValueError

If input time series is not regular, or regular interval is calendar dependent, or input downsampling interval is narrower than original one.

vtools.functions.resample module

Resample operation over a time series, a new time series with the desired interval and corresponding ticks will be returned.

resample(ts, interval, aligned=True)

Do a resampling operation on a time series.

Parameters:

ts : TimeSeries

A regular timeseries to be resampled.

interval : time_interval

Interval of resampled time series.

aligned: boolean :

Default is true which means aligning result’ time with input ‘interval’.

Returns:

Resampled : TimeSeries

A new resampled time series if success.

Raises:

error : ValueError

If input time series is not regular, or regular interval is calendar dependent.

decimate(ts, interval, **dic)
Downsample timeseries by decimating.
A wrapper of GNU Octave function decimate.m function. This will remove the influence of aliasing on downsampled ts.
Parameters:

ts : TimeSeries

A regular timeseries to be resampled.

interval : time_interval

Interval of resampled time series.

**dic : Dictionary

Dictonary of extra parameters to be passed in. For instance , input ‘align_calendar’ as boolean to align resampled timeseries with the downsampling interval.

Returns:

Resampled : TimeSeries

A new resampled time series if success.

vtools.functions.rhist module

Module for data interpolation using rational histosplines. The main algorithm is a numpy re-implementation of rhist2 in

One Dimensional Spline Interpolation Algorithms by Helmuth Spath (1995).

The benefit of histosplines is that they are designed to conserve the integral under the curve, and so they are good for period averaged data when you want the interpolator to conserve mass.

rhist(x, y, xnew, y0, yn, p, q)

Histopline for arrays with tension. Based by an algorithm rhist2 in One Dimensional Spline Interpolation Algorithms by Helmuth Spath (1995).

Parameters:

x : array-like

Abscissa array of original data, of length n

y : array-like, dimension (n-1)

Values (mantissa) of original data giving the rectangle (average) values between x[i] and x[i+1]

xnew : array-like

Array of new locations at which to interpolate.

y0,yn : float

Initial and terminal values

p,q: array-like, dimension (n-1) :

Tension parameter, p and q are almost always the same. The higher p and q are for a particular x interval, the more rectangular the interpolant will look and the more positivity and shape preserving it is at the expense of accuracy. For this routine any number p,q > -1 is allowed, although the bound routine doesn’t use vals less than zero.

Returns :

——- :

ynew : array-like

Array that interpolates the original data.

rhist_bound(x, y, xnew, y0, yn, p, lbound=None, maxiter=5, pfactor=2, floor_eps=0.001)

Histopline for arrays with lower bound enforcement. This routine drives rhist() but tests that the output array observes the lower bound and adapts the tension parameters as needed.

This will not work exactly if the input array has values right on the lower bound. In this case, the parameter floor_eps allows you to specify a tolerance of bound violation to shoot for ... and if it isn’t met in maxiter iterations the value is simply floored.

Parameters:

x : array-like

Abscissa array of original data to be interpolated, of length n

y : array-like, dimension (n-1)

Values (mantissa) of original data giving the rectangle (average) values between x[i] and x[i+1]

xnew : array-like

Array of new locations at which to interpolate.

y0,yn : float

Initial and terminal values

p: float :

Tension parameter. This starts out as a global scalar, but will be converted to an array and adapted locally. The higher this goes for a particular x interval, the more rectangular the interpolant will look and the more positivity and shape preserving it is at the expense of accuracy. A good number is 1, and for this routine, p > 0 is required because the adaptive process multiplies it by pfactor each iteration on the expectation that it will get bigger.

lbound: float :

Lower bound to be enforced. If the original y’s are strictly above this value, the output has the potential to also be strictly above. If the original y’s lie on the lower bound, then the lower bound can only be enforced within a tolerance using the Spath algorithm ... and once the values reach that tolerance they are floored. If lbound = None, this function behaves like rhist()

maxiter : integer

Number of times to increase p by multiplying it by pfactor before giving up on satisfying floor_eps.

pfactor : float

Factor by which to multiply individual time step p

floor_eps : float

Tolerance for lower bound violation at which the algorithm will be terminated and the bounds will be enforced by flooring.

Returns:

ynew : array-like

Array that interpolates the original data, on a curve that conserves mass and strictly observes the lower bound.

rhist_coef(x, y, y0, yn, p, q)

Routine that produces coefficients for the histospline

rhist_val(xnew, x, p, q, a, b, c)

Evaluate a histospline at new x points

tridiagonal(a, b, c, d)

a is the lower band (with leading zero) b is the center diagonal (length == nrow) c is upper band (trailing zero) d is right hand side

vtools.functions.shift module

Shift operations over a time series, original time series with shifted start_time,end_time and times is return.

shift(ts, interval)

Shift entire time series by a given interval

Parameters:

ts : TimeSeries

A regular timeseries to be shifted.

interval : time_interval

Interval of the shifting.

Returns:

shifted : TimeSeries

A new time series with shifted times.

vtools.functions.skill_metrics module

calculate_lag(a, b, time_window, max_shift, period=None, resolution=datetime.timedelta(0, 60))

Calculate lag of the second time series b, that maximizes the cross-correlation with a.

Parameters:

a,b : vtools.data.timeseries.Timeseries

Two time series to compare. The time extent of b must be the same or a superset of that of a.

time_window : time_window

Tuple representing start and end time of cross-correlation analysis

max_shift : interval

Maximum pos/negative time shift to consider in cross-correlation (ie, from -max_shift to +max_shift)

period : interval, optional

Period that will be used to further clip the time_window of analysis to fit a periodicity.

Returns:

lag : datetime.timedelta

lag

corr_coefficient(predictions, targets, *args, **kwargs)

Calculates the correlation coefficient (the ‘r’ in ‘-squared’ between two series.

For time series where the targets are serially correlated and may span only a fraction of the natural variability, this statistic may not be appropriate and Murphy (1988) explains why caution should be exercised in using this statistic.

Parameters:

predictions, targets : array_like

Time series to analyze

bias : boolean

Whether to use the biased (N) or unbiased (N-1) sample size for normalization

Returns:

r : float

Correlation coefficient

median_error(predictions, targets, *args, **kwargs)

Calculate the median error, discounting nan values

Parameters:

predictions, targets : array_like

Time series or arrays to be analyzed

Returns:

med : float

Median error

mse(predictions, targets, *args, **kwargs)

Mean squared error

Parameters:

predictions, targets : array_like

Time series or arrays to analyze

Returns:

mse : vtools.data.timeseries.TimeSeries

Mean squared error between predictions and targets

rmse(predictions, targets, *args, **kwargs)

Root mean squared error

Parameters:

predictions, targets : array_like

Time series or arrays to analyze

Returns:

mse : float

Mean squared error

skill_score(predictions, targets, *args, **kwargs)

Calculate a Nash-Sutcliffe-like skill score based on mean squared error

As per the discussion in Murphy (1988) other reference forecasts (climatology, harmonic tide, etc.) are possible.

Parameters:

predictions, targets : array_like

Time series or arrays to be analyzed

Returns:

rmse : float

Root mean squared error

tmean_error(predictions, targets, *args, **kwargs)

Calculate the (possibly trimmed) mean error, discounting nan values

Parameters:

predictions, targets : array_like

Time series or arrays to be analyzed

limits : tuple(float)

Low and high limits for trimming

inclusive : [boolean, boolean]

True if clipping is inclusive on the low/high end

Returns:

mean : float

Trimmed mean error

ts_data_arg(f)

Decorator used to ensure that functions can be used on time series or arrays

vtools.functions.ts_ufunc module

ts_accumulate(ts, ufunc)

Apply ufunc.accumulate to the data and produce a neatened time series The function will be applied cumulatively to the data. So... ts_apply(ts, add) will produce a time series, where each entry is the cumulative sum up to that index.

ts_apply(ts, ufunc, other=None)

Apply a numpy ufunc to the data and produce a neatened time series The function will be applied pointwise to the data. The argument other must be a scalar and serves as the second argument to a binary operator or function.

So... ts_apply(ts, add, 5) will add 5 to every data member

ts_max(ts)

Return the global (scalar) maximum value of ts

ts_maximum(ts, other)

Return the pointwise maximum of the ts and a scalar or another time series

ts_mean(ts)

todo: figure out time weight

ts_min(ts)

Return the global (scalar) minimum value of ts

ts_minimum(ts, other)

Return the pointwise minimum of the ts and a scalar

ts_reduce(ts, ufunc)

Reduce the data by applying ufunc.reduce and produce a neatened time series The function will be applied cumulatively to the data, producing a “total”. The output of ts_reduce is the same as the size of one element in the time series. So... ts_reduce(ts, add) will produce the sum of a time series. Some convenience functions such as ts_sum, ts_mean, ts_max, ts_min are provided that rely on this function.

ts_sum(ts, time_weight=False)

Return the sum of the time series from start to end todo: figure out time weights, figure out trapezoidal part

Module contents