my_code_base.stats.timeseries

Attributes

decorrelation_timescale

Classes

IntegralTimescaleResult

Functions

effective_sample_size(x, y)

Calculate the effective sample size of two time series based on the lag-1 autocorrelation.

extend_annual_series(ds)

Fill a time series of annual values so that every month of a year carries that year's annual value.

integral_timescale(data[, dt])

Calculate the integral timescale of decorrelation of a time series.

lag1_autocorrelation(x)

Calculate the lag-1 autocorrelation of a time series.

ndof_integral_timescale(data[, dt])

Calculate the number of degrees of freedom (dof) of the integral timescale of decorrelation of a time series.

ndof_lag1_autocorrelation(x, y)

Calculate the number of degrees of freedom (ndof) based on the lag-1 autocorrelation of two time series.

pd_seasonal_decompose(x[, freq])

Decompose a time series into its trend, the seasonality, and the residuals.

weighted_annual_mean(ds)

Compute the weighted annual mean of an xarray.Dataset or xarray.DataArray.

xr_autocorr(x[, dim, normalize, new_dim])

Calculate the autocorrelation of a time series.

xr_deseasonalize(da[, freq, dim])

Remove the seasonal cycle of an xr.Dataset object.

xr_seasonal_decompose(da[, dim])

Perform seasonal decomposition of a time series using the given dataset.

zero_crossings(x)

Find the zero crossings of a time series.

Module Contents

class my_code_base.stats.timeseries.IntegralTimescaleResult
dof
timescale
my_code_base.stats.timeseries.effective_sample_size(x, y)[source]

Calculate the effective sample size of two time series based on the lag-1 autocorrelation.

Parameters:
x : np.ndarray

The first time series data.

y : np.ndarray

The second time series data.

Returns:

The effective sample size of the two time series.

Return type:

float
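
The formula behind this function is not stated here. The sketch below assumes a commonly used estimate for two series, n * (1 - r_x * r_y) / (1 + r_x * r_y), where r_x and r_y are the lag-1 autocorrelations. Both function names are hypothetical and the module's actual implementation may differ.

```python
import numpy as np


def lag1_autocorr_sketch(x):
    """Lag-1 autocorrelation of a 1-D series (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    anomalies = x - x.mean()
    # Covariance of the series with itself shifted by one step,
    # normalised by the lag-0 sum of squares.
    return np.sum(anomalies[:-1] * anomalies[1:]) / np.sum(anomalies**2)


def effective_sample_size_sketch(x, y):
    """Assumed formula: n * (1 - r_x * r_y) / (1 + r_x * r_y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same length")
    r_x = lag1_autocorr_sketch(x)
    r_y = lag1_autocorr_sketch(y)
    return x.size * (1.0 - r_x * r_y) / (1.0 + r_x * r_y)


rng = np.random.default_rng(0)
# Two independent white-noise series: little serial correlation.
n_eff_white = effective_sample_size_sketch(
    rng.standard_normal(500), rng.standard_normal(500)
)
# Two random walks: strong serial correlation in both series.
n_eff_red = effective_sample_size_sketch(
    np.cumsum(rng.standard_normal(500)), np.cumsum(rng.standard_normal(500))
)
```

Under this assumed formula, strongly autocorrelated inputs yield an effective sample size far below the nominal length, while white noise stays close to it.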

my_code_base.stats.timeseries.extend_annual_series(ds)[source]

Fill a time series that contains only annual values (such as one generated by weighted_annual_mean()) so that all months are represented again, with each of the 12 months in a year set to that year's annual value.

Parameters:
ds : xarray.Dataset

The input dataset containing the time series data.

Returns:

The extended time series dataset with monthly values.

Return type:

xarray.Dataset

Raises:

AssertionError – If the dataset does not have ‘year’ as a dimension.

Example

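>>> import numpy as np
>>> import xarray as xr
>>> # Annual input: one value per year along the required 'year' dimension.
>>> ds = xr.Dataset({'value': ('year', np.random.rand(2))},
...                 coords={'year': [2000, 2001]})
>>> extended_ds = extend_annual_series(ds)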
my_code_base.stats.timeseries.integral_timescale(data, dt=1)[source]

Calculate the integral timescale of decorrelation of a time series.

Parameters:
data : np.ndarray

The input data array containing the time series. Must be NaN-free; pass pre-cleaned data to avoid destroying temporal structure.

dt : float

The time step of the data.

Returns:

The integral timescale of the time series.

Return type:

float

Raises:

ValueError – If the input data contains NaN values.

Notes

The integral timescale is calculated as the integral of the autocorrelation function (ACF) of the time series up to the first zero crossing.
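
The note above can be illustrated with a minimal NumPy sketch. The function name is hypothetical, and the choices of a biased ACF estimate, a one-sided integral, and trapezoidal integration are assumptions rather than confirmed details of the module.

```python
import numpy as np


def integral_timescale_sketch(data, dt=1.0):
    """Integrate the ACF up to its first zero crossing (illustrative only)."""
    data = np.asarray(data, dtype=float)
    if np.isnan(data).any():
        raise ValueError("data must not contain NaN values")
    anomalies = data - data.mean()
    n = anomalies.size
    # Biased ACF estimate for lags 0..n-1, normalised so that acf[0] == 1.
    acf = np.correlate(anomalies, anomalies, mode="full")[n - 1:]
    acf = acf / acf[0]
    # The first non-positive value marks the first zero crossing.
    non_positive = np.flatnonzero(acf <= 0)
    stop = non_positive[0] if non_positive.size else n
    # Trapezoidal rule over the lags before the crossing
    # (a single remaining lag gives 0).
    segment = acf[:stop]
    return dt * (segment.sum() - 0.5 * (segment[0] + segment[-1]))


t = np.arange(400)
wave = np.cos(2 * np.pi * t / 40)  # period of 40 samples
tau = integral_timescale_sketch(wave)
tau_2 = integral_timescale_sketch(wave, dt=2.0)
```

For this cosine the ACF first crosses zero near a quarter period (lag 10), so the result is close to 40 / (2π) ≈ 6.4 samples. Passing dt=2.0 scales the result by the time step.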

my_code_base.stats.timeseries.lag1_autocorrelation(x)[source]

Calculate the lag-1 autocorrelation of a time series.

my_code_base.stats.timeseries.ndof_integral_timescale(data, dt=1)[source]

Calculate the number of degrees of freedom (dof) of the integral timescale of decorrelation of a time series.

Parameters:
data : np.ndarray

The input data array containing the time series.

dt : float

The time step of the data.

Returns:

Named tuple with fields timescale (the integral timescale) and dof (the degrees of freedom).

Return type:

IntegralTimescaleResult

Notes

The effective sample size is calculated as n * dt / τ following Emery & Thomson (2004), eq. (3.15.17). It is clamped to [2, n] to ensure valid degrees of freedom.
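
As a worked illustration of the note above, the sketch below applies n * dt / τ and the clamp to [2, n]. The class and function names are hypothetical stand-ins, and the field order of the named tuple is an assumption.

```python
from typing import NamedTuple


class IntegralTimescaleResultSketch(NamedTuple):
    """Hypothetical stand-in for IntegralTimescaleResult."""

    timescale: float
    dof: float


def ndof_from_timescale_sketch(n, timescale, dt=1.0):
    """Degrees of freedom n * dt / timescale, clamped to [2, n]."""
    dof = n * dt / timescale
    dof = min(max(dof, 2.0), float(n))
    return IntegralTimescaleResultSketch(timescale=timescale, dof=dof)


# 120 monthly samples with a 6-month integral timescale: 120 * 1 / 6 = 20.
typical = ndof_from_timescale_sketch(120, 6.0)
# A timescale shorter than the time step would exceed n, so it is clamped to n.
upper = ndof_from_timescale_sketch(120, 0.5)
# A timescale longer than the record would fall below 2, so it is clamped to 2.
lower = ndof_from_timescale_sketch(10, 50.0)
```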

my_code_base.stats.timeseries.ndof_lag1_autocorrelation(x, y)[source]

Calculate the number of degrees of freedom (ndof) based on the lag-1 autocorrelation of two time series.

Parameters:
x : 1-D array-like

The first time series data.

y : 1-D array-like

The second time series data.

Returns:

dof – The number of degrees of freedom.

Return type:

float

my_code_base.stats.timeseries.pd_seasonal_decompose(x, freq=12)[source]

Decompose a time series into its trend, the seasonality, and the residuals.

Parameters:
x : pandas.Series

A pandas.Series containing a time series of data.

freq : int

The frequency of the data, e.g. 12 for monthly data.

Returns:

A pandas.DataFrame containing time series of the raw data, trend, seasonality, the detrended time series, and the residuals.

my_code_base.stats.timeseries.weighted_annual_mean(ds: xarray.Dataset | xarray.DataArray)[source]

Compute the weighted annual mean of an xarray.Dataset or xarray.DataArray.

Parameters:
ds : xarray.Dataset | xarray.DataArray

The input dataset or data array.

Returns:

The weighted annual mean of the input dataset or data array.

Return type:

xarray.DataArray

Raises:

AssertionError – If the sum of the weights in each year is not equal to 1.0.

Notes

The function computes the annual mean of the input dataset or data array, taking into account the different lengths of the months. Each month is weighted by the number of days it comprises. If the frequency of the time dimension is ‘1M’, the function applies the weights. If the frequency is ‘1D’ or higher, no weights are applied.

The function follows the approach described in the following source: https://ncar.github.io/esds/posts/2021/yearly-averages-xarray/

The function assumes that the input dataset or data array has a ‘time’ dimension.
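
The day-count weighting described in these notes can be sketched without xarray. The function below is a hypothetical plain-NumPy illustration on (year, month, value) triples, not the module's implementation, which works on a 'time' dimension.

```python
import calendar

import numpy as np


def weighted_annual_mean_sketch(years, months, values):
    """Annual means of monthly values, weighted by days per month."""
    years = np.asarray(years)
    months = np.asarray(months)
    values = np.asarray(values, dtype=float)
    # Number of days in each (year, month), leap years included.
    days = np.array(
        [calendar.monthrange(int(y), int(m))[1] for y, m in zip(years, months)],
        dtype=float,
    )
    result = {}
    for year in np.unique(years):
        in_year = years == year
        weights = days[in_year] / days[in_year].sum()
        # Mirrors the documented check: weights sum to 1.0 within each year.
        assert np.isclose(weights.sum(), 1.0)
        result[int(year)] = float(np.sum(weights * values[in_year]))
    return result


months = np.arange(1, 13)
years = np.full(12, 2001)
# Only February is non-zero, so the weighted mean is 28 / 365, not 1 / 12.
february_only = np.where(months == 2, 1.0, 0.0)
annual = weighted_annual_mean_sketch(years, months, february_only)
```

Weighting matters most when values differ between short and long months. For a constant series, the weighted and unweighted means coincide.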

my_code_base.stats.timeseries.xr_autocorr(x, dim='time', normalize=True, new_dim='lead')[source]

Calculate the autocorrelation of a time series.

Parameters:
x : xarray.DataArray

The input data array containing the time series.

dim : str, optional

The dimension along which to calculate the autocorrelation. Defaults to ‘time’.

normalize : bool, optional

Whether to normalize the autocorrelation. Defaults to True.

new_dim : str, optional

The name of the new dimension. Defaults to ‘lead’.

Returns:

The autocorrelation of the time series.

Return type:

xarray.DataArray

my_code_base.stats.timeseries.xr_deseasonalize(da, freq=12, dim='time')[source]

Remove the seasonal cycle of an xr.Dataset object. The data are first detrended, then the long-term average of each season is subtracted from that season's values. Finally, the trend is added back.

Parameters:
freq : int

The frequency of the data. Default is 12 for monthly resolution.

dim : str

The name of the time dimension.
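
The three steps in the description (detrend, subtract the seasonal means, restore the trend) can be sketched in NumPy as follows. The function name is hypothetical. The linear least-squares trend and the assumption that the series is regularly sampled and starts at the first season are illustrative choices, since the trend model is not specified here.

```python
import numpy as np


def deseasonalize_sketch(values, freq=12):
    """Remove the mean seasonal cycle while keeping the trend (illustrative)."""
    values = np.asarray(values, dtype=float)
    t = np.arange(values.size)
    # Step 1: remove a linear least-squares trend (assumed trend model).
    slope, intercept = np.polyfit(t, values, deg=1)
    trend = slope * t + intercept
    detrended = values - trend
    # Step 2: subtract the long-term mean of each season from that season.
    season = t % freq
    seasonal_means = np.array(
        [detrended[season == s].mean() for s in range(freq)]
    )
    anomalies = detrended - seasonal_means[season]
    # Step 3: add the trend back.
    return anomalies + trend


t = np.arange(120)  # ten years of monthly data
signal = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 12)
deseasonalized = deseasonalize_sketch(signal)
```

For this synthetic input the result stays close to the underlying 0.05 * t line. A small residual remains because the linear fit absorbs a little of the seasonal cycle.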

my_code_base.stats.timeseries.xr_seasonal_decompose(da, dim='time')[source]

Perform seasonal decomposition of a time series using the given dataset.

Parameters:
da : xarray.DataArray

The input data array containing the time series.

dim : str

The dimension along which the decomposition is performed. Default is ‘time’.

Returns:

A new dataset containing the decomposed components: trend, detrended, seasonality, residuals, and deseasonalized.

Return type:

xarray.Dataset

my_code_base.stats.timeseries.zero_crossings(x)[source]

Find the zero crossings of a time series.

Example

>>> x = np.array([1, 2, -1, -2, 1, 2])     # crossing at 2 -> -1 and -2 -> 1
>>> zero_crossings(x)
array([1, 3])
my_code_base.stats.timeseries.decorrelation_timescale