toad.clustering.optimizing

Module Attributes

default_opt_params

Cluster scoring methods. Heaviside Score (score_heaviside): evaluates how closely a cluster's aggregated time series resembles a perfect step function; uses linear regression residuals compared to a theoretical Heaviside function; a score of 1 = perfect step function, 0 = linear trend.

Functions

combined_spatial_nonlinearity(td, ...[, weights])

Compute a weighted combination of spatial autocorrelation and nonlinearity scores.

toad.clustering.optimizing.combined_spatial_nonlinearity(td, cluster_variable, weights=[1, 1])

Compute a weighted combination of spatial autocorrelation and nonlinearity scores.

Parameters:
  • td – ToadDataset object containing the data

  • cluster_variable – Name of the cluster variable to evaluate

  • weights – List of two weights, applied to the spatial autocorrelation score and the nonlinearity score in that order. Defaults to [1, 1]

Returns:

Weighted sum of spatial autocorrelation and nonlinearity scores

Return type:

float
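The weighted sum described above can be sketched as follows. This is an illustration of the arithmetic only, not the library's implementation: `weighted_sum` and its two score arguments are hypothetical names, and both input scores are assumed to have been computed beforehand.

```python
def weighted_sum(spatial_score, nonlinearity_score, weights=(1, 1)):
    """Combine two precomputed scores into one float (hypothetical helper)."""
    # Assumption: the first weight applies to the spatial autocorrelation
    # score and the second to the nonlinearity score.
    w_spatial, w_nonlinear = weights
    return float(w_spatial * spatial_score + w_nonlinear * nonlinearity_score)


# Equal weights simply add the two scores; other weights shift the balance.
equal = weighted_sum(0.5, 0.25)
spatial_only = weighted_sum(0.5, 0.25, weights=(2, 0))
```

With the default weights both scores contribute equally, so their relative scales matter. If one score has a much larger typical range, it will dominate unless the weights compensate.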

toad.clustering.optimizing.default_opt_params = {'min_cluster_size': (10, 25), 'time_weight': (0.5, 1.5)}

Cluster Scoring Methods

Heaviside Score (score_heaviside):

  • Evaluates how closely a cluster’s aggregated time series resembles a perfect step function

  • Uses linear regression residuals compared to a theoretical Heaviside function

  • Score of 1 = perfect step function, 0 = linear trend
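One way to picture the Heaviside idea is the sketch below. It is not the score_heaviside implementation. It assumes the series is rescaled to [0, 1], that the reference is an ideal step placed at the midpoint of the series, and that the score is the ratio of the two linear-fit residual sizes, capped at 1. The names `linear_residual_rms` and `heaviside_like_score` are hypothetical.

```python
import numpy as np


def linear_residual_rms(y):
    """Root-mean-square residual of a least-squares straight-line fit."""
    t = np.arange(len(y), dtype=float)
    slope, intercept = np.polyfit(t, y, 1)
    return float(np.sqrt(np.mean((y - (slope * t + intercept)) ** 2)))


def heaviside_like_score(y):
    """Illustrative score: about 0 for a linear trend, 1 for a midpoint step."""
    y = np.asarray(y, dtype=float)
    span = y.max() - y.min()
    if len(y) < 3 or span == 0:
        return 0.0  # too short or constant: no step to detect
    y = (y - y.min()) / span  # assumption: rescale to [0, 1]
    # Assumption: the reference is an ideal step located at the midpoint.
    step = (np.arange(len(y)) >= len(y) // 2).astype(float)
    return min(1.0, linear_residual_rms(y) / linear_residual_rms(step))
```

Under these assumptions a step far from the midpoint scores below 1, because its residuals against a straight line are smaller than those of the midpoint reference.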

Consistency Score (score_consistency):

  • Measures internal coherence using hierarchical clustering of time series correlations

  • Converts R² correlations to distances and performs Ward linkage

  • Higher scores indicate more consistent temporal behavior within clusters

Spatial Autocorrelation Score (score_spatial_autocorrelation):

  • Computes average pairwise R² correlations between all time series in a cluster

  • Measures spatial coherence and synchronization of cluster behavior
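The average pairwise R² can be sketched with NumPy as below. This is a minimal illustration, not the score_spatial_autocorrelation implementation. It assumes the cluster's time series are stacked as rows of a 2-D array, and `mean_pairwise_r2` is a hypothetical name.

```python
import numpy as np


def mean_pairwise_r2(series):
    """Mean squared Pearson correlation over all distinct pairs of rows."""
    series = np.asarray(series, dtype=float)  # shape: (n_series, n_time)
    r = np.corrcoef(series)                   # (n_series, n_series) matrix
    upper = np.triu_indices(series.shape[0], k=1)  # each pair counted once
    return float(np.mean(r[upper] ** 2))


# Series that rise or fall in lockstep give values near 1,
# while uncorrelated series give values near 0.
in_sync = mean_pairwise_r2([[0, 1, 2, 3], [0, 2, 4, 6], [3, 2, 1, 0]])
```

Squaring the correlation treats perfectly anti-correlated series as fully coherent. The sketch needs at least two series, and a constant series would produce NaN because its correlation is undefined.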

Nonlinearity Score (score_nonlinearity):

  • Measures RMSE deviation from a linear fit

  • Optional normalization against unclustered data to identify clusters that stand out

  • Supports various spatial aggregation methods (mean, median, percentile, etc.)
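The three ingredients of the nonlinearity score (spatial aggregation, RMSE from a linear fit, optional normalization) can be sketched as follows. This is an assumption-laden illustration, not the score_nonlinearity implementation. The function names, the `background_series` argument, and the choice to normalize by dividing by the background RMSE are all hypothetical.

```python
import numpy as np


def rmse_from_linear_fit(y):
    """RMSE between a series and its least-squares straight-line fit."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    slope, intercept = np.polyfit(t, y, 1)
    return float(np.sqrt(np.mean((y - (slope * t + intercept)) ** 2)))


def nonlinearity_sketch(cluster_series, background_series=None, aggregate=np.mean):
    """Aggregate rows (one per grid cell) over space, then score the result."""
    # Spatial aggregation: collapse (n_cells, n_time) to a single time series.
    aggregated = aggregate(np.asarray(cluster_series, dtype=float), axis=0)
    score = rmse_from_linear_fit(aggregated)
    if background_series is not None:
        # Assumption: normalization means dividing by the same measure
        # computed on the unclustered (background) data.
        background = aggregate(np.asarray(background_series, dtype=float), axis=0)
        background_rmse = rmse_from_linear_fit(background)
        if background_rmse > 0:  # skip normalization for a perfectly linear background
            score /= background_rmse
    return score
```

Passing `aggregate=np.median` swaps the spatial aggregation method. With normalization, values above 1 would indicate a cluster that deviates from linearity more than the background does.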

You can also define your own objective function by passing a function that takes (td, cluster_variable) as arguments and returns a score. See combined_spatial_nonlinearity() above for an example.
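A custom objective with the (td, cluster_variable) signature might look like the sketch below. How a real ToadDataset exposes cluster labels is not shown on this page, so the example makes two loud assumptions: that `td[cluster_variable]` yields an array of integer labels, and that -1 marks unclustered points. A plain dict stands in for `td`, and `fraction_clustered` is a hypothetical name.

```python
import numpy as np


def fraction_clustered(td, cluster_variable):
    """Hypothetical objective: share of points assigned to any cluster."""
    # ASSUMPTION: item access returns integer labels, with -1 for noise.
    labels = np.asarray(td[cluster_variable])
    return float(np.mean(labels >= 0))


# Stand-in for a dataset: NOT a real ToadDataset, just a dict of labels.
fake_td = {"clusters": np.array([-1, 0, 0, 1, -1, 1, 1, 0])}
score = fraction_clustered(fake_td, "clusters")
```

The only contract taken from the text above is the signature and the returned score. Everything inside the function body is free to change.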