deepof.post_hoc.train_supervised_cluster_detectors

deepof.post_hoc.train_supervised_cluster_detectors(chunk_stats: DataFrame, hard_counts: ndarray, sampled_breaks: dict, n_folds: int | None = None, verbose: int = 1)

Train supervised models to detect clusters from kinematic features.

Parameters:
  • chunk_stats (pd.DataFrame) – table with descriptive statistics for a series of sequences (‘chunks’).

  • hard_counts (np.ndarray) – cluster assignments for the corresponding ‘chunk_stats’ table.

  • sampled_breaks (dict) – sequence length of each chunk per experiment.

  • n_folds (int) – number of folds for cross validation. If None (default) leave-one-experiment-out CV is used.

  • verbose (int) – verbosity level. Must be an integer between 0 (nothing printed) and 3 (all is printed).

Returns:

trained supervised model on the full dataset, mapping chunk stats to cluster assignments. Useful to run the SHAP explainability pipeline. cluster_gbm_performance (dict): cross-validated dictionary containing trained estimators and performance metrics. groups (list): cross-validation indices. Data from the same animal are never shared between train and test sets.

Return type:

full_cluster_clf (imblearn.pipeline.Pipeline)