deepof.post_hoc.chunk_cv_splitter

deepof.post_hoc.chunk_cv_splitter(chunk_stats: DataFrame, breaks: dict, n_folds: int | None = None)

Split a dataset into training and testing sets, grouped by video.

Given a matrix of features extracted per chunk, returns a list of cross-validation folds grouped by experimental video. Grouping ensures that chunks from the same experiment never appear in both the training and the testing set of a fold, which would leak information between them.

Parameters:
  • chunk_stats (pd.DataFrame) – matrix with statistics per chunk, sorted by experiment.

  • breaks (dict) – dictionary containing ruptures per video.

  • n_folds (int | None) – number of cross-validation folds to compute. Defaults to None.

Returns:

list containing a training and testing set per CV fold.
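
The grouping idea can be illustrated with a minimal, standard-library sketch. It is not deepof's implementation: `grouped_folds` and `chunks_per_video` are hypothetical names. The sketch assumes that chunk rows are sorted by video, and that leaving `n_folds` unset means one fold per video.

```python
def grouped_folds(chunks_per_video, n_folds=None):
    """Hypothetical helper: build (train_idx, test_idx) pairs grouped by video.

    chunks_per_video maps a video name to its number of chunks (an assumed
    stand-in for the information carried by `breaks`).
    """
    videos = list(chunks_per_video)

    # Assumption: with no fold count given, hold out one video per fold
    if n_folds is None:
        n_folds = len(videos)
    if not 2 <= n_folds <= len(videos):
        raise ValueError("n_folds must be between 2 and the number of videos")

    # Label every chunk row with the index of the video it came from
    # (rows are assumed to be sorted by video)
    groups = [
        video_idx
        for video_idx, name in enumerate(videos)
        for _ in range(chunks_per_video[name])
    ]

    folds = []
    for fold in range(n_folds):
        # Videos are assigned to folds round-robin, so a video's chunks
        # always land on the same side of the split
        test = [i for i, g in enumerate(groups) if g % n_folds == fold]
        train = [i for i, g in enumerate(groups) if g % n_folds != fold]
        folds.append((train, test))
    return folds


# Toy example: three videos with 3, 2 and 4 chunks
chunks_per_video = {"video_a": 3, "video_b": 2, "video_c": 4}
folds = grouped_folds(chunks_per_video)
```

In this toy example the first fold holds out all three chunks of `video_a` and trains on the remaining six rows, so no video contributes chunks to both sides of a split.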