deepof.post_hoc.chunk_cv_splitter
- deepof.post_hoc.chunk_cv_splitter(chunk_stats: DataFrame, bin_info: dict, n_folds: int | None = None)
Split a dataset into training and testing sets, grouped by video.
Given a matrix of features extracted per chunk, returns a list of cross-validation folds grouped by experimental video. Grouping ensures that chunks from the same experiment never appear in both the training and the testing set of a fold, so no information leaks between them.
- Parameters:
chunk_stats (pd.DataFrame) – Matrix with statistics per chunk, sorted by experiment.
bin_info (dict) – A dictionary containing start and end positions or indices of all sections for the given embeddings.
n_folds (int | None) – Number of cross-validation folds to compute. Defaults to None.
- Returns:
List containing a training and a testing set per CV fold.
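
The grouping guarantee described above can be illustrated with a minimal sketch that uses only the Python standard library. This is not deepof's implementation: `grouped_cv_folds`, `video_per_chunk`, and the video labels are hypothetical names chosen for illustration, and the round-robin assignment of videos to folds is an assumption, not the library's documented strategy.

```python
from collections import defaultdict


def grouped_cv_folds(chunk_groups, n_folds):
    """Return (train_idx, test_idx) pairs in which no group spans both sides.

    chunk_groups: one group label (for example a video ID) per chunk, in row order.
    This helper is hypothetical and only illustrates grouped splitting.
    """
    # Collect the row indices of each group, preserving first-seen order.
    rows_by_group = defaultdict(list)
    for idx, group in enumerate(chunk_groups):
        rows_by_group[group].append(idx)

    groups = list(rows_by_group)
    if not 2 <= n_folds <= len(groups):
        raise ValueError("n_folds must be between 2 and the number of groups")

    folds = []
    for fold in range(n_folds):
        # Assumption: groups are dealt round-robin to folds.
        test_groups = set(groups[fold::n_folds])
        test_idx = [i for g in groups if g in test_groups for i in rows_by_group[g]]
        train_idx = [i for g in groups if g not in test_groups for i in rows_by_group[g]]
        folds.append((train_idx, test_idx))
    return folds


# Hypothetical input: six chunks drawn from three videos.
video_per_chunk = ["vid_a", "vid_a", "vid_b", "vid_b", "vid_c", "vid_c"]
folds = grouped_cv_folds(video_per_chunk, n_folds=3)

for train_idx, test_idx in folds:
    train_videos = {video_per_chunk[i] for i in train_idx}
    test_videos = {video_per_chunk[i] for i in test_idx}
    # No video contributes chunks to both sides of a fold.
    assert train_videos.isdisjoint(test_videos)
```

Because whole videos, rather than individual chunks, are assigned to the test side, every chunk from a given video lands on the same side of each fold.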