Formatting your data: feature extraction from motion tracking output

What we’ll cover:

Create and run a project.
Load a previously generated project.
Interact with your project: generate coordinates, distances, angles, and areas.
Exploratory visualizations: heatmaps and processed animations.

[1]:

# # If using Google colab, uncomment and run this cell and the one below to set up the environment
# # Note: because of how colab handles the installation of local packages, this cell will kill your runtime.
# # This is not an error! Just continue with the cells below.
# import os
# !git clone -q https://github.com/mlfpm/deepof.git
# !pip install -q -e deepof --progress-bar off
# os.chdir("deepof")
# !curl --output tutorial_files.zip https://datashare.mpcdf.mpg.de/s/Hu1XjZkY9zml0mm/download
# !unzip tutorial_files.zip

[2]:

# import os, warnings
# os.chdir("deepof")
# warnings.filterwarnings('ignore')

Let’s start by importing some packages. We’ll use Python’s os library to handle paths, pickle to load saved objects, pandas to load data frames, and the data entry API within DeepOF, located in deepof.data.

[3]:

import os
import pickle
import deepof.data

We’ll also need some plotting gear:

[4]:

import deepof.visuals
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

Creating and running a project

With that out of the way, the first thing to do when starting a DeepOF project is to load your videos and DeepLabCut tracking files into a deepof.data.Project object.

As depicted in the cell below, the three crucial parameters are the project, video, and table paths.

project_path specifies where the project folder containing all processing and output files will be created.
video_path points to the directory where your DLC- or SLEAP-labelled videos are stored.
Similarly, table_path should point to the directory containing all tracking files (see section on supported input files below).
Last but not least, you can give your project a name with project_name (optional; deepof_project by default).
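Before creating a project, it can help to confirm that every video has a tracking file. The sketch below is not part of DeepOF: find_unmatched_videos is a hypothetical helper, and it assumes that each tracking file name starts with the name of its video (without extension). It runs on a throwaway directory with made-up file names.

```python
import tempfile
from pathlib import Path


def find_unmatched_videos(video_path, table_path, video_format=".mp4", table_format="h5"):
    """Return stems of videos with no tracking table whose name starts with that stem."""
    videos = sorted(Path(video_path).glob(f"*{video_format}"))
    tables = [p.name for p in Path(table_path).glob(f"*.{table_format}")]
    return [v.stem for v in videos if not any(t.startswith(v.stem) for t in tables)]


# Demonstration on a temporary directory tree (file names are made up)
with tempfile.TemporaryDirectory() as root:
    videos_dir = Path(root, "Videos")
    tables_dir = Path(root, "Tables")
    videos_dir.mkdir()
    tables_dir.mkdir()
    for stem in ("session_A", "session_B"):
        (videos_dir / f"{stem}.mp4").touch()
    (tables_dir / "session_A_tracking.h5").touch()

    missing = find_unmatched_videos(videos_dir, tables_dir)
    print(missing)  # ['session_B']
```

An empty list suggests that the two directories are consistent under the naming assumption above.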

The dataset used in this tutorial is a subset of the social interaction (SI) dataset used in DeepOF’s original paper. It contains six 10-minute-long videos with two animals (a C57Bl6 and a CD1) in a round arena, tracked with DeepLabCut. Three of the C57Bl6 mice have been exposed to chronic social defeat stress (CSDS).

[5]:

my_deepof_project_raw = deepof.data.Project(
    project_path=os.path.join("tutorial_files"),
    video_path=os.path.join("tutorial_files/Videos/"),
    table_path=os.path.join("tutorial_files/Tables/"),
    project_name="deepof_tutorial_project",
    arena="circular-autodetect",
    animal_ids=["B", "W"],
    table_format="h5",
    video_format=".mp4",
    exclude_bodyparts=["Tail_1", "Tail_2", "Tail_tip"],
    video_scale="380 mm",
    iterative_imputation="partial",  # "full",
    smooth_alpha=1,
    exp_conditions=None,
    # number_of_rois=3, # another optional input argument that allows you to define up to 20 different regions of interest during project creation
)

Info! Set arena dimension to 380.0 mm!

The sampling rate of some of your videos differ. The maximum difference is 20191204_Day2_SI_JB08_Test_54: 24.99 fps and 20191204_Day2_SI_JB08_Test_63: 24.994 fps! Proceed with cauthion

As you may see, there are some extra (optional) parameters in the call above. These specify some details on how the videos and tracks will be processed. Some include:

arena: some functions within DeepOF (as you will see in the next tutorial on supervised annotations) require the program to detect the arena. The following section explains how to handle this step in more detail.
animal_ids: in case more than one animal is present in the dataset, the IDs assigned to each during tracking need to be specified as a list. For SLEAP projects, this will be detected automatically, but can be overridden.
exclude_bodyparts: while DeepOF originally relied on 14-body-part tracking models, body parts along the tail are no longer used. This step could be bypassed using smaller models following an 11-body-part scheme, such as the one presented on the landing page of the documentation. Custom labelling schemes are also supported (see tutorials).
video_scale: diameter of the arena (in mm) if circular. In the case of polygonal arenas, the length (in mm) of the first edge as specified in the GUI (see next section) should be provided.
smooth_alpha: smoothing intensity. The higher the value, the more smoothing is applied.
exp_conditions: dictionary with video IDs as keys, and data frames with all experimental conditions as values. DeepOF will use this information to enable comparisons between conditions, as we will see in the following tutorials. We’ll leave it blank for now and update it afterwards.
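To make the role of video_scale concrete, here is a minimal sketch of the pixel-to-millimeter conversion it implies. The helper mm_per_pixel and all pixel values are made up for illustration; this is not DeepOF’s internal code.

```python
import math


def mm_per_pixel(video_scale_mm, reference_length_px):
    """Millimeters represented by one pixel, given a known reference length."""
    if reference_length_px <= 0:
        raise ValueError("reference length must be positive")
    return video_scale_mm / reference_length_px


# Circular arena: the reference is the detected diameter in pixels (made-up value)
circular_factor = mm_per_pixel(380.0, 760.0)

# Polygonal arena: the reference is the first edge marked in the GUI (made-up corners)
first_corner, second_corner = (100.0, 100.0), (400.0, 500.0)
polygonal_factor = mm_per_pixel(250.0, math.dist(first_corner, second_corner))

print(circular_factor)   # 0.5
print(polygonal_factor)  # 0.5
```

Multiplying pixel coordinates by such a factor expresses them in millimeters, which is why the value passed to video_scale has to match the arena dimension that is actually detected or marked.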

For more details, feel free to visit the full API reference.

Let’s then create our first project, by running the .create() method in our newly created object:

[6]:

my_deepof_project = my_deepof_project_raw.create(force=True)

Setting up project directories...

Preprocessing tables          : 100%|██████████| 6/6 [00:02<00:00,  2.08it/s, file=20191204_D..., step=Saving data]
Detecting arenas              : 100%|██████████| 6/6 [01:29<00:00, 14.99s/arena]
Rescaling tables              : 100%|██████████| 6/6 [00:00<00:00, 556.74table/s]
Computing distances           : 100%|██████████| 6/6 [00:00<00:00,  8.14table/s]
Computing angles              : 100%|██████████| 6/6 [00:00<00:00, 49.80table/s]
Computing areas               : 100%|██████████| 6/6 [00:00<00:00,  6.91table/s]

Done!

NOTE: the force parameter allows DeepOF to overwrite an existing project in the same directory. Use with caution!

NOTE 2: the cell above can take a significant amount of time to run in Google colab. Feel free to skip and continue with the loaded results below.

As you will see, this organizes all required files into a folder in the specified path. Moreover, some processing steps are computed on the go, such as distances, angles and areas. This makes it easier for DeepOF to load all required features later on, without the need to compute them every time.
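Since force=True can replace an existing project, a small guard can be useful before calling .create(). The sketch below only uses the standard library. The helpers project_dir and safe_to_create are hypothetical, and they assume the project folder lives at project_path/project_name, as the paths used in this tutorial suggest.

```python
import os


def project_dir(project_path, project_name="deepof_project"):
    """Folder where the project is expected to live: project_path/project_name."""
    return os.path.join(project_path, project_name)


def safe_to_create(project_path, project_name="deepof_project"):
    """True when no folder exists yet, so nothing would be overwritten."""
    return not os.path.isdir(project_dir(project_path, project_name))


target = project_dir("tutorial_files", "deepof_tutorial_project")
print(target)
print(safe_to_create("tutorial_files", "deepof_tutorial_project"))
```

One possible pattern is to pass force=True only when safe_to_create returns True, or after confirming that the old results are no longer needed.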

Arena detection

As mentioned above, one of the key aspects of project creation involves setting up the arena detection mechanism, which will be used in downstream tasks, such as climbing detection and overall video scaling.

Along these lines, the package provides tools for detecting elliptical arenas specifically (as seen in this tutorial), and general polygonal shapes (such as squares, rectangles, Y-mazes, etc).

In principle, DeepOF can detect your arenas automatically by relying on SAM (Segment Anything Model), a state-of-the-art image segmentation deep learning model. By selecting arena='circular-autodetect', or arena='polygonal-autodetect', users can benefit from this approach. A folder named Arena_detection will be created in your project directory, which contains samples of the detected arenas for all videos, so you can visually inspect if DeepOF did a good job.

Moreover, manual annotation can be selected using arena='circular-manual', and arena='polygonal-manual'. In this case, you will see a window per video appear, allowing you to manually mark where the arena is with just a few clicks. All results will be stored for further processing.

Here, clicking anywhere in the video will make an orange marker appear. In the case of polygonal arenas, at least a marker per corner should be used. When dealing with circular (or elliptical) arenas, DeepOF will fit an ellipse to the marked points after a minimum of 5 clicks. The ellipse can always be refined by adding more markers. Once finished with a video, press q to save and move to the next one. If you made a mistake and would like to correct it, press d to delete the last added marker. Moreover, after you tagged at least one video, you’ll see the p option appear, which will copy the last marked arena to all remaining videos in the dataset.

NOTE: If you select arena='polygonal-autodetect', you will still be prompted with the GUI to select the arena just once, so SAM roughly knows what the segmentation it should return looks like. Detection in all remaining videos will be automatic. This is not the case for arena='circular-autodetect', as we already know we’re looking for a circle.

[Figure: arena_GUI]

One last thing is that, in polygonal arenas, the first edge you input will be used to scale the coordinates to the proper distances (pixels to millimeters). Make sure you always mark the same edge first, and that it coincides with the length passed to the “video_scale” parameter when creating the project. In saved images of automatically detected polygons, you’ll see orange circles marking two arena corners. These should match the first edge you selected when prompting SAM, ensuring data is properly scaled.

If after all this work you make a mistake while labelling, or see that automatic detection failed in some cases, don’t worry! You can always edit manually annotated arenas (or rerun automatic annotation) for specific videos using the .edit_arenas() method. Just pass a list with the IDs of the videos you would like to relabel, and the GUI will pop up once again. The same arena options described above can be passed to the arena_type parameter.

NOTE: If you don’t pass any video IDs, all videos will be selected by default.

[7]:

my_deepof_project.edit_arenas(
    video_keys=['20191204_Day2_SI_JB08_Test_54', '20191204_Day2_SI_JB08_Test_62'],
    arena_type="circular-autodetect",
)

Editing 2 arenas

Detecting arenas              : 100%|██████████| 2/2 [00:02<00:00,  1.16s/arena]

Done!

Supported input tracking files

DeepOF currently supports input from both DeepLabCut and SLEAP. While DeepOF will try to detect automatically which file type you’re trying to use, this can be forced with the table_format parameter in the deepof.data.Project() call depicted in the previous section.

For DeepLabCut, we support:

Single and multi-animal project CSV files (indicated simply with table_format='csv').
Single and multi-animal project h5 files (indicated simply with table_format='h5').

For SLEAP, you can use:

SLP files (indicated with table_format='slp').
Raw .npy files (indicated with table_format='npy').
h5 files (indicated with table_format='analysis.h5').

Downstream processing should be identical regardless of the files selected, as DeepOF internally brings all these inputs into an equivalent representation. Let’s continue!
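If you prefer to set table_format explicitly rather than rely on automatic detection, the format strings listed above can be looked up from the file name. The helper below is a hypothetical sketch, not a DeepOF function. Its source argument is an assumption, introduced because the .h5 extension alone cannot distinguish DeepLabCut from SLEAP exports.

```python
from pathlib import Path

# Hypothetical lookup from file suffix to the table_format values listed above
SUFFIX_TO_FORMAT = {".csv": "csv", ".slp": "slp", ".npy": "npy"}


def guess_table_format(filename, source="deeplabcut"):
    """Pick a table_format string for a tracking file (illustrative only)."""
    name = Path(filename).name.lower()
    if name.endswith(".h5"):
        # .h5 is ambiguous: both DeepLabCut and SLEAP can export it
        return "analysis.h5" if source == "sleap" else "h5"
    suffix = Path(name).suffix
    if suffix not in SUFFIX_TO_FORMAT:
        raise ValueError(f"unsupported tracking file: {filename}")
    return SUFFIX_TO_FORMAT[suffix]


print(guess_table_format("session_A_tracking.h5"))            # h5
print(guess_table_format("session_A.analysis.h5", "sleap"))   # analysis.h5
print(guess_table_format("session_A.slp"))                    # slp
```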

Loading a previously generated project

Once you have run your project at least once, it can be loaded without much effort from the path you specified in the first place, plus the project name (deepof_project, unless you specified otherwise).

[8]:

# Load a previously saved project
my_deepof_project = deepof.data.load_project("./tutorial_files/deepof_tutorial_project/")

Extend a previously generated project

If you’d like to add data (videos and tracks) to a previously processed project, you can use the extend method instead of create. Just pass the path to your previously processed project, and DeepOF will take care of merging the two. A new directory will be created with all files corresponding to the merged projects.

NOTE: Your old files will not be deleted.

[9]:

# Extend a previously saved project
my_deepof_project = my_deepof_project_raw.extend(
    "./tutorial_files/deepof_tutorial_project/",
)

Loading previous project...
Processing data from 0 experiments...

Interacting with your project: generating coordinates, distances, angles, and areas

That’s it for basic project creation. We now have a DeepOF project up and running! Before ending the tutorial, however, let’s explore the object that the commands above produced.

For starters, if we print it, we see it’s a DeepOF analysis of 6 videos. Furthermore, the object belongs to a custom class within DeepOF, called Coordinates. This class allows the package to store all sorts of relevant information required for further processing, as we’ll see below:

[10]:

print(my_deepof_project)
print(type(my_deepof_project))

deepof analysis of 6 videos
<class 'deepof.data.Coordinates'>

As described before, the .create() method runs most of the heavy preprocessing already, which allows us to extract features including coordinates, distances, angles, and areas. Let’s see how that works!

[Figure: preprocessing]

With the .get_coords() method, for example, we can obtain the processed (smoothed and imputed) tracks for all videos in a dictionary. The returned objects are called table dictionaries (TableDict is the name of the class). They follow a dictionary-like structure, where each value is a data frame. They also provide a plethora of extra methods, some of which we’ll cover in these tutorials. Let’s retrieve these for one of the videos:

[11]:

my_deepof_project.get_coords(polar=False, center="Center", align="Spine_1")['20191204_Day2_SI_JB08_Test_54']

[11]:

	B_Spine_1		B_Center		B_Left_bhip		B_Left_ear		B_Left_fhip		...	W_Right_bhip		W_Right_ear		W_Right_fhip		W_Spine_2		W_Tail_base
	x	y	x	y	x	y	x	y	x	y	...	x	y	x	y	x	y	x	y	x	y
00:00:00	0.0	13.325157	0.0	0.0	9.299380	-10.043816	0.078640	28.362360	11.113320	11.260838	...	-12.599950	-11.240405	-10.078637	43.152743	-11.573392	10.377854	0.642794	-17.147886	1.430863	-33.725658
00:00:00.040002666	0.0	13.325157	0.0	0.0	9.299380	-10.043816	0.078640	28.362360	11.113320	11.260838	...	-12.599950	-11.240405	-10.078637	43.152743	-11.573392	10.377854	0.642794	-17.147886	1.430863	-33.725658
00:00:00.080005333	0.0	13.325157	0.0	0.0	9.299380	-10.043816	0.078640	28.362360	11.113320	11.260838	...	-12.599950	-11.240405	-10.078637	43.152743	-11.573392	10.377854	0.642794	-17.147886	1.430863	-33.725658
00:00:00.120008	0.0	12.329721	0.0	0.0	8.271627	-11.473263	-0.290829	28.613626	12.340172	9.265053	...	-11.723291	-13.076021	-9.691817	42.347617	-11.806335	10.703766	1.061070	-17.959261	2.281181	-34.389720
00:00:00.160010667	0.0	12.184032	0.0	0.0	8.779927	-12.331783	0.698089	27.811298	12.141177	8.484510	...	-11.366711	-14.721331	-16.173151	40.094898	-12.072984	9.174599	1.584101	-18.188027	1.854913	-35.989727
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
00:09:59.799986665	0.0	14.084511	0.0	0.0	12.934637	-9.662336	8.634394	28.686438	11.999912	9.135109	...	-16.865593	-5.559382	-14.019423	25.202356	-9.893384	8.524650	-5.093286	-12.389494	-14.948009	-21.649407
00:09:59.839989332	0.0	12.209807	0.0	0.0	12.787276	-10.919776	5.867447	31.792569	12.005965	7.366316	...	-16.958799	-5.706550	-14.031198	25.126744	-9.933026	8.448160	-5.039922	-12.542700	-14.891456	-21.791515
00:09:59.879991999	0.0	15.747280	0.0	0.0	10.409651	-9.370631	6.176041	34.226407	12.033207	9.874782	...	-16.943784	-5.714167	-14.067525	25.075804	-9.916492	8.418803	-5.006992	-12.561188	-14.852969	-21.841110
00:09:59.919994666	0.0	15.747280	0.0	0.0	10.409651	-9.370631	6.176041	34.226407	12.033207	9.874782	...	-16.943784	-5.714167	-14.067525	25.075804	-9.916492	8.418803	-5.006992	-12.561188	-14.852969	-21.841110
00:09:59.959997333	0.0	15.747280	0.0	0.0	10.409651	-9.370631	6.176041	34.226407	12.033207	9.874782	...	-16.943784	-5.714167	-14.067525	25.075804	-9.916492	8.418803	-5.006992	-12.561188	-14.852969	-21.841110

14999 rows × 44 columns

Note that there are a few parameters you can pass to the .get_coords() method. If “polar” is set to True, polar coordinates will be used instead of Cartesian ones. Both “center” and “align” control how translational and rotational variance are removed from the data: the former will use the specified body part as [0, 0] (or the center of the arena, if set to “arena”). The latter will rotate the mice at each time point, to align the line connecting the body parts specified in the “center” and “align” parameters with the y-axis.
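The effect of “center” and “align” can be sketched with plain trigonometry. The function below is an illustration under stated assumptions, not DeepOF’s implementation: it works on a single frame stored as a dictionary of made-up (x, y) points, and it assumes the aligned body part ends up on the positive y-axis. That assumption is consistent with the table above, where B_Center is at (0, 0) and B_Spine_1 has x = 0 and a positive y.

```python
import math


def center_and_align(points, center, align):
    """Translate so `center` is the origin, then rotate `align` onto the positive y-axis."""
    cx, cy = points[center]
    shifted = {name: (x - cx, y - cy) for name, (x, y) in points.items()}

    ax, ay = shifted[align]
    phi = math.pi / 2 - math.atan2(ay, ax)  # rotation that sends `align` to the y-axis
    cos_phi, sin_phi = math.cos(phi), math.sin(phi)

    return {
        name: (x * cos_phi - y * sin_phi, x * sin_phi + y * cos_phi)
        for name, (x, y) in shifted.items()
    }


# Made-up pixel coordinates for three body parts lying on one line
frame = {"Center": (3.0, 4.0), "Spine_1": (6.0, 8.0), "Nose": (9.0, 12.0)}
aligned = center_and_align(frame, center="Center", align="Spine_1")
for name, (x, y) in aligned.items():
    print(name, round(x, 6), round(y, 6))
```

In this toy frame, Center maps to the origin, while Spine_1 and Nose land on the y-axis at heights of about 5 and 10. Distances between body parts are unchanged, since only a translation and a rotation are applied.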

Furthermore, not only processed coordinates can be retrieved, but also distances, angles, and areas, with the .get_distances(), .get_angles(), and .get_areas() methods, respectively.

[12]:

my_deepof_project.get_distances()['20191204_Day2_SI_JB08_Test_54']

[12]:

	(W_Right_ear, W_Spine_1)	(B_Center, B_Spine_2)	(W_Center, W_Right_fhip)	(W_Left_bhip, W_Spine_2)	(B_Right_bhip, B_Spine_2)	(W_Right_bhip, W_Spine_2)	(W_Left_ear, W_Nose)	(B_Center, B_Left_fhip)	(B_Left_ear, B_Spine_1)	(B_Nose, B_Right_ear)	...	(B_Nose, W_Nose)	(W_Center, W_Spine_2)	(W_Center, W_Spine_1)	(B_Left_bhip, B_Spine_2)	(B_Nose, W_Tail_base)	(B_Left_ear, B_Nose)	(B_Center, B_Spine_1)	(B_Right_ear, B_Spine_1)	(W_Nose, W_Right_ear)	(W_Center, W_Left_fhip)
00:00:00	25.324534	13.836819	15.544879	14.450448	11.412604	14.500642	22.222081	15.821263	15.037409	15.057739	...	202.543916	17.159929	19.920155	14.600725	122.652715	15.170325	13.325157	14.429184	24.791305	20.121679
00:00:00.040002666	25.324534	13.836819	15.544879	14.450448	11.412604	14.500642	22.222081	15.821263	15.037409	15.057739	...	202.543916	17.159929	19.920155	14.600725	122.652715	15.170325	13.325157	14.429184	24.791305	20.121679
00:00:00.080005333	25.324534	13.836819	15.544879	14.450448	11.412604	14.500642	22.222081	15.821263	15.037409	15.057739	...	202.543916	17.159929	19.920155	14.600725	122.652715	15.170325	13.325157	14.429184	24.791305	20.121679
00:00:00.120008	24.165120	14.114840	15.936128	15.091938	11.977952	13.685244	20.982982	15.431172	16.286502	15.112614	...	201.439640	17.990579	20.211183	14.741844	122.474901	16.038285	12.329721	15.581831	24.705419	19.739289
00:00:00.160010667	25.009417	13.728727	15.163450	15.075836	12.857731	13.406770	21.167894	14.811991	15.642850	16.235095	...	200.696461	18.256880	21.018731	14.977542	121.775639	18.028338	12.184032	15.989769	22.790682	20.680558
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
00:09:59.799986665	17.263132	16.876614	13.059429	15.891442	13.647532	13.610203	23.477525	15.081383	16.963757	22.674734	...	148.312575	13.395564	15.129050	13.789807	191.389073	20.974630	14.084511	16.974595	19.277812	15.207663
00:09:59.839989332	17.257318	19.611009	13.039801	15.901995	13.324806	13.740181	23.502572	14.085660	20.442884	22.789394	...	144.670507	13.517401	15.079828	13.468828	186.729658	16.403314	12.209807	17.378500	19.322534	15.197649
00:09:59.879991999	17.240931	17.657619	13.008192	15.937232	13.673161	13.761130	23.463719	15.566290	19.483881	21.730108	...	146.851049	13.522330	15.108134	13.742712	186.008433	18.592350	15.747280	17.947809	19.336323	15.241505
00:09:59.919994666	17.240931	17.657619	13.008192	15.937232	13.673161	13.761130	23.463719	15.566290	19.483881	21.730108	...	146.851049	13.522330	15.108134	13.742712	186.008433	18.592350	15.747280	17.947809	19.336323	15.241505
00:09:59.959997333	17.240931	17.657619	13.008192	15.937232	13.673161	13.761130	23.463719	15.566290	19.483881	21.730108	...	146.851049	13.522330	15.108134	13.742712	186.008433	18.592350	15.747280	17.947809	19.336323	15.241505

14999 rows × 26 columns

[13]:

my_deepof_project.get_angles()['20191204_Day2_SI_JB08_Test_54']

[13]:

	(B_Right_fhip, B_Center, B_Spine_2)	(B_Right_fhip, B_Center, B_Spine_1)	(B_Right_fhip, B_Center, B_Left_fhip)	(B_Spine_2, B_Center, B_Spine_1)	(B_Spine_2, B_Center, B_Left_fhip)	(B_Spine_1, B_Center, B_Left_fhip)	(B_Spine_1, B_Left_ear, B_Nose)	(B_Left_bhip, B_Spine_2, B_Center)	(B_Left_bhip, B_Spine_2, B_Right_bhip)	(B_Left_bhip, B_Spine_2, B_Tail_base)	...	(W_Left_bhip, W_Spine_2, W_Right_bhip)	(W_Left_bhip, W_Spine_2, W_Tail_base)	(W_Center, W_Spine_2, W_Right_bhip)	(W_Center, W_Spine_2, W_Tail_base)	(W_Right_bhip, W_Spine_2, W_Tail_base)	(W_Center, W_Spine_1, W_Left_ear)	(W_Center, W_Spine_1, W_Right_ear)	(W_Left_ear, W_Spine_1, W_Right_ear)	(W_Spine_1, W_Right_ear, W_Nose)	(W_Left_ear, W_Nose, W_Right_ear)
00:00:00	1.832922	0.937362	1.716167	2.770285	2.734096	0.778805	2.020075	1.003026	2.381443	2.334903	...	2.058565	2.186722	1.113729	3.131558	2.037898	2.710147	2.732280	0.840758	2.234005	0.953038
00:00:00.040002666	1.832922	0.937362	1.716167	2.770285	2.734096	0.778805	2.020075	1.003026	2.381443	2.334903	...	2.058565	2.186722	1.113729	3.131558	2.037898	2.710147	2.732280	0.840758	2.234005	0.953038
00:00:00.080005333	1.832922	0.937362	1.716167	2.770285	2.734096	0.778805	2.020075	1.003026	2.381443	2.334903	...	2.058565	2.186722	1.113729	3.131558	2.037898	2.710147	2.732280	0.840758	2.234005	0.953038
00:00:00.120008	1.679362	0.989265	1.916045	2.668627	2.687778	0.926781	1.965510	1.023683	2.488609	2.435508	...	2.071008	2.202391	1.146916	3.126483	2.009787	2.728616	2.728912	0.825657	2.292104	0.982455
00:00:00.160010667	1.718482	0.954757	1.915619	2.673239	2.649084	0.960862	2.034714	1.107882	2.542842	2.416358	...	2.072768	2.362859	1.222370	3.069928	1.847558	2.963804	2.438366	0.881016	2.173166	1.044266
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
00:09:59.799986665	2.310751	0.899521	1.819645	3.072914	2.152789	0.920125	2.319123	1.092302	2.112860	1.856711	...	2.630330	2.372843	1.435111	2.715123	1.280012	2.958156	2.193847	1.131182	2.379175	0.923178
00:09:59.839989332	2.217032	1.046089	2.066558	3.020065	1.999595	1.020470	2.551768	1.004903	1.990132	1.987859	...	2.624470	2.384113	1.432123	2.706725	1.274602	2.961729	2.192207	1.129249	2.382483	0.921390
00:09:59.879991999	2.322632	0.787236	1.670839	3.109868	2.289714	0.883604	2.375213	0.892613	1.857147	2.173421	...	2.619428	2.387161	1.429316	2.705913	1.276597	2.965252	2.187243	1.130690	2.379106	0.923823
00:09:59.919994666	2.322632	0.787236	1.670839	3.109868	2.289714	0.883604	2.375213	0.892613	1.857147	2.173421	...	2.619428	2.387161	1.429316	2.705913	1.276597	2.965252	2.187243	1.130690	2.379106	0.923823
00:09:59.959997333	2.322632	0.787236	1.670839	3.109868	2.289714	0.883604	2.375213	0.892613	1.857147	2.173421	...	2.619428	2.387161	1.429316	2.705913	1.276597	2.965252	2.187243	1.130690	2.379106	0.923823

14999 rows × 36 columns

[14]:

my_deepof_project.get_areas()['20191204_Day2_SI_JB08_Test_54']

[14]:

	B_head_area	B_torso_area	B_back_area	B_full_area	W_head_area	W_torso_area	W_back_area	W_full_area
00:00:00	186.379602	286.877920	322.346263	1031.762726	403.882917	432.800448	457.418159	1759.904145
00:00:00.040002666	186.379602	286.877920	322.346263	1031.762726	403.882917	432.800448	457.418159	1759.904145
00:00:00.080005333	186.379602	286.877920	322.346263	1031.762726	403.882917	432.800448	457.418159	1759.904145
00:00:00.120008	220.588695	301.609605	325.358041	1099.050813	436.529889	458.405810	465.829889	1788.957183
00:00:00.160010667	222.261708	299.870852	335.142348	1120.455237	386.996017	468.328898	464.115153	1794.118011
...	...	...	...	...	...	...	...	...
00:09:59.799986665	239.351739	336.178202	369.570132	1236.107480	338.985789	306.291533	398.589194	1296.123235
00:09:59.839989332	225.022634	337.582968	356.664936	1202.726101	339.620045	307.614991	401.612088	1299.472641
00:09:59.879991999	254.932258	347.361788	366.539204	1240.352799	339.318555	307.496487	402.713951	1300.378274
00:09:59.919994666	254.932258	347.361788	366.539204	1240.352799	339.318555	307.496487	402.713951	1300.378274
00:09:59.959997333	254.932258	347.361788	366.539204	1240.352799	339.318555	307.496487	402.713951	1300.378274

14999 rows × 8 columns
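For intuition, the three feature types retrieved above correspond to elementary geometry on the tracked points. The sketch below is written under assumptions and does not reproduce DeepOF’s code: it assumes that an angle labelled with three body parts is measured at the middle one, and that areas are those of polygons spanned by body parts, computed here with the shoelace formula. The angle values in the output above fall between 0 and π, which is consistent with radians.

```python
import math


def distance(p, q):
    """Euclidean distance between two points."""
    return math.dist(p, q)


def angle(a, b, c):
    """Angle in radians at vertex b, between the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return abs(math.atan2(cross, dot))


def polygon_area(vertices):
    """Area of a simple polygon (list of vertices in order) via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


print(distance((0, 0), (3, 4)))                       # 5.0
print(angle((1, 0), (0, 0), (0, 1)))                  # approximately pi / 2
print(polygon_area([(0, 0), (1, 0), (1, 1), (0, 1)])) # 1.0
```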

Last but not least, features can be merged using the .merge() method, which can yield combinations of features if needed. For example, the code in the following cell creates an object with both coordinates and areas per time point:

[15]:

my_deepof_project.get_coords().merge(my_deepof_project.get_areas())['20191204_Day2_SI_JB08_Test_54']

[15]:

	(B_Center, x)	(B_Center, y)	(B_Left_bhip, x)	(B_Left_bhip, y)	(B_Left_ear, x)	(B_Left_ear, y)	(B_Left_fhip, x)	(B_Left_fhip, y)	(B_Nose, x)	(B_Nose, y)	...	(W_Tail_base, x)	(W_Tail_base, y)	B_head_area	B_torso_area	B_back_area	B_full_area	W_head_area	W_torso_area	W_back_area	W_full_area
00:00:00	178.695067	150.903525	178.133481	164.579831	160.236194	129.369839	162.932410	149.543005	166.288734	115.459213	...	284.276208	81.953327	186.379602	286.877920	322.346263	1031.762726	403.882917	432.800448	457.418159	1759.904145
00:00:00.040002666	178.695067	150.903525	178.133481	164.579831	160.236194	129.369839	162.932410	149.543005	166.288734	115.459213	...	284.276208	81.953327	186.379602	286.877920	322.346263	1031.762726	403.882917	432.800448	457.418159	1759.904145
00:00:00.080005333	178.695067	150.903525	178.133481	164.579831	160.236194	129.369839	162.932410	149.543005	166.288734	115.459213	...	284.276208	81.953327	186.379602	286.877920	322.346263	1031.762726	403.882917	432.800448	457.418159	1759.904145
00:00:00.120008	177.009679	149.650527	175.106624	163.666017	163.807835	124.262824	161.761899	147.278549	174.192032	112.040097	...	293.070473	82.577959	220.588695	301.609605	325.358041	1099.050813	436.529889	458.405810	465.829889	1788.957183
00:00:00.160010667	177.767518	147.833830	173.779126	162.437004	167.520924	121.969512	163.445137	144.056996	179.274246	108.299092	...	299.036433	86.246318	222.261708	299.870852	335.142348	1120.455237	386.996017	468.328898	464.115153	1794.118011
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
00:09:59.799986665	87.802400	245.478268	103.222546	240.694477	86.352888	275.400892	96.059204	258.098620	74.008872	292.358497	...	217.058911	419.505907	239.351739	336.178202	369.570132	1236.107480	338.985789	306.291533	398.589194	1296.123235
00:09:59.839989332	87.996004	251.604046	103.329278	244.701389	84.727063	283.767821	97.459237	262.037312	75.703366	297.466054	...	217.025629	419.515154	225.022634	337.582968	356.664936	1202.726101	339.620045	307.614991	401.612088	1299.472641
00:09:59.879991999	88.189609	257.729824	101.048434	252.178232	83.101238	292.134750	96.435058	270.932928	70.169244	305.492857	...	217.112048	419.540870	254.932258	347.361788	366.539204	1240.352799	339.318555	307.496487	402.713951	1300.378274
00:09:59.919994666	88.189609	257.729824	101.048434	252.178232	83.101238	292.134750	96.435058	270.932928	70.169244	305.492857	...	217.112048	419.540870	254.932258	347.361788	366.539204	1240.352799	339.318555	307.496487	402.713951	1300.378274
00:09:59.959997333	88.189609	257.729824	101.048434	252.178232	83.101238	292.134750	96.435058	270.932928	70.169244	305.492857	...	217.112048	419.540870	254.932258	347.361788	366.539204	1240.352799	339.318555	307.496487	402.713951	1300.378274

14999 rows × 52 columns
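Conceptually, merging places the columns of several tables side by side for each video, which is why the merged table above has 52 columns (44 coordinates plus 8 areas). A minimal sketch with plain dictionaries as stand-ins for data frames follows. The function merge_tables and the toy data are hypothetical and only illustrate the idea.

```python
def merge_tables(*table_dicts):
    """Column-wise merge of per-video tables that share the same video IDs."""
    shared_ids = set(table_dicts[0]).intersection(*table_dicts[1:])
    merged = {}
    for video_id in sorted(shared_ids):
        combined = {}
        for tables in table_dicts:
            combined.update(tables[video_id])  # later tables add their columns
        merged[video_id] = combined
    return merged


# Toy stand-ins: each table maps column name -> list of per-frame values
toy_coords = {"video_1": {"B_Nose_x": [1.0, 2.0], "B_Nose_y": [3.0, 4.0]}}
toy_areas = {"video_1": {"B_head_area": [10.0, 11.0]}}

merged = merge_tables(toy_coords, toy_areas)
print(list(merged["video_1"]))  # ['B_Nose_x', 'B_Nose_y', 'B_head_area']
```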

Loading experimental conditions

So far, DeepOF does not know to which condition each animal belongs. This can be either set up when creating the project (as described above) or specified afterward using the .load_exp_conditions() method. We just need to pass the path to a CSV file containing all conditions per animal as extra columns. The only hard requirement is that the first column should have the experiment IDs.

Here is an example:

[16]:

pd.read_csv("./tutorial_files/tutorial_exp_conditions.csv", index_col=0)

[16]:

	experiment_id	CSDS
0	20191204_Day2_SI_JB08_Test_54	Nonstressed
1	20191204_Day2_SI_JB08_Test_56	Stressed
2	20191204_Day2_SI_JB08_Test_61	Stressed
3	20191204_Day2_SI_JB08_Test_62	Stressed
4	20191204_Day2_SI_JB08_Test_63	Nonstressed
5	20191204_Day2_SI_JB08_Test_64	Nonstressed

Great! Now that we understand how the CSV should be formatted, let’s then load it onto our project:

[17]:

my_deepof_project.load_exp_conditions("./tutorial_files/tutorial_exp_conditions.csv")

And we’re done! Let’s explore what’s in there with the .get_exp_conditions property:

[18]:

print(my_deepof_project.get_exp_conditions)

{'20191204_Day2_SI_JB08_Test_54':           CSDS
0  Nonstressed, '20191204_Day2_SI_JB08_Test_56':        CSDS
1  Stressed, '20191204_Day2_SI_JB08_Test_61':        CSDS
2  Stressed, '20191204_Day2_SI_JB08_Test_62':        CSDS
3  Stressed, '20191204_Day2_SI_JB08_Test_63':           CSDS
4  Nonstressed, '20191204_Day2_SI_JB08_Test_64':           CSDS
5  Nonstressed}

We can see that the property retrieves a dictionary with all experiment IDs as keys, and data frames with conditions as values. Although in this case we only have the CSDS condition, which can take two values (“Nonstressed” and “Stressed”, with three animals each), adding more just requires us to add extra columns to the CSV file.
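To illustrate how extra columns translate into extra conditions, here is a sketch that parses a small conditions file with the standard library. The “Batch” column and its values are made up, the layout is simplified to start directly with experiment_id, and read_conditions is a hypothetical helper rather than DeepOF’s loader.

```python
import csv
import io

# Inline stand-in for a conditions file; the "Batch" column is made up for illustration
csv_text = """experiment_id,CSDS,Batch
20191204_Day2_SI_JB08_Test_54,Nonstressed,1
20191204_Day2_SI_JB08_Test_56,Stressed,1
20191204_Day2_SI_JB08_Test_61,Stressed,2
"""


def read_conditions(handle, id_column="experiment_id"):
    """Map each experiment ID to a dict of its condition columns."""
    conditions = {}
    for row in csv.DictReader(handle):
        row = dict(row)
        experiment_id = row.pop(id_column)
        conditions[experiment_id] = row
    return conditions


conditions = read_conditions(io.StringIO(csv_text))
print(conditions["20191204_Day2_SI_JB08_Test_56"])  # {'CSDS': 'Stressed', 'Batch': '1'}
```

Each additional column simply becomes another key in the per-experiment dictionary, mirroring the one-data-frame-per-experiment structure shown above.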

Filtering DeepOF objects

Now that experimental conditions have been added, we’ll explore some filtering tools that DeepOF provides for table dictionary objects.

For starters, imagine you want to subset your data to only contain stressed animals. You can do that with the .filter_condition() method, which takes a dictionary as input with the experimental condition to filter on as key, and the value you’d like to keep as value:

[19]:

# Let's use coords as an example
coords = my_deepof_project.get_coords()
print("The original dataset has {} videos".format(len(coords)))

# Let's keep only those experiments where the subject is stressed:
coords = coords.filter_condition({"CSDS": "Stressed"})
print("The filtered dataset has only {} videos".format(len(coords)))

The original dataset has 6 videos
The filtered dataset has only 3 videos

We can also filter specific experiments with the .filter_videos() method, which takes a list of experiment IDs as input:

[20]:

single_video_coords = coords.filter_videos(['20191204_Day2_SI_JB08_Test_56'])
print("The new filtered dataset has only {} video".format(len(single_video_coords)))

The new filtered dataset has only 1 video
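The logic behind these two filters can be sketched with plain dictionaries. The functions below are hypothetical stand-ins, not DeepOF methods. The condition values are copied from the condition table shown earlier, and the tables themselves are replaced by placeholder strings.

```python
# Conditions per experiment, copied from the condition table shown earlier
exp_conditions = {
    "20191204_Day2_SI_JB08_Test_54": {"CSDS": "Nonstressed"},
    "20191204_Day2_SI_JB08_Test_56": {"CSDS": "Stressed"},
    "20191204_Day2_SI_JB08_Test_61": {"CSDS": "Stressed"},
    "20191204_Day2_SI_JB08_Test_62": {"CSDS": "Stressed"},
    "20191204_Day2_SI_JB08_Test_63": {"CSDS": "Nonstressed"},
    "20191204_Day2_SI_JB08_Test_64": {"CSDS": "Nonstressed"},
}
# Placeholder tables: a real table dictionary would hold data frames here
tables = {video_id: f"table for {video_id}" for video_id in exp_conditions}


def filter_by_condition(tables, conditions, wanted):
    """Keep entries whose conditions match every key/value pair in `wanted`."""
    return {
        video_id: table
        for video_id, table in tables.items()
        if all(conditions[video_id].get(k) == v for k, v in wanted.items())
    }


def filter_by_videos(tables, video_ids):
    """Keep only the listed video IDs."""
    return {video_id: tables[video_id] for video_id in video_ids if video_id in tables}


stressed = filter_by_condition(tables, exp_conditions, {"CSDS": "Stressed"})
single = filter_by_videos(stressed, ["20191204_Day2_SI_JB08_Test_56"])
print(len(tables), len(stressed), len(single))  # 6 3 1
```

The counts match the outputs above: six videos in total, three stressed, and one after selecting a single experiment.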

Last but not least, we can also keep all videos, but filter certain animals whose coordinates we’d like to keep for further analysis. As seen above, the dataset used in this tutorial contains two animals per video: a C57Bl6 (labelled “B”) and a CD1 (labelled “W”). Let’s see how we can keep only the C57Bl6 with the .filter_id() method.

Let’s first point out that, before filtering, a given experiment in the coords object has 44 features (22 from each animal).

[21]:

coords['20191204_Day2_SI_JB08_Test_56']

[21]:

bodyparts	B_Center		B_Left_bhip		B_Left_ear		B_Left_fhip		B_Nose		...	W_Right_ear		W_Right_fhip		W_Spine_1		W_Spine_2		W_Tail_base
coords	x	y	x	y	x	y	x	y	x	y	...	x	y	x	y	x	y	x	y	x	y
00:00:00	431.582129	199.886157	443.317284	191.049059	422.038894	239.280726	444.320813	211.530190	398.565403	241.218546	...	137.956781	170.942092	128.741102	195.823723	121.748451	184.261002	104.130758	218.744892	96.072895	236.658017
00:00:00.040005334	431.582129	199.886157	443.317284	191.049059	422.038894	239.280726	444.320813	211.530190	398.565403	241.218546	...	137.956781	170.942092	128.741102	195.823723	121.748451	184.261002	104.130758	218.744892	96.072895	236.658017
00:00:00.080010668	431.582129	199.886157	443.317284	191.049059	422.038894	239.280726	444.320813	211.530190	398.565403	241.218546	...	137.956781	170.942092	128.741102	195.823723	121.748451	184.261002	104.130758	218.744892	96.072895	236.658017
00:00:00.120016002	431.571022	200.002112	443.335878	190.993253	422.030086	239.327994	444.329816	211.562289	398.595961	241.294169	...	141.714985	162.880174	131.772851	188.675886	125.782873	175.795353	108.154709	209.392022	99.840060	225.096960
00:00:00.160021336	431.597640	199.745468	443.281956	190.945019	421.874045	239.267099	444.379187	211.410798	398.593955	241.218815	...	148.078345	155.219662	134.899416	181.886177	129.444463	167.556756	108.951992	202.824619	100.722245	218.990917
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
00:09:59.799973329	334.474567	96.498638	348.224364	97.645028	326.814349	119.492518	337.246908	106.565620	323.700053	135.141625	...	367.947062	111.191081	383.559105	122.628233	372.673883	129.075473	394.586151	148.768923	406.681880	159.255856
00:09:59.839978663	335.718348	97.174041	350.550550	100.151544	327.655304	119.156520	339.572653	108.236325	323.334586	134.646345	...	368.062123	112.112038	384.973044	122.831615	373.401832	129.562705	396.308258	149.158061	408.450203	159.776019
00:09:59.879983997	334.901394	96.142181	350.010325	99.189031	326.892150	117.030927	338.705496	107.086259	324.202601	130.295730	...	368.381594	113.908447	386.581337	124.784114	375.312471	131.947180	398.053729	150.564147	409.254631	160.590819
00:09:59.919989331	334.901394	96.142181	350.010325	99.189031	326.892150	117.030927	338.705496	107.086259	324.202601	130.295730	...	368.381594	113.908447	386.581337	124.784114	375.312471	131.947180	398.053729	150.564147	409.254631	160.590819
00:09:59.959994665	334.901394	96.142181	350.010325	99.189031	326.892150	117.030927	338.705496	107.086259	324.202601	130.295730	...	368.381594	113.908447	386.581337	124.784114	375.312471	131.947180	398.053729	150.564147	409.254631	160.590819

14998 rows × 44 columns

After filtering to keep only the C57Bl6 animal (“B”), just 22 features are left (scroll right to see that, as expected, no features related to “W” remain):

[22]:

coords.filter_id("B")['20191204_Day2_SI_JB08_Test_56']

[22]:

bodyparts	B_Center		B_Left_bhip		B_Left_ear		B_Left_fhip		B_Nose		...	B_Right_ear		B_Right_fhip		B_Spine_1		B_Spine_2		B_Tail_base
coords	x	y	x	y	x	y	x	y	x	y	...	x	y	x	y	x	y	x	y	x	y
00:00:00	431.582129	199.886157	443.317284	191.049059	422.038894	239.280726	444.320813	211.530190	398.565403	241.218546	...	413.290642	223.459014	420.766537	207.134779	429.961921	216.150904	430.374313	184.700537	425.421944	169.926050
00:00:00.040005334	431.582129	199.886157	443.317284	191.049059	422.038894	239.280726	444.320813	211.530190	398.565403	241.218546	...	413.290642	223.459014	420.766537	207.134779	429.961921	216.150904	430.374313	184.700537	425.421944	169.926050
00:00:00.080010668	431.582129	199.886157	443.317284	191.049059	422.038894	239.280726	444.320813	211.530190	398.565403	241.218546	...	413.290642	223.459014	420.766537	207.134779	429.961921	216.150904	430.374313	184.700537	425.421944	169.926050
00:00:00.120016002	431.571022	200.002112	443.335878	190.993253	422.030086	239.327994	444.329816	211.562289	398.595961	241.294169	...	413.300820	223.461412	420.767516	207.192591	429.975867	216.158488	430.366680	184.675558	425.426543	169.916887
00:00:00.160021336	431.597640	199.745468	443.281956	190.945019	421.874045	239.267099	444.379187	211.410798	398.593955	241.218815	...	413.300526	223.434426	420.761057	207.029504	429.993971	216.054461	430.355523	184.650934	425.416170	169.926563
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
00:09:59.799973329	334.474567	96.498638	348.224364	97.645028	326.814349	119.492518	337.246908	106.565620	323.700053	135.141625	...	312.159399	118.444302	322.335877	95.900822	325.605505	103.833966	345.151219	91.364484	359.482856	88.716496
00:09:59.839978663	335.718348	97.174041	350.550550	100.151544	327.655304	119.156520	339.572653	108.236325	323.334586	134.646345	...	311.967124	117.222504	322.339963	95.787418	326.444674	103.855875	346.788969	92.072933	359.465680	89.018248
00:09:59.879983997	334.901394	96.142181	350.010325	99.189031	326.892150	117.030927	338.705496	107.086259	324.202601	130.295730	...	310.410430	116.226902	320.167959	94.313141	325.642692	102.384981	346.166320	91.468633	359.230222	88.757696
00:09:59.919989331	334.901394	96.142181	350.010325	99.189031	326.892150	117.030927	338.705496	107.086259	324.202601	130.295730	...	310.410430	116.226902	320.167959	94.313141	325.642692	102.384981	346.166320	91.468633	359.230222	88.757696
00:09:59.959994665	334.901394	96.142181	350.010325	99.189031	326.892150	117.030927	338.705496	107.086259	324.202601	130.295730	...	310.410430	116.226902	320.167959	94.313141	325.642692	102.384981	346.166320	91.468633	359.230222	88.757696

14998 rows × 22 columns
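To make the effect of this kind of filtering concrete, here is a minimal pandas sketch on a toy table with the same two-level column layout (body part, coordinate). The helper keep_animal and the toy data are hypothetical and only for illustration; this is not how filter_id is implemented in DeepOF.

```python
import numpy as np
import pandas as pd

# Toy table with two-level column labels (body part, coordinate), mimicking the layout above.
bodyparts = ["B_Center", "B_Nose", "W_Center", "W_Nose"]
columns = pd.MultiIndex.from_product(
    [bodyparts, ["x", "y"]], names=["bodyparts", "coords"]
)
toy = pd.DataFrame(
    np.arange(3 * len(columns), dtype=float).reshape(3, -1), columns=columns
)


def keep_animal(table: pd.DataFrame, animal_id: str) -> pd.DataFrame:
    """Hypothetical helper: keep columns whose body part starts with '<animal_id>_'."""
    labels = table.columns.get_level_values("bodyparts")
    mask = labels.str.startswith(f"{animal_id}_")
    return table.loc[:, mask]


only_b = keep_animal(toy, "B")
print(toy.shape, "->", only_b.shape)
```

Half of the columns are dropped and the number of rows is unchanged, mirroring the change from 44 to 22 columns in the tables above.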

Now that we have a basic understanding of how to create and interact with a project, coordinates, and table dictionaries, let’s show some plots!

Basic visual exploration

Let’s first see some basic heatmaps per condition. All plotting functions within DeepOF are hosted in the deepof.visuals module. Among many other things, we can plot average heatmaps per experimental condition! Let’s see if we can visualize any interesting patterns in the available data:

[23]:

sns.set_context("notebook")

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 6))

deepof.visuals.plot_heatmaps(
    my_deepof_project,
    ["B_Nose"],
    center="arena",
    exp_condition="CSDS",
    condition_value="Nonstressed",
    ax=ax1,
    show=False,
    display_arena=True,
    experiment_id="average",
)

deepof.visuals.plot_heatmaps(
    my_deepof_project,
    ["B_Nose"],
    center="arena",
    exp_condition="CSDS",
    condition_value="Stressed",
    ax=ax2,
    show=False,
    display_arena=True,
    experiment_id="average",
)

plt.tight_layout()
plt.show()

../_images/tutorial_notebooks_deepof_preprocessing_tutorial_70_0.png

It seems stressed animals spend more time closer to the walls of the arena, and less time in the center! For details on how deepof.visuals.plot_heatmaps() works, feel free to check the full API reference or the function docstring.
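If you want to put a number on an impression like this, one simple option is to bin positions into a 2D histogram and compare the fraction of samples falling in the center with the fraction near the border. The sketch below uses synthetic, uniformly distributed positions in a made-up 100 x 100 arena; it is not how plot_heatmaps computes its densities, and the arena size and the choice of the central region are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic positions in a 100 x 100 arena (made-up data, not from the tutorial project).
x = rng.uniform(0, 100, size=5000)
y = rng.uniform(0, 100, size=5000)

# Count samples per cell of a 10 x 10 grid, then normalize to fractions of total time.
counts, x_edges, y_edges = np.histogram2d(
    x, y, bins=10, range=[[0, 100], [0, 100]]
)
occupancy = counts / counts.sum()

# Fraction of samples in the central 60 x 60 region versus the surrounding border.
center_fraction = occupancy[2:8, 2:8].sum()
border_fraction = 1.0 - center_fraction
print(f"center: {center_fraction:.2f}, border: {border_fraction:.2f}")
```

With real data, x and y would be the coordinates of a single body part, and the two fractions could be compared across experimental conditions.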

We can also take a more detailed look at our data. As in most DeepOF plotting functions, we can limit the time range included in the plot with the optional arguments bin_index and bin_size. They can be used in two ways: either specify a bin size in seconds, which splits the data into bins of that length, and select one of the resulting bins with bin_index; or directly specify the start time and the duration of the segment you want to plot as time strings. The cell below shows both syntaxes for plotting the third minute of our data, 00:02:00 to 00:03:00 (that is, the third 60-second bin, which is bin number 2 because bin numbering starts at 0):

[24]:

sns.set_context("notebook")

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 6))

deepof.visuals.plot_heatmaps(
    my_deepof_project,
    ["B_Nose"],
    center="arena",
    exp_condition="CSDS",
    condition_value="Nonstressed",
    ax=ax1,
    show=False,
    display_arena=True,
    experiment_id="average",
    bin_index=2,  # plots only the third 60-second bin of the data (bin numbering starts at 0)
    bin_size=60
)

deepof.visuals.plot_heatmaps(
    my_deepof_project,
    ["B_Nose"],
    center="arena",
    exp_condition="CSDS",
    condition_value="Nonstressed",
    ax=ax2,
    show=False,
    display_arena=True,
    experiment_id="average",
    bin_index="00:02:00",  # same segment as above, given as start time and duration
    bin_size="00:01:00",
)

plt.tight_layout()
plt.show()

../_images/tutorial_notebooks_deepof_preprocessing_tutorial_73_0.png
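To see why the two calling conventions above select the same segment, it can help to translate both into frame ranges. The helper below, bin_to_frames, is hypothetical and not part of DeepOF; the frame rate of 25 frames per second is an assumption based on the roughly 0.04 s spacing of the timestamps in the tables above.

```python
import pandas as pd


def bin_to_frames(bin_index, bin_size, frame_rate):
    """Hypothetical helper: map (bin_index, bin_size) to a (start, stop) frame range.

    Integers are read as "bin number" and "bin length in seconds";
    strings are read as "start time" and "duration" in HH:MM:SS format.
    """
    if isinstance(bin_index, str):
        start_s = pd.Timedelta(bin_index).total_seconds()
        length_s = pd.Timedelta(bin_size).total_seconds()
    else:
        length_s = float(bin_size)
        start_s = bin_index * length_s
    start = int(round(start_s * frame_rate))
    stop = int(round((start_s + length_s) * frame_rate))
    return start, stop


# Assumed frame rate of 25 fps for illustration.
print(bin_to_frames(2, 60, 25))
print(bin_to_frames("00:02:00", "00:01:00", 25))
```

Both calls describe the same one-minute window starting two minutes into the recording.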

If you want very specific samples to be plotted, you can also use the precomputed_bins option and pass a boolean array. Here we produce the same plot as above, but by directly stating which samples should be included. Any remaining samples not covered by the list are filled with “False”.

[25]:

plt.figure(figsize=(6, 6))


deepof.visuals.plot_heatmaps(
    my_deepof_project,
    ["B_Nose"],
    center="arena",
    exp_condition="CSDS",
    condition_value="Nonstressed",
    show=False,
    display_arena=True,
    experiment_id="average",
    precomputed_bins=[False]*2998+[True]*1498,
)

plt.show()

<Figure size 600x600 with 0 Axes>

../_images/tutorial_notebooks_deepof_preprocessing_tutorial_75_1.png
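Writing such a list by hand gets error-prone for anything more complex than one contiguous segment. As a sketch, a boolean mask can be built with NumPy from frame indices instead. The helper make_mask is hypothetical, and the frame indices assume roughly 25 frames per second; whether precomputed_bins accepts a NumPy array directly is not shown in this tutorial, so the mask is converted to a plain list at the end to match the cell above.

```python
import numpy as np


def make_mask(n_frames, segments):
    """Hypothetical helper: boolean mask that is True inside the given frame segments.

    segments is a list of (start, stop) frame index pairs, stop exclusive.
    """
    mask = np.zeros(n_frames, dtype=bool)
    for start, stop in segments:
        mask[start:stop] = True
    return mask


# One segment covering roughly the third minute, assuming about 25 frames per second.
mask = make_mask(14998, [(3000, 4500)])
print(mask.sum(), "samples selected out of", mask.size)

# Plain Python list of booleans, in the same form as the list used in the cell above.
mask_as_list = mask.tolist()
```

Several disjoint segments can be selected by adding more (start, stop) pairs to the list.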

Furthermore, you can use regions of interest (ROIs) for spatial exploration, if you initialized them during project definition. Covering this here would be a bit much, so please have a look at the ROI tutorial for more details. For now, here is just an example of what ROIs in your data could look like:

[26]:

# Load a previously saved project
my_deepof_project_with_rois = deepof.data.load_project("./tutorial_files/sample_project/")
# And load the experiment conditions
#my_deepof_project_with_rois.load_exp_conditions("./tutorial_files/tutorial_exp_conditions.csv")

# Now we only plot the data associated with each ROI

sns.set_context("notebook")

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 6))

# Plot for ROI 1
deepof.visuals.plot_heatmaps(
    my_deepof_project_with_rois,
    ["B_Center"],
    center="arena",
    exp_condition="CSDS",
    condition_value="Nonstressed",
    ax=ax1,
    show=False,
    display_arena=True,
    experiment_id="average",
    bin_index=2,
    bin_size=60,
    roi_number=1,
    animals_in_roi="B",
)

# Plot for ROI 2
deepof.visuals.plot_heatmaps(
    my_deepof_project_with_rois,
    ["B_Center"],
    center="arena",
    exp_condition="CSDS",
    condition_value="Nonstressed",
    ax=ax2,
    show=False,
    display_arena=True,
    experiment_id="average",
    bin_index=2,
    bin_size=60,
    roi_number=2,
    animals_in_roi="B",
)

plt.tight_layout()
plt.show()

../_images/tutorial_notebooks_deepof_preprocessing_tutorial_77_0.png

Finally, let’s create an animated video showing our newly preprocessed data. DeepOF can produce reconstructions of the tracks and show them as videos. All animals and the arena are displayed by default. This is particularly useful when interpreting clusters and visualizing embeddings in the unsupervised pipeline, as we’ll see in a later tutorial.

[28]:

from IPython import display

video = deepof.visuals.animate_skeleton(
    my_deepof_project,
    experiment_id="20191204_Day2_SI_JB08_Test_54",
    bin_index=0,
    bin_size=20,
    sampling_rate=15,
    dpi=60,
)

html = display.HTML(video)
display.display(html)
plt.close()

What’s next

That’s it for this tutorial. Next, we’ll see how to run a supervised annotation pipeline with pretrained models!