Formatting your data: feature extraction from motion tracking output


What we’ll cover:

  • Create and run a project.

  • Load a previously generated project.

  • Interact with your project: generate coordinates, distances, angles, and areas.

  • Exploratory visualizations: heatmaps and processed animations.

[1]:
# # If using Google colab, uncomment and run this cell and the one below to set up the environment
# # Note: because of how colab handles the installation of local packages, this cell will kill your runtime.
# # This is not an error! Just continue with the cells below.
# import os
# !git clone -q https://github.com/mlfpm/deepof.git
# !pip install -q -e deepof --progress-bar off
# os.chdir("deepof")
# !curl --output tutorial_files.zip https://datashare.mpcdf.mpg.de/s/Hu1XjZkY9zml0mm/download
# !unzip tutorial_files.zip
[2]:
# import os, warnings
# os.chdir("deepof")
# warnings.filterwarnings('ignore')

Let’s start by importing some packages. We’ll use Python’s os library to handle paths, pickle to load saved objects, pandas to load data frames (imported in the following cell), and the data entry API within DeepOF, located in deepof.data.

[3]:
import os
import pickle
import deepof.data

We’ll also need some plotting gear:

[4]:
import deepof.visuals
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

Creating and running a project

With that out of the way, the first thing to do when starting a DeepOF project is to load your videos and DeepLabCut tracking files into a deepof.data.Project object.

As depicted in the cell below, the three crucial parameters to input are the project, video, and table paths.

  • project_path specifies where the project folder containing all processing and output files will be created.

  • video_path points towards where your DLC or SLEAP labelled videos are stored.

  • similarly, table_path should point to the directory containing all tracking files (see section on supported input files below).

  • last but not least, you can give your project a name (optional; deepof_project by default).
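Before creating a project, it can help to check that every video has a matching tracking file. The following is a minimal sketch using only the standard library. The helper name find_unmatched_videos and the rule that a table's file name starts with the video's stem are assumptions made for illustration; this is not part of the DeepOF API.

```python
from pathlib import Path
import tempfile


def find_unmatched_videos(video_path, table_path, video_format=".mp4", table_format="h5"):
    """Return the names of videos without a table whose name starts with the video's stem.

    Hypothetical helper: the prefix-matching rule is an assumption, not DeepOF's logic.
    """
    tables = [p.name for p in Path(table_path).glob(f"*.{table_format}")]
    unmatched = []
    for video in sorted(Path(video_path).glob(f"*{video_format}")):
        if not any(name.startswith(video.stem) for name in tables):
            unmatched.append(video.name)
    return unmatched


# Demo on a throwaway directory tree with made-up file names
with tempfile.TemporaryDirectory() as root:
    videos = Path(root, "Videos")
    tables = Path(root, "Tables")
    videos.mkdir()
    tables.mkdir()
    (videos / "Test_54.mp4").touch()
    (videos / "Test_56.mp4").touch()
    (tables / "Test_54DLC_tracks.h5").touch()
    demo_unmatched = find_unmatched_videos(videos, tables)
    print(demo_unmatched)
```

An empty list means that every video found a candidate table under this naming assumption.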

The dataset used in this tutorial is a subset of the social interaction (SI) dataset used in DeepOF’s original paper. It contains six 10-minute-long videos of two animals (a C57Bl6 and a CD1) in a round arena, tracked with DeepLabCut. Three of the C57Bl6 mice have been exposed to chronic social defeat stress (CSDS).

[5]:
my_deepof_project_raw = deepof.data.Project(
    project_path=os.path.join("tutorial_files"),
    video_path=os.path.join("tutorial_files/Videos/"),
    table_path=os.path.join("tutorial_files/Tables/"),
    project_name="deepof_tutorial_project",
    arena="circular-autodetect",
    animal_ids=["B", "W"],
    table_format="h5",
    video_format=".mp4",
    exclude_bodyparts=["Tail_1", "Tail_2", "Tail_tip"],
    video_scale="380 mm",
    iterative_imputation="partial",  # "full",
    smooth_alpha=1,
    exp_conditions=None,
    # number_of_rois=3, # another optional input argument that allows you to define up to 20 different regions of interest during project creation
)
Info! Set arena dimension to 380.0 mm!
The sampling rate of some of your videos differ. The maximum difference is 20191204_Day2_SI_JB08_Test_54: 24.99 fps and 20191204_Day2_SI_JB08_Test_63: 24.994 fps! Proceed with cauthion

As you may see, there are some extra (optional) parameters in the call above. These specify some details on how the videos and tracks will be processed. Some include:

  • arena: some functions within DeepOF (as you will see in the next tutorial on supervised annotations) require the program to detect the arena. The following section explains how to handle this step in more detail.

  • animal_ids: in case more than one animal is present in the dataset, the IDs assigned to each during tracking need to be specified as a list. For SLEAP projects, this will be detected automatically, but can be overridden.

  • exclude_bodyparts: while DeepOF originally relied on 14-body-part tracking models, body parts along the tail are no longer used. This step can be bypassed by using smaller models that follow an 11-body-part scheme, such as the one presented on the landing page of the documentation. Custom labelling schemes are also supported (see tutorials).

  • video_scale: diameter of the arena (in mm) if circular. In the case of polygonal arenas, the length (in mm) of the first edge as specified in the GUI (see next section) should be provided.

  • smooth_alpha: smoothing intensity. The higher the value, the more smoothing is applied.

  • exp_conditions: dictionary with video IDs as keys, and data frames with all experimental conditions as values. DeepOF will use this information to enable all sorts of comparisons between conditions, as we will see in the following tutorials. We’ll leave it blank for now and update it afterwards.

For more details, feel free to visit the full API reference.

Let’s then create our first project, by running the .create() method in our newly created object:

[6]:
my_deepof_project = my_deepof_project_raw.create(force=True)
Setting up project directories...
Preprocessing tables          : 100%|██████████| 6/6 [00:02<00:00,  2.08it/s, file=20191204_D..., step=Saving data]
Detecting arenas              : 100%|██████████| 6/6 [01:29<00:00, 14.99s/arena]
Rescaling tables              : 100%|██████████| 6/6 [00:00<00:00, 556.74table/s]
Computing distances           : 100%|██████████| 6/6 [00:00<00:00,  8.14table/s]
Computing angles              : 100%|██████████| 6/6 [00:00<00:00, 49.80table/s]
Computing areas               : 100%|██████████| 6/6 [00:00<00:00,  6.91table/s]
Done!

NOTE: the force parameter allows DeepOF to overwrite an existing project in the same directory. Use with caution!

NOTE 2: the cell above can take a significant amount of time to run in Google colab. Feel free to skip and continue with the loaded results below.

As you will see, this organizes all required files into a folder in the specified path. Moreover, some processing steps are computed on the go, such as distances, angles and areas. This makes it easier for DeepOF to load all required features later on, without the need to compute them every time.

Arena detection

As mentioned above, one of the key aspects of project creation involves setting up the arena detection mechanism, which will be used in downstream tasks, such as climbing detection and overall video scaling.

Along these lines, the package provides tools for detecting elliptical arenas specifically (as seen in this tutorial), and general polygonal shapes (such as squares, rectangles, Y-mazes, etc.).

In principle, DeepOF can detect your arenas automatically by relying on SAM (Segment Anything Model), a state-of-the-art image segmentation deep learning model. By selecting arena='circular-autodetect' or arena='polygonal-autodetect', users can benefit from this approach. A folder named Arena_detection will be created in your project directory, containing samples of the detected arenas for all videos, so you can visually inspect whether DeepOF did a good job.

Moreover, manual annotation can be selected using arena='circular-manual', and arena='polygonal-manual'. In this case, you will see a window per video appear, allowing you to manually mark where the arena is with just a few clicks. All results will be stored for further processing.

Here, clicking anywhere in the video will make an orange marker appear. In the case of polygonal arenas, at least a marker per corner should be used. When dealing with circular (or elliptical) arenas, DeepOF will fit an ellipse to the marked points after a minimum of 5 clicks. The ellipse can always be refined by adding more markers. Once finished with a video, press q to save and move to the next one. If you made a mistake and would like to correct it, press d to delete the last added marker. Moreover, after you tagged at least one video, you’ll see the p option appear, which will copy the last marked arena to all remaining videos in the dataset.

NOTE: If you select arena='polygonal-autodetect', you will still be prompted with the GUI to select the arena just once, so SAM roughly knows what the segmentation it should return looks like. Detection in all remaining videos will be automatic. This is not the case for arena='circular-autodetect', as we already know we’re looking for a circle.

[Figure: arena_GUI]

One last thing is that, in polygonal arenas, the first edge you input will be used to scale the coordinates to the proper distances (pixels to millimeters). Make sure you always mark the same edge first, and that it coincides with the length passed to the “video_scale” parameter when creating the project. In saved images of automatically detected polygons, you’ll see orange circles marking two arena corners. These should match the first edge you selected when prompting SAM, ensuring data is properly scaled.
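The pixel-to-millimeter conversion implied by video_scale can be sketched as follows. This is an illustration of the arithmetic only, under the assumption that a single scale factor is derived from one reference length (the arena diameter for circular arenas, the first marked edge for polygonal ones). The function names and the pixel values are hypothetical, and DeepOF's internal implementation may differ.

```python
import math


def mm_per_pixel_circular(diameter_px, video_scale_mm):
    """Scale factor for a circular arena: real diameter over detected diameter."""
    return video_scale_mm / diameter_px


def mm_per_pixel_polygonal(corners_px, video_scale_mm):
    """Scale factor for a polygonal arena, using the FIRST marked edge only."""
    first_edge_px = math.dist(corners_px[0], corners_px[1])
    return video_scale_mm / first_edge_px


# A 380 mm arena detected as 760 px wide (made-up pixel value)
scale = mm_per_pixel_circular(760.0, 380.0)

# A square arena whose first edge measures 200 mm (made-up corners)
square = [(100, 100), (500, 100), (500, 500), (100, 500)]
scale_sq = mm_per_pixel_polygonal(square, 200.0)

print(scale, scale_sq)
```

If a different edge were marked first in one of the videos, the same video_scale would be applied to the wrong pixel length, which is why the marking order matters.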

If after all this work you make a mistake while labelling, or see that automatic detection failed in some cases, don’t worry! You can always edit manually annotated arenas (or rerun automatic annotation) for specific videos using the .edit_arenas() method. Just pass a list with the IDs of the videos you would like to relabel, and the GUI will pop up once again. The same methods described above can be passed to the arena_type parameter.

NOTE: If you don’t pass any video IDs, all videos will be selected by default.

[7]:
my_deepof_project.edit_arenas(
    video_keys=['20191204_Day2_SI_JB08_Test_54', '20191204_Day2_SI_JB08_Test_62'],
    arena_type="circular-autodetect",
)
Editing 2 arenas
Detecting arenas              : 100%|██████████| 2/2 [00:02<00:00,  1.16s/arena]
Done!

Supported input tracking files

DeepOF currently supports input from both DeepLabCut and SLEAP. While DeepOF will try to detect automatically which file type you’re trying to use, this can be forced with the table_format parameter in the deepof.data.Project() call depicted in the previous section.

For DeepLabCut, we support:

  • Single and multi-animal project CSV files (indicated with table_format='csv').

  • Single and multi-animal project h5 files (indicated with table_format='h5').

For SLEAP, you can use:

  • SLP files (indicated with table_format='slp').

  • Raw .npy files (indicated with table_format='npy').

  • h5 files (indicated with table_format='analysis.h5').

Downstream processing should be identical regardless of the files selected, as DeepOF internally brings all these inputs into an equivalent representation. Let’s continue!
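If you prefer to set table_format explicitly, a small lookup from file name to the values listed above could look like the sketch below. The helper guess_table_format is hypothetical and is not how DeepOF detects formats internally. It also assumes that SLEAP analysis exports keep an analysis.h5 suffix in their file names.

```python
def guess_table_format(filename):
    """Map a tracking file name to one of the table_format values listed in this section.

    Hypothetical helper for illustration; not DeepOF's own detection logic.
    """
    name = filename.lower()
    # Check the longer suffix first, since it also ends in ".h5"
    if name.endswith("analysis.h5"):
        return "analysis.h5"
    for ext in ("csv", "h5", "slp", "npy"):
        if name.endswith("." + ext):
            return ext
    raise ValueError(f"Unsupported tracking file: {filename!r}")


print(guess_table_format("session1.analysis.h5"))
print(guess_table_format("session1DLC_tracks.h5"))
```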

Loading a previously generated project

Once you have run your project at least once, it can be loaded without much effort from the path you specified in the first place, plus the project name (deepof_project, unless you specified otherwise).

[8]:
# Load a previously saved project
my_deepof_project = deepof.data.load_project("./tutorial_files/deepof_tutorial_project/")
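The path passed to load_project above is simply project_path joined with project_name. A minimal sketch of that composition, with an existence check added for convenience, is shown here. Both helper names are hypothetical and not part of DeepOF.

```python
import os


def project_directory(project_path, project_name="deepof_project"):
    """Join the parent folder and the project name into the folder to load."""
    return os.path.join(project_path, project_name)


def check_project_directory(project_path, project_name="deepof_project"):
    """Return the project folder, or raise if it has not been created yet."""
    directory = project_directory(project_path, project_name)
    if not os.path.isdir(directory):
        raise FileNotFoundError(
            f"No project folder found at {directory!r}; create the project first."
        )
    return directory


print(project_directory("tutorial_files", "deepof_tutorial_project"))
```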

Extend a previously generated project

If you’d like to add data (videos and tracks) to a previously processed project, you can use the extend method instead of create. Just pass the path to your previously processed project, and DeepOF will take care of merging the two. A new directory will be created with all files corresponding to the merged projects.

NOTE: Your old files will not be deleted.

[9]:
# Extend a previously saved project
my_deepof_project = my_deepof_project_raw.extend(
    "./tutorial_files/deepof_tutorial_project/",
)
Loading previous project...
Processing data from 0 experiments...

Interacting with your project: generating coordinates, distances, angles, and areas.

That’s it for basic project creation. We now have a DeepOF project up and running! Before ending the tutorial, however, let’s explore the object that the commands above produced.

For starters, if we print it, we see it’s a DeepOF analysis of 6 videos. Furthermore, the object belongs to a custom class within DeepOF, called Coordinates. This class allows the package to store all sorts of relevant information required for further processing, as we’ll see below:

[10]:
print(my_deepof_project)
print(type(my_deepof_project))
deepof analysis of 6 videos
<class 'deepof.data.Coordinates'>

As described before, the .create() method runs most of the heavy preprocessing already, which allows us to extract features including coordinates, distances, angles, and areas. Let’s see how that works!

[Figure: preprocessing]

With the .get_coords() method, for example, we can obtain the processed (smoothed and imputed) tracks for all videos in a dictionary. The returned objects are called table dictionaries (TableDict is the name of the class). They follow a dictionary-like structure, where each value is a data frame. They also provide a plethora of extra methods, some of which we’ll cover in these tutorials. Let’s retrieve these for one of the videos:

[11]:
my_deepof_project.get_coords(polar=False, center="Center", align="Spine_1")['20191204_Day2_SI_JB08_Test_54']
[11]:
B_Spine_1 B_Center B_Left_bhip B_Left_ear B_Left_fhip ... W_Right_bhip W_Right_ear W_Right_fhip W_Spine_2 W_Tail_base
x y x y x y x y x y ... x y x y x y x y x y
00:00:00 0.0 13.325157 0.0 0.0 9.299380 -10.043816 0.078640 28.362360 11.113320 11.260838 ... -12.599950 -11.240405 -10.078637 43.152743 -11.573392 10.377854 0.642794 -17.147886 1.430863 -33.725658
00:00:00.040002666 0.0 13.325157 0.0 0.0 9.299380 -10.043816 0.078640 28.362360 11.113320 11.260838 ... -12.599950 -11.240405 -10.078637 43.152743 -11.573392 10.377854 0.642794 -17.147886 1.430863 -33.725658
00:00:00.080005333 0.0 13.325157 0.0 0.0 9.299380 -10.043816 0.078640 28.362360 11.113320 11.260838 ... -12.599950 -11.240405 -10.078637 43.152743 -11.573392 10.377854 0.642794 -17.147886 1.430863 -33.725658
00:00:00.120008 0.0 12.329721 0.0 0.0 8.271627 -11.473263 -0.290829 28.613626 12.340172 9.265053 ... -11.723291 -13.076021 -9.691817 42.347617 -11.806335 10.703766 1.061070 -17.959261 2.281181 -34.389720
00:00:00.160010667 0.0 12.184032 0.0 0.0 8.779927 -12.331783 0.698089 27.811298 12.141177 8.484510 ... -11.366711 -14.721331 -16.173151 40.094898 -12.072984 9.174599 1.584101 -18.188027 1.854913 -35.989727
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
00:09:59.799986665 0.0 14.084511 0.0 0.0 12.934637 -9.662336 8.634394 28.686438 11.999912 9.135109 ... -16.865593 -5.559382 -14.019423 25.202356 -9.893384 8.524650 -5.093286 -12.389494 -14.948009 -21.649407
00:09:59.839989332 0.0 12.209807 0.0 0.0 12.787276 -10.919776 5.867447 31.792569 12.005965 7.366316 ... -16.958799 -5.706550 -14.031198 25.126744 -9.933026 8.448160 -5.039922 -12.542700 -14.891456 -21.791515
00:09:59.879991999 0.0 15.747280 0.0 0.0 10.409651 -9.370631 6.176041 34.226407 12.033207 9.874782 ... -16.943784 -5.714167 -14.067525 25.075804 -9.916492 8.418803 -5.006992 -12.561188 -14.852969 -21.841110
00:09:59.919994666 0.0 15.747280 0.0 0.0 10.409651 -9.370631 6.176041 34.226407 12.033207 9.874782 ... -16.943784 -5.714167 -14.067525 25.075804 -9.916492 8.418803 -5.006992 -12.561188 -14.852969 -21.841110
00:09:59.959997333 0.0 15.747280 0.0 0.0 10.409651 -9.370631 6.176041 34.226407 12.033207 9.874782 ... -16.943784 -5.714167 -14.067525 25.075804 -9.916492 8.418803 -5.006992 -12.561188 -14.852969 -21.841110

14999 rows × 44 columns

Note that there are a few parameters you can pass to the .get_coords() method. If “polar” is set to True, polar coordinates will be used instead of Cartesian ones. Both “center” and “align” control how translational and rotational variance are removed from the data: the former will use the specified body part as [0, 0] (or the center of the arena, if set to “arena”). The latter will rotate the mice at each time point, to align the line connecting the body parts specified in the “center” and “align” parameters with the y-axis.
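To make the geometry concrete, here is a minimal single-frame sketch of centering and alignment in plain Python. It assumes a frame is a dictionary mapping body part names to (x, y) tuples and that the aligned body part ends up on the positive y-axis, which is consistent with the output above (B_Center at 0, 0 and B_Spine_1 at x = 0 with positive y). The function names and toy coordinates are illustrative, and DeepOF's own implementation may differ in details such as axis orientation.

```python
import math


def center_and_align(frame, center="Center", align="Spine_1"):
    """Translate so `center` sits at (0, 0), then rotate so `align` lies on the +y axis."""
    cx, cy = frame[center]
    ax, ay = frame[align][0] - cx, frame[align][1] - cy
    # Rotation that takes the center->align direction onto the positive y-axis
    rotation = math.pi / 2 - math.atan2(ay, ax)
    cos_r, sin_r = math.cos(rotation), math.sin(rotation)
    aligned = {}
    for part, (x, y) in frame.items():
        tx, ty = x - cx, y - cy
        aligned[part] = (tx * cos_r - ty * sin_r, tx * sin_r + ty * cos_r)
    return aligned


def to_polar(x, y):
    """Convert Cartesian coordinates to (radius, angle in radians)."""
    return math.hypot(x, y), math.atan2(y, x)


# Toy frame with made-up coordinates
frame = {"Center": (10.0, 5.0), "Spine_1": (13.0, 9.0), "Nose": (16.0, 13.0)}
aligned = center_and_align(frame)
print(aligned["Spine_1"])
print(to_polar(*aligned["Spine_1"]))
```

After this transformation, the x coordinate of the aligned body part is zero up to floating-point error, and its y coordinate equals its distance to the centering body part.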

Furthermore, not only processed coordinates can be retrieved, but also distances, angles, and areas, with the .get_distances(), .get_angles(), and .get_areas() methods, respectively.

[12]:
my_deepof_project.get_distances()['20191204_Day2_SI_JB08_Test_54']
[12]:
(W_Right_ear, W_Spine_1) (B_Center, B_Spine_2) (W_Center, W_Right_fhip) (W_Left_bhip, W_Spine_2) (B_Right_bhip, B_Spine_2) (W_Right_bhip, W_Spine_2) (W_Left_ear, W_Nose) (B_Center, B_Left_fhip) (B_Left_ear, B_Spine_1) (B_Nose, B_Right_ear) ... (B_Nose, W_Nose) (W_Center, W_Spine_2) (W_Center, W_Spine_1) (B_Left_bhip, B_Spine_2) (B_Nose, W_Tail_base) (B_Left_ear, B_Nose) (B_Center, B_Spine_1) (B_Right_ear, B_Spine_1) (W_Nose, W_Right_ear) (W_Center, W_Left_fhip)
00:00:00 25.324534 13.836819 15.544879 14.450448 11.412604 14.500642 22.222081 15.821263 15.037409 15.057739 ... 202.543916 17.159929 19.920155 14.600725 122.652715 15.170325 13.325157 14.429184 24.791305 20.121679
00:00:00.040002666 25.324534 13.836819 15.544879 14.450448 11.412604 14.500642 22.222081 15.821263 15.037409 15.057739 ... 202.543916 17.159929 19.920155 14.600725 122.652715 15.170325 13.325157 14.429184 24.791305 20.121679
00:00:00.080005333 25.324534 13.836819 15.544879 14.450448 11.412604 14.500642 22.222081 15.821263 15.037409 15.057739 ... 202.543916 17.159929 19.920155 14.600725 122.652715 15.170325 13.325157 14.429184 24.791305 20.121679
00:00:00.120008 24.165120 14.114840 15.936128 15.091938 11.977952 13.685244 20.982982 15.431172 16.286502 15.112614 ... 201.439640 17.990579 20.211183 14.741844 122.474901 16.038285 12.329721 15.581831 24.705419 19.739289
00:00:00.160010667 25.009417 13.728727 15.163450 15.075836 12.857731 13.406770 21.167894 14.811991 15.642850 16.235095 ... 200.696461 18.256880 21.018731 14.977542 121.775639 18.028338 12.184032 15.989769 22.790682 20.680558
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
00:09:59.799986665 17.263132 16.876614 13.059429 15.891442 13.647532 13.610203 23.477525 15.081383 16.963757 22.674734 ... 148.312575 13.395564 15.129050 13.789807 191.389073 20.974630 14.084511 16.974595 19.277812 15.207663
00:09:59.839989332 17.257318 19.611009 13.039801 15.901995 13.324806 13.740181 23.502572 14.085660 20.442884 22.789394 ... 144.670507 13.517401 15.079828 13.468828 186.729658 16.403314 12.209807 17.378500 19.322534 15.197649
00:09:59.879991999 17.240931 17.657619 13.008192 15.937232 13.673161 13.761130 23.463719 15.566290 19.483881 21.730108 ... 146.851049 13.522330 15.108134 13.742712 186.008433 18.592350 15.747280 17.947809 19.336323 15.241505
00:09:59.919994666 17.240931 17.657619 13.008192 15.937232 13.673161 13.761130 23.463719 15.566290 19.483881 21.730108 ... 146.851049 13.522330 15.108134 13.742712 186.008433 18.592350 15.747280 17.947809 19.336323 15.241505
00:09:59.959997333 17.240931 17.657619 13.008192 15.937232 13.673161 13.761130 23.463719 15.566290 19.483881 21.730108 ... 146.851049 13.522330 15.108134 13.742712 186.008433 18.592350 15.747280 17.947809 19.336323 15.241505

14999 rows × 26 columns

[13]:
my_deepof_project.get_angles()['20191204_Day2_SI_JB08_Test_54']
[13]:
(B_Right_fhip, B_Center, B_Spine_2) (B_Right_fhip, B_Center, B_Spine_1) (B_Right_fhip, B_Center, B_Left_fhip) (B_Spine_2, B_Center, B_Spine_1) (B_Spine_2, B_Center, B_Left_fhip) (B_Spine_1, B_Center, B_Left_fhip) (B_Spine_1, B_Left_ear, B_Nose) (B_Left_bhip, B_Spine_2, B_Center) (B_Left_bhip, B_Spine_2, B_Right_bhip) (B_Left_bhip, B_Spine_2, B_Tail_base) ... (W_Left_bhip, W_Spine_2, W_Right_bhip) (W_Left_bhip, W_Spine_2, W_Tail_base) (W_Center, W_Spine_2, W_Right_bhip) (W_Center, W_Spine_2, W_Tail_base) (W_Right_bhip, W_Spine_2, W_Tail_base) (W_Center, W_Spine_1, W_Left_ear) (W_Center, W_Spine_1, W_Right_ear) (W_Left_ear, W_Spine_1, W_Right_ear) (W_Spine_1, W_Right_ear, W_Nose) (W_Left_ear, W_Nose, W_Right_ear)
00:00:00 1.832922 0.937362 1.716167 2.770285 2.734096 0.778805 2.020075 1.003026 2.381443 2.334903 ... 2.058565 2.186722 1.113729 3.131558 2.037898 2.710147 2.732280 0.840758 2.234005 0.953038
00:00:00.040002666 1.832922 0.937362 1.716167 2.770285 2.734096 0.778805 2.020075 1.003026 2.381443 2.334903 ... 2.058565 2.186722 1.113729 3.131558 2.037898 2.710147 2.732280 0.840758 2.234005 0.953038
00:00:00.080005333 1.832922 0.937362 1.716167 2.770285 2.734096 0.778805 2.020075 1.003026 2.381443 2.334903 ... 2.058565 2.186722 1.113729 3.131558 2.037898 2.710147 2.732280 0.840758 2.234005 0.953038
00:00:00.120008 1.679362 0.989265 1.916045 2.668627 2.687778 0.926781 1.965510 1.023683 2.488609 2.435508 ... 2.071008 2.202391 1.146916 3.126483 2.009787 2.728616 2.728912 0.825657 2.292104 0.982455
00:00:00.160010667 1.718482 0.954757 1.915619 2.673239 2.649084 0.960862 2.034714 1.107882 2.542842 2.416358 ... 2.072768 2.362859 1.222370 3.069928 1.847558 2.963804 2.438366 0.881016 2.173166 1.044266
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
00:09:59.799986665 2.310751 0.899521 1.819645 3.072914 2.152789 0.920125 2.319123 1.092302 2.112860 1.856711 ... 2.630330 2.372843 1.435111 2.715123 1.280012 2.958156 2.193847 1.131182 2.379175 0.923178
00:09:59.839989332 2.217032 1.046089 2.066558 3.020065 1.999595 1.020470 2.551768 1.004903 1.990132 1.987859 ... 2.624470 2.384113 1.432123 2.706725 1.274602 2.961729 2.192207 1.129249 2.382483 0.921390
00:09:59.879991999 2.322632 0.787236 1.670839 3.109868 2.289714 0.883604 2.375213 0.892613 1.857147 2.173421 ... 2.619428 2.387161 1.429316 2.705913 1.276597 2.965252 2.187243 1.130690 2.379106 0.923823
00:09:59.919994666 2.322632 0.787236 1.670839 3.109868 2.289714 0.883604 2.375213 0.892613 1.857147 2.173421 ... 2.619428 2.387161 1.429316 2.705913 1.276597 2.965252 2.187243 1.130690 2.379106 0.923823
00:09:59.959997333 2.322632 0.787236 1.670839 3.109868 2.289714 0.883604 2.375213 0.892613 1.857147 2.173421 ... 2.619428 2.387161 1.429316 2.705913 1.276597 2.965252 2.187243 1.130690 2.379106 0.923823

14999 rows × 36 columns

[14]:
my_deepof_project.get_areas()['20191204_Day2_SI_JB08_Test_54']
[14]:
B_head_area B_torso_area B_back_area B_full_area W_head_area W_torso_area W_back_area W_full_area
00:00:00 186.379602 286.877920 322.346263 1031.762726 403.882917 432.800448 457.418159 1759.904145
00:00:00.040002666 186.379602 286.877920 322.346263 1031.762726 403.882917 432.800448 457.418159 1759.904145
00:00:00.080005333 186.379602 286.877920 322.346263 1031.762726 403.882917 432.800448 457.418159 1759.904145
00:00:00.120008 220.588695 301.609605 325.358041 1099.050813 436.529889 458.405810 465.829889 1788.957183
00:00:00.160010667 222.261708 299.870852 335.142348 1120.455237 386.996017 468.328898 464.115153 1794.118011
... ... ... ... ... ... ... ... ...
00:09:59.799986665 239.351739 336.178202 369.570132 1236.107480 338.985789 306.291533 398.589194 1296.123235
00:09:59.839989332 225.022634 337.582968 356.664936 1202.726101 339.620045 307.614991 401.612088 1299.472641
00:09:59.879991999 254.932258 347.361788 366.539204 1240.352799 339.318555 307.496487 402.713951 1300.378274
00:09:59.919994666 254.932258 347.361788 366.539204 1240.352799 339.318555 307.496487 402.713951 1300.378274
00:09:59.959997333 254.932258 347.361788 366.539204 1240.352799 339.318555 307.496487 402.713951 1300.378274

14999 rows × 8 columns
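For intuition, the three feature types can be sketched for a single frame with elementary geometry: Euclidean distance between two body parts, the angle at the middle body part of a triplet (in radians, consistent with the values above, which all lie between 0 and π), and polygon area via the shoelace formula. These are assumptions about what the features represent, not DeepOF's code. Which body parts delimit each area is not specified in this section, so the polygon below is a toy example.

```python
import math


def distance(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.dist(a, b)


def angle(a, vertex, c):
    """Angle in radians at `vertex`, formed by the segments vertex->a and vertex->c."""
    v1 = (a[0] - vertex[0], a[1] - vertex[1])
    v2 = (c[0] - vertex[0], c[1] - vertex[1])
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to guard against floating-point values slightly outside [-1, 1]
    return math.acos(max(-1.0, min(1.0, cos_t)))


def polygon_area(points):
    """Area of a simple polygon (list of (x, y) vertices) via the shoelace formula."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


# First-row aligned coordinates of B_Center and B_Spine_1 from the .get_coords() output
d = distance((0.0, 0.0), (0.0, 13.325157))
right_angle = angle((1.0, 0.0), (0.0, 0.0), (0.0, 1.0))
unit_square = polygon_area([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)])
print(d, right_angle, unit_square)
```

The computed distance agrees with the first value of the (B_Center, B_Spine_1) column in the .get_distances() output.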

Last but not least, features can be merged using the .merge() method, which can yield combinations of features if needed. For example, the code in the following cell creates an object with both coordinates and areas per time point:

[15]:
my_deepof_project.get_coords().merge(my_deepof_project.get_areas())['20191204_Day2_SI_JB08_Test_54']
[15]:
(B_Center, x) (B_Center, y) (B_Left_bhip, x) (B_Left_bhip, y) (B_Left_ear, x) (B_Left_ear, y) (B_Left_fhip, x) (B_Left_fhip, y) (B_Nose, x) (B_Nose, y) ... (W_Tail_base, x) (W_Tail_base, y) B_head_area B_torso_area B_back_area B_full_area W_head_area W_torso_area W_back_area W_full_area
00:00:00 178.695067 150.903525 178.133481 164.579831 160.236194 129.369839 162.932410 149.543005 166.288734 115.459213 ... 284.276208 81.953327 186.379602 286.877920 322.346263 1031.762726 403.882917 432.800448 457.418159 1759.904145
00:00:00.040002666 178.695067 150.903525 178.133481 164.579831 160.236194 129.369839 162.932410 149.543005 166.288734 115.459213 ... 284.276208 81.953327 186.379602 286.877920 322.346263 1031.762726 403.882917 432.800448 457.418159 1759.904145
00:00:00.080005333 178.695067 150.903525 178.133481 164.579831 160.236194 129.369839 162.932410 149.543005 166.288734 115.459213 ... 284.276208 81.953327 186.379602 286.877920 322.346263 1031.762726 403.882917 432.800448 457.418159 1759.904145
00:00:00.120008 177.009679 149.650527 175.106624 163.666017 163.807835 124.262824 161.761899 147.278549 174.192032 112.040097 ... 293.070473 82.577959 220.588695 301.609605 325.358041 1099.050813 436.529889 458.405810 465.829889 1788.957183
00:00:00.160010667 177.767518 147.833830 173.779126 162.437004 167.520924 121.969512 163.445137 144.056996 179.274246 108.299092 ... 299.036433 86.246318 222.261708 299.870852 335.142348 1120.455237 386.996017 468.328898 464.115153 1794.118011
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
00:09:59.799986665 87.802400 245.478268 103.222546 240.694477 86.352888 275.400892 96.059204 258.098620 74.008872 292.358497 ... 217.058911 419.505907 239.351739 336.178202 369.570132 1236.107480 338.985789 306.291533 398.589194 1296.123235
00:09:59.839989332 87.996004 251.604046 103.329278 244.701389 84.727063 283.767821 97.459237 262.037312 75.703366 297.466054 ... 217.025629 419.515154 225.022634 337.582968 356.664936 1202.726101 339.620045 307.614991 401.612088 1299.472641
00:09:59.879991999 88.189609 257.729824 101.048434 252.178232 83.101238 292.134750 96.435058 270.932928 70.169244 305.492857 ... 217.112048 419.540870 254.932258 347.361788 366.539204 1240.352799 339.318555 307.496487 402.713951 1300.378274
00:09:59.919994666 88.189609 257.729824 101.048434 252.178232 83.101238 292.134750 96.435058 270.932928 70.169244 305.492857 ... 217.112048 419.540870 254.932258 347.361788 366.539204 1240.352799 339.318555 307.496487 402.713951 1300.378274
00:09:59.959997333 88.189609 257.729824 101.048434 252.178232 83.101238 292.134750 96.435058 270.932928 70.169244 305.492857 ... 217.112048 419.540870 254.932258 347.361788 366.539204 1240.352799 339.318555 307.496487 402.713951 1300.378274

14999 rows × 52 columns
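Conceptually, merging places the columns of two feature tables side by side for each video, which is why the 44 coordinate columns and 8 area columns add up to 52 here. The sketch below mimics this with plain dictionaries standing in for table dictionaries (video ID mapped to a dictionary of column name to values). The function merge_tables and the truncated toy values are illustrative only and do not reproduce the actual .merge() implementation.

```python
def merge_tables(left, right):
    """Combine per-video tables column-wise for the videos present in both inputs."""
    merged = {}
    for video in sorted(left.keys() & right.keys()):
        overlap = left[video].keys() & right[video].keys()
        if overlap:
            raise ValueError(f"Duplicated columns in {video}: {sorted(overlap)}")
        merged[video] = {**left[video], **right[video]}
    return merged


# Toy stand-ins with two time points each (values shortened for illustration)
coords = {"video_1": {("B_Center", "x"): [178.7, 177.0], ("B_Center", "y"): [150.9, 149.7]}}
areas = {"video_1": {"B_head_area": [186.4, 220.6]}}

merged = merge_tables(coords, areas)
print(list(merged["video_1"]))
```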

Loading experimental conditions

So far, DeepOF does not know to which condition each animal belongs. This can be either set up when creating the project (as described above) or specified afterward using the .load_exp_conditions() method. We just need to pass the path to a CSV file containing all conditions per animal as extra columns. The only hard requirement is that the first column should have the experiment IDs.

Here is an example:

[16]:
pd.read_csv("./tutorial_files/tutorial_exp_conditions.csv", index_col=0)
[16]:
experiment_id CSDS
0 20191204_Day2_SI_JB08_Test_54 Nonstressed
1 20191204_Day2_SI_JB08_Test_56 Stressed
2 20191204_Day2_SI_JB08_Test_61 Stressed
3 20191204_Day2_SI_JB08_Test_62 Stressed
4 20191204_Day2_SI_JB08_Test_63 Nonstressed
5 20191204_Day2_SI_JB08_Test_64 Nonstressed

Great! Now that we understand how the CSV should be formatted, let’s then load it onto our project:

[17]:
my_deepof_project.load_exp_conditions("./tutorial_files/tutorial_exp_conditions.csv")

And we’re done! Let’s explore what’s in there with the .get_exp_conditions property:

[18]:
print(my_deepof_project.get_exp_conditions)
{'20191204_Day2_SI_JB08_Test_54':           CSDS
0  Nonstressed, '20191204_Day2_SI_JB08_Test_56':        CSDS
1  Stressed, '20191204_Day2_SI_JB08_Test_61':        CSDS
2  Stressed, '20191204_Day2_SI_JB08_Test_62':        CSDS
3  Stressed, '20191204_Day2_SI_JB08_Test_63':           CSDS
4  Nonstressed, '20191204_Day2_SI_JB08_Test_64':           CSDS
5  Nonstressed}

We can see that the property retrieves a dictionary with all animal experiments as keys, and data frames with conditions as values. Although in this case we only have the CSDS condition, which can take two values (“Nonstressed” and “Stressed”, with three animals each), adding more just requires us to add extra columns to the CSV file.
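The mapping from experiment IDs to conditions can be illustrated with the standard library alone. The sketch below parses a simplified two-column version of the conditions table shown above (the exact layout of the tutorial's CSV file is an assumption) into a dictionary keyed by experiment ID. Plain dictionaries stand in for the data frames that DeepOF stores, and read_conditions is a hypothetical helper.

```python
import csv
import io

# Simplified copy of the conditions shown in this tutorial (layout is an assumption)
CSV_TEXT = """experiment_id,CSDS
20191204_Day2_SI_JB08_Test_54,Nonstressed
20191204_Day2_SI_JB08_Test_56,Stressed
20191204_Day2_SI_JB08_Test_61,Stressed
20191204_Day2_SI_JB08_Test_62,Stressed
20191204_Day2_SI_JB08_Test_63,Nonstressed
20191204_Day2_SI_JB08_Test_64,Nonstressed
"""


def read_conditions(handle, id_column="experiment_id"):
    """Return {experiment_id: {condition_name: value}} from a CSV file handle."""
    conditions = {}
    for row in csv.DictReader(handle):
        video_id = row.pop(id_column)
        conditions[video_id] = dict(row)
    return conditions


conditions = read_conditions(io.StringIO(CSV_TEXT))
print(conditions["20191204_Day2_SI_JB08_Test_56"])
```

Adding a condition then amounts to adding a column to the CSV text; each inner dictionary gains one key.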

Filtering DeepOF objects

Now that experimental conditions have been added, we’ll explore some filtering tools that DeepOF provides for table dictionary objects.

For starters, imagine you want to subset your data to only contain stressed animals. You can do that with the .filter_condition() method, which takes a dictionary as input with the experimental condition to filter on as key, and the value you’d like to keep as value:

[19]:
# Let's use coords as an example
coords = my_deepof_project.get_coords()
print("The original dataset has {} videos".format(len(coords)))

# Let's keep only those experiments where the subject is stressed:
coords = coords.filter_condition({"CSDS": "Stressed"})
print("The filtered dataset has only {} videos".format(len(coords)))
The original dataset has 6 videos
The filtered dataset has only 3 videos

We can also filter specific experiments with the .filter_videos() method, which takes a list of experiment IDs as input:

[20]:
single_video_coords = coords.filter_videos(['20191204_Day2_SI_JB08_Test_56'])
print("The new filtered dataset has only {} video".format(len(single_video_coords)))
The new filtered dataset has only 1 video
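The logic behind both filters can be sketched with plain dictionaries: one keeps the videos whose conditions match every requested key and value, the other keeps an explicit list of video IDs. The functions below are illustrative stand-ins, not the TableDict methods themselves, and the table contents are placeholders.

```python
def filter_condition(tables, conditions, wanted):
    """Keep the videos whose conditions match every key/value pair in `wanted`."""
    return {
        video: table
        for video, table in tables.items()
        if all(conditions[video].get(key) == value for key, value in wanted.items())
    }


def filter_videos(tables, video_ids):
    """Keep only the listed videos, ignoring IDs that are not present."""
    return {video: tables[video] for video in video_ids if video in tables}


# Conditions as shown in this tutorial; table contents are placeholders
conditions = {
    "20191204_Day2_SI_JB08_Test_54": {"CSDS": "Nonstressed"},
    "20191204_Day2_SI_JB08_Test_56": {"CSDS": "Stressed"},
    "20191204_Day2_SI_JB08_Test_61": {"CSDS": "Stressed"},
    "20191204_Day2_SI_JB08_Test_62": {"CSDS": "Stressed"},
    "20191204_Day2_SI_JB08_Test_63": {"CSDS": "Nonstressed"},
    "20191204_Day2_SI_JB08_Test_64": {"CSDS": "Nonstressed"},
}
tables = {video: f"table for {video}" for video in conditions}

stressed = filter_condition(tables, conditions, {"CSDS": "Stressed"})
single = filter_videos(stressed, ["20191204_Day2_SI_JB08_Test_56"])
print(len(tables), len(stressed), len(single))
```

As in the cells above, the filters chain: the second call operates on the already filtered result.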

Last but not least, we can also keep all videos, but filter certain animals whose coordinates we’d like to keep for further analysis. As seen above, the dataset used in this tutorial contains two animals per video: a C57Bl6 (labelled “B”) and a CD1 (labelled “W”). Let’s see how we can keep only the C57Bl6 with the .filter_id() method.

Let’s first point out that, before filtering, a given experiment in the coords object has 44 features (22 from each animal).

[21]:
coords['20191204_Day2_SI_JB08_Test_56']
[21]:
bodyparts B_Center B_Left_bhip B_Left_ear B_Left_fhip B_Nose ... W_Right_ear W_Right_fhip W_Spine_1 W_Spine_2 W_Tail_base
coords x y x y x y x y x y ... x y x y x y x y x y
00:00:00 431.582129 199.886157 443.317284 191.049059 422.038894 239.280726 444.320813 211.530190 398.565403 241.218546 ... 137.956781 170.942092 128.741102 195.823723 121.748451 184.261002 104.130758 218.744892 96.072895 236.658017
00:00:00.040005334 431.582129 199.886157 443.317284 191.049059 422.038894 239.280726 444.320813 211.530190 398.565403 241.218546 ... 137.956781 170.942092 128.741102 195.823723 121.748451 184.261002 104.130758 218.744892 96.072895 236.658017
00:00:00.080010668 431.582129 199.886157 443.317284 191.049059 422.038894 239.280726 444.320813 211.530190 398.565403 241.218546 ... 137.956781 170.942092 128.741102 195.823723 121.748451 184.261002 104.130758 218.744892 96.072895 236.658017
00:00:00.120016002 431.571022 200.002112 443.335878 190.993253 422.030086 239.327994 444.329816 211.562289 398.595961 241.294169 ... 141.714985 162.880174 131.772851 188.675886 125.782873 175.795353 108.154709 209.392022 99.840060 225.096960
00:00:00.160021336 431.597640 199.745468 443.281956 190.945019 421.874045 239.267099 444.379187 211.410798 398.593955 241.218815 ... 148.078345 155.219662 134.899416 181.886177 129.444463 167.556756 108.951992 202.824619 100.722245 218.990917
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
00:09:59.799973329 334.474567 96.498638 348.224364 97.645028 326.814349 119.492518 337.246908 106.565620 323.700053 135.141625 ... 367.947062 111.191081 383.559105 122.628233 372.673883 129.075473 394.586151 148.768923 406.681880 159.255856
00:09:59.839978663 335.718348 97.174041 350.550550 100.151544 327.655304 119.156520 339.572653 108.236325 323.334586 134.646345 ... 368.062123 112.112038 384.973044 122.831615 373.401832 129.562705 396.308258 149.158061 408.450203 159.776019
00:09:59.879983997 334.901394 96.142181 350.010325 99.189031 326.892150 117.030927 338.705496 107.086259 324.202601 130.295730 ... 368.381594 113.908447 386.581337 124.784114 375.312471 131.947180 398.053729 150.564147 409.254631 160.590819
00:09:59.919989331 334.901394 96.142181 350.010325 99.189031 326.892150 117.030927 338.705496 107.086259 324.202601 130.295730 ... 368.381594 113.908447 386.581337 124.784114 375.312471 131.947180 398.053729 150.564147 409.254631 160.590819
00:09:59.959994665 334.901394 96.142181 350.010325 99.189031 326.892150 117.030927 338.705496 107.086259 324.202601 130.295730 ... 368.381594 113.908447 386.581337 124.784114 375.312471 131.947180 398.053729 150.564147 409.254631 160.590819

14998 rows × 44 columns

After filtering to keep only the C57Bl6 animal (“B”), only 22 features are left (scroll right to see that, as expected, no features related to “W” remain):

[22]:
coords.filter_id("B")['20191204_Day2_SI_JB08_Test_56']
[22]:
bodyparts B_Center B_Left_bhip B_Left_ear B_Left_fhip B_Nose ... B_Right_ear B_Right_fhip B_Spine_1 B_Spine_2 B_Tail_base
coords x y x y x y x y x y ... x y x y x y x y x y
00:00:00 431.582129 199.886157 443.317284 191.049059 422.038894 239.280726 444.320813 211.530190 398.565403 241.218546 ... 413.290642 223.459014 420.766537 207.134779 429.961921 216.150904 430.374313 184.700537 425.421944 169.926050
00:00:00.040005334 431.582129 199.886157 443.317284 191.049059 422.038894 239.280726 444.320813 211.530190 398.565403 241.218546 ... 413.290642 223.459014 420.766537 207.134779 429.961921 216.150904 430.374313 184.700537 425.421944 169.926050
00:00:00.080010668 431.582129 199.886157 443.317284 191.049059 422.038894 239.280726 444.320813 211.530190 398.565403 241.218546 ... 413.290642 223.459014 420.766537 207.134779 429.961921 216.150904 430.374313 184.700537 425.421944 169.926050
00:00:00.120016002 431.571022 200.002112 443.335878 190.993253 422.030086 239.327994 444.329816 211.562289 398.595961 241.294169 ... 413.300820 223.461412 420.767516 207.192591 429.975867 216.158488 430.366680 184.675558 425.426543 169.916887
00:00:00.160021336 431.597640 199.745468 443.281956 190.945019 421.874045 239.267099 444.379187 211.410798 398.593955 241.218815 ... 413.300526 223.434426 420.761057 207.029504 429.993971 216.054461 430.355523 184.650934 425.416170 169.926563
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
00:09:59.799973329 334.474567 96.498638 348.224364 97.645028 326.814349 119.492518 337.246908 106.565620 323.700053 135.141625 ... 312.159399 118.444302 322.335877 95.900822 325.605505 103.833966 345.151219 91.364484 359.482856 88.716496
00:09:59.839978663 335.718348 97.174041 350.550550 100.151544 327.655304 119.156520 339.572653 108.236325 323.334586 134.646345 ... 311.967124 117.222504 322.339963 95.787418 326.444674 103.855875 346.788969 92.072933 359.465680 89.018248
00:09:59.879983997 334.901394 96.142181 350.010325 99.189031 326.892150 117.030927 338.705496 107.086259 324.202601 130.295730 ... 310.410430 116.226902 320.167959 94.313141 325.642692 102.384981 346.166320 91.468633 359.230222 88.757696
00:09:59.919989331 334.901394 96.142181 350.010325 99.189031 326.892150 117.030927 338.705496 107.086259 324.202601 130.295730 ... 310.410430 116.226902 320.167959 94.313141 325.642692 102.384981 346.166320 91.468633 359.230222 88.757696
00:09:59.959994665 334.901394 96.142181 350.010325 99.189031 326.892150 117.030927 338.705496 107.086259 324.202601 130.295730 ... 310.410430 116.226902 320.167959 94.313141 325.642692 102.384981 346.166320 91.468633 359.230222 88.757696

14998 rows × 22 columns
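
To make the effect of this filter concrete, here is a minimal pandas sketch of the same idea on a toy table. It is not DeepOF's implementation: the helper keep_animal, the toy body part names and the values are made up for illustration, and the sketch only assumes that columns are a two-level index whose first level starts with the animal ID followed by an underscore, as in the output above.

```python
import numpy as np
import pandas as pd

# Toy table with two-level columns (body part, coordinate), mimicking the layout above.
# Body part names and values are invented for illustration only.
bodyparts = ["B_Center", "B_Nose", "W_Center", "W_Nose"]
columns = pd.MultiIndex.from_product(
    [bodyparts, ["x", "y"]], names=["bodyparts", "coords"]
)
toy_table = pd.DataFrame(np.arange(24, dtype=float).reshape(3, 8), columns=columns)


def keep_animal(table, animal_id):
    """Hypothetical helper: keep columns whose body part starts with '<animal_id>_'."""
    mask = table.columns.get_level_values(0).str.startswith(animal_id + "_")
    return table.loc[:, mask]


b_only = keep_animal(toy_table, "B")
print(b_only.shape)  # (3, 4): half of the 8 original columns remain
print(sorted(set(b_only.columns.get_level_values(0))))  # ['B_Center', 'B_Nose']
```

Just as the real table drops from 44 to 22 columns, the toy table keeps exactly half of its columns, and none of them refer to “W”.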

Now that we have a basic understanding of how to create and interact with a project, coordinates, and table dictionaries, let’s show some plots!

Basic visual exploration

Let’s first see some basic heatmaps per condition. All plotting functions within DeepOF are hosted in the deepof.visuals module. Among many other things, we can plot average heatmaps per experimental condition! Let’s see if we can visualize any interesting patterns in the available data:

[23]:
sns.set_context("notebook")

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 6))

deepof.visuals.plot_heatmaps(
    my_deepof_project,
    ["B_Nose"],
    center="arena",
    exp_condition="CSDS",
    condition_value="Nonstressed",
    ax=ax1,
    show=False,
    display_arena=True,
    experiment_id="average",
)

deepof.visuals.plot_heatmaps(
    my_deepof_project,
    ["B_Nose"],
    center="arena",
    exp_condition="CSDS",
    condition_value="Stressed",
    ax=ax2,
    show=False,
    display_arena=True,
    experiment_id="average",
)

plt.tight_layout()
plt.show()
../_images/tutorial_notebooks_deepof_preprocessing_tutorial_70_0.png

It seems stressed animals spend more time closer to the walls of the arena, and less time in the center! For details on how deepof.visuals.plot_heatmaps() works, feel free to check the full API reference or the function docstring.

We can also take a more detailed look at our data. As in most DeepOF plotting functions, we can limit the time range included in the plot using the optional arguments bin_index and bin_size. They can be used in two ways: either specify a bin size in seconds, which is used to bin all data, and select one of the resulting bins via bin_index; or directly specify the start time and duration of the segment you want to plot. Below you can see the syntax for plotting the third minute, 00:02:00 to 00:03:00 (that is, the third 60-second bin, which is bin no. 2 because bin numbering starts at 0), for our data:

[24]:
sns.set_context("notebook")

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 6))

deepof.visuals.plot_heatmaps(
    my_deepof_project,
    ["B_Nose"],
    center="arena",
    exp_condition="CSDS",
    condition_value="Nonstressed",
    ax=ax1,
    show=False,
    display_arena=True,
    experiment_id="average",
    bin_index=2,  # plots only the third 60-second bin of the data (bins are numbered from 0)
    bin_size=60
)

deepof.visuals.plot_heatmaps(
    my_deepof_project,
    ["B_Nose"],
    center="arena",
    exp_condition="CSDS",
    condition_value="Nonstressed",
    ax=ax2,
    show=False,
    display_arena=True,
    experiment_id="average",
    bin_index="00:02:00",  # same segment as above, given as start time and duration
    bin_size="00:01:00",
)

plt.tight_layout()
plt.show()
../_images/tutorial_notebooks_deepof_preprocessing_tutorial_73_0.png
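
As a rough illustration of why the two calls above select the same segment, the sketch below maps both ways of specifying a bin to a frame range. The helper bin_to_frame_range is hypothetical and not part of DeepOF, and the frame rate of 25 frames per second is an assumption based on the roughly 0.04 s spacing of the time index in the tables above; DeepOF's internal handling may differ in its details.

```python
import pandas as pd


def bin_to_frame_range(bin_index, bin_size, frame_rate=25):
    """Hypothetical helper: map (bin_index, bin_size) to a half-open frame range.

    Integers are read as "bin number" and "bin length in seconds";
    "HH:MM:SS" strings are read as "start time" and "duration".
    """
    if isinstance(bin_index, str):
        start_s = pd.Timedelta(bin_index).total_seconds()
        length_s = pd.Timedelta(bin_size).total_seconds()
    else:
        length_s = float(bin_size)
        start_s = bin_index * length_s
    start_frame = int(round(start_s * frame_rate))
    end_frame = int(round((start_s + length_s) * frame_rate))
    return start_frame, end_frame


by_index = bin_to_frame_range(2, 60)
by_time = bin_to_frame_range("00:02:00", "00:01:00")
print(by_index, by_time)  # (3000, 4500) (3000, 4500)
```

Under these assumptions, both specifications cover the frames from the start of the third minute up to (but not including) the start of the fourth.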

If you want very specific samples to be plotted, you can also use the precomputed_bins option and pass a boolean array. Here we produce the same plot as above, but by directly stating which samples should be included. Any remaining samples not covered by the list are filled with “False”.

[25]:
plt.figure(figsize=(6, 6))


deepof.visuals.plot_heatmaps(
    my_deepof_project,
    ["B_Nose"],
    center="arena",
    exp_condition="CSDS",
    condition_value="Nonstressed",
    show=False,
    display_arena=True,
    experiment_id="average",
    precomputed_bins=[False]*2998+[True]*1498,
)

plt.show()
<Figure size 600x600 with 0 Axes>
../_images/tutorial_notebooks_deepof_preprocessing_tutorial_75_1.png
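
The padding behaviour described above can be pictured with a short NumPy sketch. The helper pad_mask is hypothetical and only mimics the described filling with “False”; it is not DeepOF code. The total of 14998 frames is taken from the row count of the coordinate tables shown earlier.

```python
import numpy as np


def pad_mask(mask, n_frames):
    """Hypothetical helper: extend a boolean list with False up to n_frames."""
    padded = np.zeros(n_frames, dtype=bool)
    mask = np.asarray(mask, dtype=bool)[:n_frames]
    padded[: len(mask)] = mask
    return padded


n_frames = 14998  # number of rows in the coordinate tables shown earlier
full_mask = pad_mask([False] * 2998 + [True] * 1498, n_frames)

first_kept, last_kept = np.flatnonzero(full_mask)[[0, -1]]
print(full_mask.sum(), first_kept, last_kept)  # 1498 2998 4495
```

With a time step of about 0.04 s, frames 2998 to 4495 correspond approximately to the third minute of the recording, which is why this plot matches the binned version.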

Furthermore, you can use regions of interest (ROIs) for spatial exploration, if you initialized them during project definition. Covering this here would be a bit much, so please have a look at the ROI tutorial for more details. For now, here is just a plot showing what ROIs in your data could look like:

[26]:
# Load a previously saved project
my_deepof_project_with_rois = deepof.data.load_project("./tutorial_files/sample_project/")
# And load the experiment conditions
#my_deepof_project_with_rois.load_exp_conditions("./tutorial_files/tutorial_exp_conditions.csv")

# Plot heatmaps restricted to two different ROIs

sns.set_context("notebook")

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 6))

# B_Center heatmap with roi_number=1
deepof.visuals.plot_heatmaps(
    my_deepof_project_with_rois,
    ["B_Center"],
    center="arena",
    exp_condition="CSDS",
    condition_value="Nonstressed",
    ax=ax1,
    show=False,
    display_arena=True,
    experiment_id="average",
    bin_index=2,
    bin_size=60,
    roi_number=1,
    animals_in_roi="B",
)

# B_Center heatmap with roi_number=2
deepof.visuals.plot_heatmaps(
    my_deepof_project_with_rois,
    ["B_Center"],
    center="arena",
    exp_condition="CSDS",
    condition_value="Nonstressed",
    ax=ax2,
    show=False,
    display_arena=True,
    experiment_id="average",
    bin_index=2,
    bin_size=60,
    roi_number=2,
    animals_in_roi="B",
)

plt.tight_layout()
plt.show()
../_images/tutorial_notebooks_deepof_preprocessing_tutorial_77_0.png

Finally, let’s create an animated video showing our newly preprocessed data. DeepOF can produce reconstructions of the tracks and show them as videos. All animals and the arena are displayed by default. This is particularly useful when interpreting clusters and visualizing embeddings in the unsupervised pipeline, as we’ll see in a later tutorial.

[28]:
from IPython import display

video = deepof.visuals.animate_skeleton(
    my_deepof_project,
    experiment_id="20191204_Day2_SI_JB08_Test_54",
    bin_index=0,
    bin_size=20,
    sampling_rate=15,
    dpi=60,
)

html = display.HTML(video)
display.display(html)
plt.close()

What’s next

That’s it for this tutorial. Next, we’ll see how to run a supervised annotation pipeline with pretrained models!