Input/Output

Functions:

`generate_config`(project_dir, **kwargs)	Generate a config.yml file with project settings.
`check_config_validity`(config)	Check if the config is valid.
`load_config`(project_dir[, check_if_valid, ...])	Load a project config file.
`update_config`(project_dir, **kwargs)	Update the config file stored at project_dir/config.yml.
`setup_project`(project_dir[, ...])	Set up a project directory with the following structure.
`save_pca`(pca, project_dir[, pca_path])	Save a PCA model to disk.
`load_pca`(project_dir[, pca_path])	Load a PCA model from disk.
`load_checkpoint`([project_dir, model_name, ...])	Load data and model snapshot from a saved checkpoint.
`reindex_syllables_in_checkpoint`([...])	Reindex syllable labels by their frequency in the most recent model snapshot in a checkpoint file.
`extract_results`(model, metadata[, ...])	Extract model outputs and [optionally] save them to disk.
`load_results`([project_dir, model_name, path])	Load the results from a modeled dataset.
`save_results_as_csv`(results[, project_dir, ...])	Save modeling results to csv format.
`save_keypoints`(save_dir, coordinates[, ...])	Convenience function for saving keypoint detections to csv files.
`load_keypoints`(filepath_pattern, format[, ...])	Load keypoint tracking results from one or more files.
`save_hdf5`(filepath, save_dict[, datapath, ...])	Save a dict of pytrees to an hdf5 file.
`load_hdf5`(filepath[, datapath])	Load a dict of pytrees from an hdf5 file.

keypoint_moseq.io.generate_config(project_dir, **kwargs)

Generate a config.yml file with project settings. Default settings will be used unless overridden by a keyword argument.

Parameters:

project_dir (str) – A file config.yml will be generated in this directory.
kwargs – Custom project settings.

keypoint_moseq.io.check_config_validity(config)

Check if the config is valid.

To be valid, the config must satisfy the following criteria:

All the elements of config[“use_bodyparts”] are also in config[“bodyparts”]
All the elements of config[“anterior_bodyparts”] are also in config[“use_bodyparts”]
All the elements of config[“posterior_bodyparts”] are also in config[“use_bodyparts”]
For each pair in config[“skeleton”], both elements are also in config[“bodyparts”]

Parameters: config (dict)
Returns: validity
Return type: bool
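
The criteria above amount to a few subset checks. The following is a minimal sketch, not the library's implementation: the function name `config_is_valid` is hypothetical, and it assumes that `posterior_bodyparts` is checked in the same way as `anterior_bodyparts`.

```python
def config_is_valid(config):
    """Return True if the body-part lists in ``config`` are mutually consistent."""
    bodyparts = set(config["bodyparts"])
    use_bodyparts = set(config["use_bodyparts"])

    # Every modeled body part must be a known body part.
    if not use_bodyparts <= bodyparts:
        return False
    # Anterior/posterior body parts must be among the modeled ones.
    for key in ("anterior_bodyparts", "posterior_bodyparts"):
        if not set(config[key]) <= use_bodyparts:
            return False
    # Both ends of every skeleton edge must be known body parts.
    return all(a in bodyparts and b in bodyparts for a, b in config["skeleton"])


config = {
    "bodyparts": ["nose", "neck", "tail", "paw"],
    "use_bodyparts": ["nose", "neck", "tail"],
    "anterior_bodyparts": ["nose"],
    "posterior_bodyparts": ["tail"],
    "skeleton": [["nose", "neck"], ["neck", "tail"]],
}
print(config_is_valid(config))  # True
```

Here "paw" is tracked but not modeled, so listing it under `anterior_bodyparts` would make the config invalid.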

keypoint_moseq.io.load_config(project_dir, check_if_valid=True, build_indexes=True)

Load a project config file.

Parameters:

project_dir (str) – Directory containing the config file
check_if_valid (bool, default=True) – Check if the config is valid using keypoint_moseq.io.check_config_validity()
build_indexes (bool, default=True) – Add keys “anterior_idxs” and “posterior_idxs” to the config. Each maps to a jax array indexing the elements of config[“anterior_bodyparts”] and config[“posterior_bodyparts”] by their order in config[“use_bodyparts”]

Returns:

config

Return type:

dict
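
To illustrate what `build_indexes` adds, here is a rough sketch of the index computation. It is an assumption-laden stand-in: the helper name `add_bodypart_indexes` is hypothetical, and plain Python lists are used where the library stores jax arrays.

```python
def add_bodypart_indexes(config):
    """Return a copy of ``config`` with anterior/posterior index lists added."""
    # Position of each modeled body part within use_bodyparts.
    order = {name: i for i, name in enumerate(config["use_bodyparts"])}
    out = dict(config)
    out["anterior_idxs"] = [order[bp] for bp in config["anterior_bodyparts"]]
    out["posterior_idxs"] = [order[bp] for bp in config["posterior_bodyparts"]]
    return out


example = add_bodypart_indexes({
    "use_bodyparts": ["tail", "spine", "neck", "nose"],
    "anterior_bodyparts": ["nose", "neck"],
    "posterior_bodyparts": ["tail"],
})
print(example["anterior_idxs"], example["posterior_idxs"])  # [3, 2] [0]
```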

keypoint_moseq.io.update_config(project_dir, **kwargs)

Update the config file stored at project_dir/config.yml.

Use keyword arguments to update key/value pairs in the config. To update model hyperparameters, just use the name of the hyperparameter as the keyword argument.

Examples

To update video_dir to /path/to/videos:

>>> update_config(project_dir, video_dir='/path/to/videos')
>>> print(load_config(project_dir)['video_dir'])
/path/to/videos

To update trans_hypparams[‘kappa’] to 100:

>>> update_config(project_dir, kappa=100)
>>> print(load_config(project_dir)['trans_hypparams']['kappa'])
100
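
The second example works because a flat keyword such as kappa is routed into the hyperparameter group that contains it. A minimal sketch of that routing on an in-memory dict follows; `update_nested` is a hypothetical helper, not a library function, and the real function also reads and rewrites config.yml.

```python
def update_nested(config, **kwargs):
    """Apply flat keyword updates to a config dict with nested groups."""
    remaining = dict(kwargs)
    for group in config.values():
        if isinstance(group, dict):
            # Route each keyword into the group that already defines it.
            for key in list(remaining):
                if key in group:
                    group[key] = remaining.pop(key)
    # Keywords not found in any group are treated as top-level settings.
    config.update(remaining)
    return config


cfg = {"video_dir": "", "trans_hypparams": {"kappa": 1000000.0, "gamma": 1000.0}}
update_nested(cfg, kappa=100, video_dir="/path/to/videos")
print(cfg["trans_hypparams"]["kappa"])  # 100
```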

keypoint_moseq.io.setup_project(project_dir, deeplabcut_config=None, sleap_file=None, nwb_file=None, freipose_config=None, dannce_config=None, overwrite=False, **options)

Set up a project directory with the following structure:

project_dir
└── config.yml

Parameters:

project_dir (str) – Path to the project directory (relative or absolute).
deeplabcut_config (str, default=None) – Path to a deeplabcut config file. Will be used to initialize bodyparts, skeleton, use_bodyparts and video_dir in the keypoint MoSeq config (overridden by kwargs).
sleap_file (str, default=None) – Path to a .hdf5 or .slp file containing predictions for one video. Will be used to initialize bodyparts, skeleton, and use_bodyparts in the keypoint MoSeq config (overridden by kwargs).
nwb_file (str, default=None) – Path to a .nwb file containing predictions for one video. Will be used to initialize bodyparts, skeleton, and use_bodyparts in the keypoint MoSeq config (overridden by kwargs).
freipose_config (str, default=None) – Path to a freipose skeleton config file. Will be used to initialize bodyparts, skeleton, and use_bodyparts in the keypoint MoSeq config (overridden by kwargs).
dannce_config (str, default=None) – Path to a dannce config file. Will be used to initialize bodyparts, skeleton, and use_bodyparts in the keypoint MoSeq config (overridden by kwargs).
overwrite (bool, default=False) – Overwrite any config.yml that already exists at the path {project_dir}/config.yml.
options – Used to initialize config file. Overrides default settings.

keypoint_moseq.io.save_pca(pca, project_dir, pca_path=None)

Save a PCA model to disk.

The model is saved to pca_path or else to {project_dir}/pca.p.

Parameters:

pca (sklearn.decomposition.PCA)
project_dir (str)
pca_path (str, default=None)

keypoint_moseq.io.load_pca(project_dir, pca_path=None)

Load a PCA model from disk.

The model is loaded from pca_path or else from {project_dir}/pca.p.

Parameters:

project_dir (str)
pca_path (str, default=None)

Returns:

pca

Return type:

sklearn.decomposition.PCA
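
save_pca and load_pca share the same default-path rule: use pca_path if given, otherwise {project_dir}/pca.p. The sketch below shows that rule with a round trip through a temporary directory. The helper names are hypothetical, a plain dict stands in for a fitted PCA model, and the use of `pickle` is an assumption; the library may serialize the model differently.

```python
import os
import pickle
import tempfile


def resolve_pca_path(project_dir, pca_path=None):
    """Use ``pca_path`` if given, otherwise fall back to {project_dir}/pca.p."""
    return pca_path if pca_path is not None else os.path.join(project_dir, "pca.p")


def save_object(obj, project_dir, pca_path=None):
    with open(resolve_pca_path(project_dir, pca_path), "wb") as f:
        pickle.dump(obj, f)  # assumed serializer, for illustration only


def load_object(project_dir, pca_path=None):
    with open(resolve_pca_path(project_dir, pca_path), "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as project_dir:
    stand_in = {"n_components": 10}  # placeholder for a fitted PCA model
    save_object(stand_in, project_dir)
    restored = load_object(project_dir)
    default_path_used = os.path.exists(os.path.join(project_dir, "pca.p"))
```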

keypoint_moseq.io.load_checkpoint(project_dir=None, model_name=None, path=None, iteration=None)

Load data and model snapshot from a saved checkpoint.

The checkpoint path can be specified directly via path or else it is assumed to be {project_dir}/{model_name}/checkpoint.h5.

Parameters:

project_dir (str, default=None) – Project directory; used in conjunction with model_name to determine the checkpoint path if path is not specified.
model_name (str, default=None) – Model name; used in conjunction with project_dir to determine the checkpoint path if path is not specified.
path (str, default=None) – Checkpoint path; if not specified, the checkpoint path is set to {project_dir}/{model_name}/checkpoint.h5.
iteration (int, default=None) – Determines which model snapshot to load. If None, the last snapshot is loaded.

Returns:

model (dict) – Model dictionary containing states, parameters, hyperparameters, noise prior, and random seed.
data (dict) – Data dictionary containing observations, confidences, mask and associated metadata (see keypoint_moseq.util.format_data()).
metadata (tuple (keys, bounds)) – Recording names and start/end frames for the data (see keypoint_moseq.util.format_data()).
iteration (int) – Iteration of model fitting corresponding to the loaded snapshot.
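
The path and iteration defaults described above can be sketched as two small helpers. Both names are hypothetical, and the error type is an assumption; the snippet only illustrates the documented fallback rules and does not read a checkpoint file.

```python
import os


def resolve_checkpoint_path(project_dir=None, model_name=None, path=None):
    """Prefer an explicit ``path``; otherwise build the default location."""
    if path is not None:
        return path
    if project_dir is None or model_name is None:
        raise ValueError("Provide `path`, or both `project_dir` and `model_name`.")
    return os.path.join(project_dir, model_name, "checkpoint.h5")


def select_iteration(saved_iterations, iteration=None):
    """Pick the requested snapshot, or the last one if ``iteration`` is None."""
    saved = sorted(saved_iterations)
    if iteration is None:
        return saved[-1]
    if iteration not in saved:
        raise ValueError(f"No snapshot for iteration {iteration}; available: {saved}")
    return iteration


print(select_iteration([0, 25, 50]))      # 50
print(select_iteration([0, 25, 50], 25))  # 25
```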

keypoint_moseq.io.reindex_syllables_in_checkpoint(project_dir=None, model_name=None, path=None, index=None, runlength=True)

Reindex syllable labels by their frequency in the most recent model snapshot in a checkpoint file.

This is an in-place operation: the checkpoint is loaded from disk, modified and saved to disk again. The label permutation is applied to all model snapshots in the checkpoint.

The checkpoint path can be specified directly via path or else it is assumed to be {project_dir}/{model_name}/checkpoint.h5.

Parameters:

project_dir (str, default=None)
model_name (str, default=None)
path (str, default=None)
index (array of shape (num_states,), default=None) – Permutation for syllable labels, where index[i] is relabeled as i. If None, syllables are relabeled by frequency, with the most frequent syllable relabeled as 0, and so on.
runlength (bool, default=True) – If True, frequencies are quantified using the number of non-consecutive occurrences of each syllable. If False, frequency is quantified by total number of frames.

Returns:

index – The index used for permuting syllable labels. If index[i] = j, then the syllable formerly labeled j is now labeled i.

Return type:

array of shape (num_states,)
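
As an illustration of the relabeling rule and the runlength option, the sketch below works on a single label sequence held in a plain list. The function names are hypothetical, and breaking ties by original label order is an assumption; the library operates on the arrays stored in the checkpoint.

```python
from collections import Counter
from itertools import groupby


def syllable_frequencies(z, num_states, runlength=True):
    """Count syllable occurrences, by runs (runlength=True) or by frames."""
    if runlength:
        # Collapse consecutive repeats so that each run counts once.
        labels = [label for label, _ in groupby(z)]
    else:
        labels = list(z)
    counts = Counter(labels)
    return [counts.get(s, 0) for s in range(num_states)]


def frequency_index(z, num_states, runlength=True):
    """index[i] = j means the syllable formerly labeled j becomes i."""
    freqs = syllable_frequencies(z, num_states, runlength)
    # Stable sort, most frequent first; ties keep their original order.
    return sorted(range(num_states), key=lambda s: -freqs[s])


def relabel(z, index):
    new_label = {old: new for new, old in enumerate(index)}
    return [new_label[s] for s in z]


z = [2, 2, 2, 2, 2, 0, 1, 0, 1, 0]
print(frequency_index(z, 3, runlength=True))   # [0, 1, 2]
print(frequency_index(z, 3, runlength=False))  # [2, 0, 1]
```

Syllable 2 occupies the most frames but occurs in a single run, so the two settings rank it differently.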

keypoint_moseq.io.extract_results(model, metadata, project_dir=None, model_name=None, save_results=True, path=None, overwrite=False)

Extract model outputs and [optionally] save them to disk.

Model outputs are saved to disk as a .h5 file, either at path if it is specified, or at {project_dir}/{model_name}/results.h5 if it is not. If a .h5 file with the given path already exists, the outputs will be added to it. The results have the following structure:

results.h5
├──recording_name1
│  ├──syllable      # model state sequence (z), shape=(num_timepoints,)
│  ├──latent_state  # model latent state (x), shape=(num_timepoints,latent_dim)
│  ├──centroid      # model centroid (v), shape=(num_timepoints,keypoint_dim)
│  └──heading       # model heading (h), shape=(num_timepoints,)
⋮

Parameters:

model (dict) – Model dictionary containing states, parameters, hyperparameters, noise prior, and random seed.
metadata (tuple (keys, bounds)) – Recordings and start/end frames for the data (see keypoint_moseq.util.format_data()).
save_results (bool, default=True) – If True, the model outputs will be saved to disk.
project_dir (str, default=None) – Path to the project directory. Required if save_results=True and path=None.
model_name (str, default=None) – Name of the model. Required if save_results=True and path=None.
path (str, default=None) – Optional path for saving model outputs.
overwrite (bool, default=False) – If True, overwrite existing results for recordings that are already in the results file. This is required when applying a model to data that includes recordings from the original training set.

Returns:

results_dict – Dictionary of model outputs with the same structure as the results .h5 file.

Return type:

dict
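
The interaction between an existing results file and the overwrite flag can be pictured with an in-memory stand-in. This is only a sketch: `merge_results` is a hypothetical helper, plain lists replace the arrays, and the exception type is an assumption rather than what the library raises.

```python
def merge_results(existing, new, overwrite=False):
    """Add per-recording outputs to an existing results dict."""
    merged = dict(existing)
    for name, outputs in new.items():
        if name in merged and not overwrite:
            raise ValueError(
                f"Results for {name!r} already exist; pass overwrite=True to replace them."
            )
        merged[name] = outputs
    return merged


existing = {"recording_name1": {"syllable": [0, 0, 1], "heading": [0.1, 0.2, 0.3]}}
new = {"recording_name2": {"syllable": [1, 1, 2], "heading": [0.0, 0.1, 0.1]}}
combined = merge_results(existing, new)
print(sorted(combined))  # ['recording_name1', 'recording_name2']
```

Re-applying a model to a recording that is already present would need overwrite=True in this sketch, mirroring the description of the overwrite parameter above.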

keypoint_moseq.io.load_results(project_dir=None, model_name=None, path=None)

Load the results from a modeled dataset.

The results path can be specified directly via path. Otherwise it is assumed to be {project_dir}/{model_name}/results.h5.

Parameters:

project_dir (str, default=None)
model_name (str, default=None)
path (str, default=None)

Returns:

results – See keypoint_moseq.fitting.apply_model()

Return type:

dict

keypoint_moseq.io.save_results_as_csv(results, project_dir=None, model_name=None, save_dir=None, path_sep='-')

Save modeling results to csv format.

This function creates a directory and then saves a separate csv file for each recording. The directory is created at save_dir if provided, otherwise at {project_dir}/{model_name}/results.

Parameters:

results (dict) – See keypoint_moseq.io.extract_results().
project_dir (str, default=None) – Project directory; required if save_dir is not provided.
model_name (str, default=None) – Name of the model; required if save_dir is not provided.
save_dir (str, default=None) – Optional path to the directory where the csv files will be saved.
path_sep (str, default='-') – If a path separator (“/” or “\”) is present in the recording name, it will be replaced with path_sep when saving the csv file.
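
A small sketch of how a recording name could be turned into a csv path under these rules. The helper `csv_path_for` is hypothetical, and both the ".csv" suffix and the handling of backslashes are assumptions made for illustration.

```python
import os


def csv_path_for(recording_name, save_dir, path_sep="-"):
    """Replace path separators in the recording name and build the csv path."""
    safe_name = recording_name.replace("/", path_sep).replace("\\", path_sep)
    return os.path.join(save_dir, safe_name + ".csv")


print(os.path.basename(csv_path_for("session1/mouseA", "results")))  # session1-mouseA.csv
```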

keypoint_moseq.io.save_keypoints(save_dir, coordinates, confidences=None, bodyparts=None, path_sep='-')

Convenience function for saving keypoint detections to csv files.

One csv file is saved for each recording in coordinates. Each row in the csv corresponds to one frame and the columns are named

“BODYPART1_x”, “BODYPART1_y”, “BODYPART1_conf”, “BODYPART2_x”, …

Columns with confidence scores are omitted if confidences is not provided. Besides confidences, there can be 2 or 3 columns for each bodypart, depending on whether the keypoints are 2D or 3D.

Parameters:

save_dir (str) – Directory to save the results. A separate csv file will be saved for each recording in coordinates.
coordinates (dict) – Dictionary mapping recording names to numpy arrays of shape (n_frames, n_keypoints, 2[or 3]) that contain the x and y (and z) coordinates of the keypoints. If any keys contain a path separator (such as “/”), it will be replaced with path_sep when naming the csv file.
confidences (dict, default=None) – Dictionary mapping recording names to numpy arrays of shape (n_frames, n_keypoints) with the confidence scores of the keypoints. Must have the same keys as coordinates.
bodyparts (list, default=None) – List of bodypart names, in the same order as the keypoints in the coordinates and confidences arrays. If None, the bodypart names will be set to [“bodypart1”, “bodypart2”, …].
path_sep (str, default='-') – If a path separator (“/” or “\”) is present in the recording name, it will be replaced with path_sep when saving the csv file.
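
The column layout described above can be sketched with the standard library. The helper names are hypothetical, nested lists stand in for numpy arrays, and the csv is written to an in-memory buffer rather than to save_dir.

```python
import csv
import io


def keypoint_columns(bodyparts, n_dims=2, with_confidence=True):
    """Build column names such as nose_x, nose_y, nose_conf, tail_x, ..."""
    suffixes = ["x", "y", "z"][:n_dims]
    if with_confidence:
        suffixes = suffixes + ["conf"]
    return [f"{bp}_{suffix}" for bp in bodyparts for suffix in suffixes]


def frames_to_rows(coords, confs=None):
    """Flatten (n_frames, n_keypoints, n_dims) nested lists into csv rows."""
    rows = []
    for t, frame in enumerate(coords):
        row = []
        for k, point in enumerate(frame):
            row.extend(point)
            if confs is not None:
                row.append(confs[t][k])
        rows.append(row)
    return rows


coords = [[[1.0, 2.0], [3.0, 4.0]], [[1.5, 2.5], [3.5, 4.5]]]  # 2 frames, 2 keypoints
confs = [[0.9, 0.8], [0.7, 0.6]]
header = keypoint_columns(["nose", "tail"], n_dims=2, with_confidence=True)

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(frames_to_rows(coords, confs))
```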

keypoint_moseq.io.load_keypoints(filepath_pattern, format, extension=None, recursive=True, path_sep='-', path_in_name=False, remove_extension=True, exclude_individuals=['single'])

Load keypoint tracking results from one or more files. Several file formats are supported:

deeplabcut
.csv and .h5/.hdf5 files generated by deeplabcut. For single-animal tracking, each file yields a single key/value pair in the returned coordinates and confidences dictionaries. For multi-animal tracking, a key/value pair will be generated for each tracked individual. For example, the file two_mice.h5 with individuals “mouseA” and “mouseB” will yield the pair of keys ‘two_mice_mouseA’, ‘two_mice_mouseB’.
sleap
.slp and .h5/.hdf5 files generated by sleap. For single-animal tracking, each file yields a single key/value pair in the returned coordinates and confidences dictionaries. For multi-animal tracking, a key/value pair will be generated for each track. For example, a single file called two_mice.h5 will yield the pair of keys ‘two_mice_track0’, ‘two_mice_track1’.
anipose
.csv files generated by anipose. Each file should contain five columns per keypoint (x,y,z,error,score), plus a last column with the frame number. The score column is used as the keypoint confidence.
sleap-anipose
.h5/.hdf5 files generated by sleap-anipose. Each file should contain a dataset called ‘tracks’ with shape (n_frames, 1, n_keypoints, 3). If there is also a ‘point_scores’ dataset, it will be used as the keypoint confidence. Otherwise, the confidence will be set to 1.
nwb
.nwb files (Neurodata Without Borders). Each file should contain exactly one PoseEstimation object (for multi-animal tracking, each animal should be stored in its own .nwb file). The PoseEstimation object should contain one PoseEstimationSeries object for each bodypart. Confidence values are optional and will be set to 1 if not present.
facemap
.h5 files saved by Facemap. See the Facemap documentation for details: https://facemap.readthedocs.io/en/latest/outputs.html#keypoints-processing
The files should have the format:

[filename].h5
└──Facemap
   ├──keypoint1
   │  ├──x
   │  ├──y
   │  └──likelihood
   ⋮
freipose
.json files saved by FreiPose. Each file should contain a list of dicts that each include a “kp_xyz” key with the 3D coordinates for one frame. Keypoint scores (saved under “kp_score”) are not loaded because they are not bounded between 0 and 1, which is required for modeling. Since FreiPose does not save the bodypart names, the bodyparts return value is set to None.
dannce
.mat files saved by Dannce.

Parameters:

filepath_pattern (str or list of str) –
Filepath pattern for a set of keypoint tracking files (e.g. deeplabcut csv or hdf5 files), or a list of such patterns. Filepath patterns can be:
- single file (e.g. /path/to/file.csv)
- single directory (e.g. /path/to/dir/)
- set of files (e.g. /path/to/fileprefix*)
- set of directories (e.g. /path/to/dirprefix*)
format (str) – Format of the files to load. Must be one of deeplabcut, sleap, anipose, sleap-anipose, nwb, facemap, freipose, or dannce.
extension (str, default=None) –
File extension to use when searching for files. If None, then the extension will be inferred from the format argument:
- sleap: ‘h5’ or ‘slp’
- deeplabcut: ‘csv’ or ‘h5’
- anipose: ‘csv’
- sleap-anipose: ‘h5’
- nwb: ‘nwb’
- facemap: ‘h5’
- freipose: ‘json’
- dannce: ‘mat’
recursive (bool, default=True) – Whether to search recursively for files that match filepath_pattern and extension.
path_in_name (bool, default=False) – Whether to name the tracking results from each file by the path to the file (True) or just the filename (False). If True, the path_sep argument is used to separate the path components.
path_sep (str, default='-') – Separator to use when path_in_name is True. For example, if path_sep is ‘-’, then the tracking results from the file /path/to/file.csv will be named path-to-file. Using ‘/’ as the separator is discouraged, as it will cause problems saving/loading the modeling results to/from hdf5 files.
remove_extension (bool, default=True) – Whether to remove the file extension when naming the tracking results from each file.
exclude_individuals (list of str, default=["single"]) – List of individuals to exclude from the results. This is only used for multi-animal tracking with deeplabcut.

Returns:

coordinates (dict) – Dictionary mapping filenames to keypoint coordinates as ndarrays of shape (n_frames, n_bodyparts, 2[or 3])
confidences (dict) – Dictionary mapping filenames to likelihood scores as ndarrays of shape (n_frames, n_bodyparts)
bodyparts (list of str) – List of bodypart names. The order of the names matches the order of the bodyparts in coordinates and confidences.
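
The naming options (path_in_name, path_sep, remove_extension) determine the keys of the returned dictionaries. The sketch below shows one way those rules could combine. The function `recording_name` is hypothetical, and it ignores the per-individual and per-track suffixes added for multi-animal files.

```python
import os


def recording_name(filepath, path_in_name=False, path_sep="-", remove_extension=True):
    """Derive a dictionary key from a file path."""
    name = filepath if path_in_name else os.path.basename(filepath)
    if remove_extension:
        name = os.path.splitext(name)[0]
    # Join the non-empty path components with path_sep.
    parts = [part for part in name.replace("\\", "/").split("/") if part]
    return path_sep.join(parts)


print(recording_name("/path/to/file.csv"))                     # file
print(recording_name("/path/to/file.csv", path_in_name=True))  # path-to-file
```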

keypoint_moseq.io.save_hdf5(filepath, save_dict, datapath=None, exist_ok=False, overwrite=False)

Save a dict of pytrees to an hdf5 file. The leaves of the pytrees must be numpy arrays, scalars, or strings.

Parameters:

filepath (str) – Path of the hdf5 file to create.
save_dict (dict) – Dictionary where the values are pytrees, i.e. recursive collections of tuples, lists, dicts, and numpy arrays.
datapath (str, default=None) – Path within the hdf5 file to save the data. If None, the data is saved at the root of the hdf5 file.
exist_ok (bool, default=False) – If False, will raise an AssertionError when trying to modify an existing file.
overwrite (bool, default=False) – If False, will raise an AssertionError when trying to overwrite an existing dataset or group.
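
To make the notion of a pytree concrete, the sketch below walks a nested collection and lists each leaf together with a slash-separated path, similar in spirit to group and dataset paths in an hdf5 file. The function `flatten_pytree` is hypothetical, and the layout the library actually writes to disk is not described here and may differ.

```python
def flatten_pytree(tree, prefix=""):
    """Yield (path, leaf) pairs for a nested structure of dicts, lists, and tuples."""
    if isinstance(tree, dict):
        items = tree.items()
    elif isinstance(tree, (list, tuple)):
        items = enumerate(tree)
    else:
        # Anything else (array, scalar, string) is treated as a leaf.
        yield prefix, tree
        return
    for key, value in items:
        child = f"{prefix}/{key}" if prefix else str(key)
        yield from flatten_pytree(value, child)


save_dict = {"model": {"params": (1.0, 2.0), "seed": 0}, "name": "run1"}
leaves = list(flatten_pytree(save_dict))
print(leaves[0])  # ('model/params/0', 1.0)
```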

keypoint_moseq.io.load_hdf5(filepath, datapath=None)

Load a dict of pytrees from an hdf5 file.

Parameters:

filepath (str) – Path of the hdf5 file to load.
datapath (str, default=None) – Path within the hdf5 file to load the data from. If None, the data is loaded from the root of the hdf5 file.

Returns:

save_dict – Dictionary where the values are pytrees, i.e. recursive collections of tuples, lists, dicts, and numpy arrays.

Return type:

dict