Input/Output
Functions:
generate_config – Generate a config.yml file with project settings.
check_config_validity – Check if the config is valid.
load_config – Load a project config file.
update_config – Update the config file stored at project_dir/config.yml.
setup_project – Set up a project directory with the following structure.
save_pca – Save a PCA model to disk.
load_pca – Load a PCA model from disk.
load_checkpoint – Load data and model snapshot from a saved checkpoint.
reindex_syllables_in_checkpoint – Reindex syllable labels by their frequency in the most recent model snapshot in a checkpoint file.
extract_results – Extract model outputs and [optionally] save them to disk.
load_results – Load the results from a modeled dataset.
save_results_as_csv – Save modeling results to csv format.
load_keypoints – Load keypoint tracking results from one or more files.
save_hdf5 – Save a dict of pytrees to an hdf5 file.
load_hdf5 – Load a dict of pytrees from an hdf5 file.
- keypoint_moseq.io.generate_config(project_dir, **kwargs)[source]
Generate a config.yml file with project settings. Default settings will be used unless overridden by a keyword argument.
- Parameters
project_dir (str) – A file config.yml will be generated in this directory.
kwargs – Custom project settings.
- keypoint_moseq.io.check_config_validity(config)[source]
Check if the config is valid.
- To be valid, the config must satisfy the following criteria:
All the elements of config["use_bodyparts"] are also in config["bodyparts"]
All the elements of config["anterior_bodyparts"] are also in config["use_bodyparts"]
All the elements of config["posterior_bodyparts"] are also in config["use_bodyparts"]
For each pair in config["skeleton"], both elements are also in config["bodyparts"]
- Parameters
config (dict) –
- Returns
validity
- Return type
bool
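The criteria above are all subset checks, so they can be sketched with plain Python sets. This is a minimal illustration, not the library's implementation: `is_valid_config` and the example body part names are hypothetical, and the check on `posterior_bodyparts` is an assumption based on the `posterior_idxs` key that load_config builds.

```python
def is_valid_config(config):
    """Hypothetical stand-in for the validity checks described above."""
    bodyparts = set(config["bodyparts"])
    use_bodyparts = set(config["use_bodyparts"])

    checks = [
        # every used body part must be a known body part
        use_bodyparts <= bodyparts,
        # anterior/posterior body parts must be among the used body parts
        set(config["anterior_bodyparts"]) <= use_bodyparts,
        set(config["posterior_bodyparts"]) <= use_bodyparts,  # assumed check
        # both ends of each skeleton edge must be known body parts
        all(a in bodyparts and b in bodyparts for a, b in config["skeleton"]),
    ]
    return all(checks)


# Made-up example config containing only the keys the checks need.
example = {
    "bodyparts": ["nose", "spine", "tail", "left_ear"],
    "use_bodyparts": ["nose", "spine", "tail"],
    "anterior_bodyparts": ["nose"],
    "posterior_bodyparts": ["tail"],
    "skeleton": [["nose", "spine"], ["spine", "tail"]],
}
print(is_valid_config(example))
```

A config that lists an anterior body part outside `use_bodyparts` (for example "left_ear" here) would fail the second check.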
- keypoint_moseq.io.load_config(project_dir, check_if_valid=True, build_indexes=True)[source]
Load a project config file.
- Parameters
project_dir (str) – Directory containing the config file
check_if_valid (bool, default=True) – Check if the config is valid using keypoint_moseq.io.check_config_validity().
build_indexes (bool, default=True) – Add keys "anterior_idxs" and "posterior_idxs" to the config. Each maps to a jax array indexing the elements of config["anterior_bodyparts"] and config["posterior_bodyparts"] by their order in config["use_bodyparts"].
- Returns
config
- Return type
dict
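To make the build_indexes behaviour concrete, the sketch below computes the two index lists with plain Python. It is only an illustration: the helper name `add_indexes` and the body part names are hypothetical, and ordinary lists stand in for the jax arrays that the real function returns.

```python
def add_indexes(config):
    """Return a copy of `config` with anterior/posterior index lists added."""
    use = config["use_bodyparts"]
    out = dict(config)
    # Position of each anterior/posterior body part within use_bodyparts.
    out["anterior_idxs"] = [use.index(bp) for bp in config["anterior_bodyparts"]]
    out["posterior_idxs"] = [use.index(bp) for bp in config["posterior_bodyparts"]]
    return out


cfg = add_indexes({
    "use_bodyparts": ["nose", "left_ear", "spine", "tail_base"],
    "anterior_bodyparts": ["nose", "left_ear"],
    "posterior_bodyparts": ["tail_base"],
})
print(cfg["anterior_idxs"], cfg["posterior_idxs"])  # [0, 1] [3]
```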
- keypoint_moseq.io.update_config(project_dir, **kwargs)[source]
Update the config file stored at project_dir/config.yml.
Use keyword arguments to update key/value pairs in the config. To update model hyperparameters, just use the name of the hyperparameter as the keyword argument.
Examples
To update video_dir to /path/to/videos:
>>> update_config(project_dir, video_dir='/path/to/videos')
>>> print(load_config(project_dir)['video_dir'])
/path/to/videos
To update trans_hypparams['kappa'] to 100:
>>> update_config(project_dir, kappa=100)
>>> print(load_config(project_dir)['trans_hypparams']['kappa'])
100
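One way to picture how a bare hyperparameter name such as kappa can reach a nested entry like trans_hypparams['kappa'] is the sketch below. It operates on an in-memory dict rather than config.yml, and the helper `apply_updates`, the lookup strategy, and the placeholder values are all assumptions made for illustration; the library may route keywords differently.

```python
def apply_updates(config, **kwargs):
    """Return a copy of `config` with `kwargs` applied (hypothetical helper)."""
    # Copy nested dicts so the caller's config is left untouched.
    updated = {k: (dict(v) if isinstance(v, dict) else v) for k, v in config.items()}
    for key, value in kwargs.items():
        if key in updated:
            # Top-level setting, e.g. video_dir.
            updated[key] = value
            continue
        # Otherwise look for the key inside a nested hyperparameter group.
        groups = [g for g in updated.values() if isinstance(g, dict) and key in g]
        if not groups:
            raise KeyError(f"unknown config key: {key}")
        groups[0][key] = value
    return updated


# Placeholder values, not the library defaults.
config = {"video_dir": "", "trans_hypparams": {"kappa": 1e6, "gamma": 1e3}}
config = apply_updates(config, video_dir="/path/to/videos", kappa=100)
print(config["video_dir"], config["trans_hypparams"]["kappa"])
```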
- keypoint_moseq.io.setup_project(project_dir, deeplabcut_config=None, sleap_file=None, nwb_file=None, overwrite=False, **options)[source]
Set up a project directory with the following structure:
project_dir
└── config.yml
- Parameters
project_dir (str) – Path to the project directory (relative or absolute)
deeplabcut_config (str, default=None) – Path to a deeplabcut config file. Will be used to initialize bodyparts, skeleton, use_bodyparts and video_dir in the keypoint MoSeq config (overridden by options).
sleap_file (str, default=None) – Path to a .hdf5 or .slp file containing predictions for one video. Will be used to initialize bodyparts, skeleton, and use_bodyparts in the keypoint MoSeq config (overridden by options).
nwb_file (str, default=None) – Path to a .nwb file containing predictions for one video. Will be used to initialize bodyparts, skeleton, and use_bodyparts in the keypoint MoSeq config (overridden by options).
overwrite (bool, default=False) – Overwrite any config.yml that already exists at the path {project_dir}/config.yml.
options – Used to initialize config file. Overrides default settings.
- keypoint_moseq.io.save_pca(pca, project_dir, pca_path=None)[source]
Save a PCA model to disk.
The model is saved to pca_path or else to {project_dir}/pca.p.
- Parameters
pca (sklearn.decomposition.PCA) –
project_dir (str) –
pca_path (str, default=None) –
- keypoint_moseq.io.load_pca(project_dir, pca_path=None)[source]
Load a PCA model from disk.
The model is loaded from pca_path or else from {project_dir}/pca.p.
- Parameters
project_dir (str) –
pca_path (str, default=None) –
- Returns
pca
- Return type
sklearn.decomposition.PCA
- keypoint_moseq.io.load_checkpoint(project_dir=None, model_name=None, path=None, iteration=None)[source]
Load data and model snapshot from a saved checkpoint.
The checkpoint path can be specified directly via path or else it is assumed to be {project_dir}/{model_name}/checkpoint.h5.
- Parameters
project_dir (str, default=None) – Project directory; used in conjunction with model_name to determine the checkpoint path if path is not specified.
model_name (str, default=None) – Model name; used in conjunction with project_dir to determine the checkpoint path if path is not specified.
path (str, default=None) – Checkpoint path; if not specified, the checkpoint path is set to {project_dir}/{model_name}/checkpoint.h5.
iteration (int, default=None) – Determines which model snapshot to load. If None, the last snapshot is loaded.
- Returns
model (dict) – Model dictionary containing states, parameters, hyperparameters, noise prior, and random seed.
data (dict) – Data dictionary containing observations, confidences, mask and associated metadata (see keypoint_moseq.util.format_data()).
metadata (tuple (keys, bounds)) – Recordings and start/end frames for the data (see keypoint_moseq.util.format_data()).
iteration (int) – Iteration of model fitting corresponding to the loaded snapshot.
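Several functions in this module share the same path convention: an explicit path wins, otherwise the location is built from project_dir and model_name. The sketch below shows that fallback under stated assumptions; `resolve_path` is a hypothetical helper and the error raised for missing arguments is an illustrative choice, not documented behaviour.

```python
import os


def resolve_path(project_dir=None, model_name=None, path=None, filename="checkpoint.h5"):
    """Return `path` if given, else {project_dir}/{model_name}/{filename}."""
    if path is not None:
        return path
    if project_dir is None or model_name is None:
        # Assumed behaviour: without `path`, both pieces are needed.
        raise ValueError("provide either `path` or both `project_dir` and `model_name`")
    return os.path.join(project_dir, model_name, filename)


print(resolve_path("my_project", "my_model"))
print(resolve_path(path="/data/custom_checkpoint.h5"))
```

Passing filename="results.h5" gives the default location described for the results file.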
- keypoint_moseq.io.reindex_syllables_in_checkpoint(project_dir=None, model_name=None, path=None, index=None, runlength=True)[source]
Reindex syllable labels by their frequency in the most recent model snapshot in a checkpoint file.
This is an in-place operation: the checkpoint is loaded from disk, modified and saved to disk again. The label permutation is applied to all model snapshots in the checkpoint.
The checkpoint path can be specified directly via path or else it is assumed to be {project_dir}/{model_name}/checkpoint.h5.
- Parameters
project_dir (str, default=None) –
model_name (str, default=None) –
path (str, default=None) –
index (array of shape (num_states,), default=None) – Permutation for syllable labels, where index[i] is relabeled as i. If None, syllables are relabeled by frequency, with the most frequent syllable relabeled as 0, and so on.
runlength (bool, default=True) – If True, frequencies are quantified using the number of non-consecutive occurrences of each syllable. If False, frequency is quantified by total number of frames.
- Returns
index – The index used for permuting syllable labels. If index[i] = j, then the syllable formerly labeled j is now labeled i.
- Return type
array of shape (num_states,)
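The difference between the two frequency measures, and the meaning of the returned permutation, can be seen in a small pure-Python sketch. The helpers `frequency_index` and `relabel` are hypothetical and work on lists of integer labels rather than on a checkpoint file; tie-breaking by original label is an assumption.

```python
from itertools import groupby


def frequency_index(sequences, num_states, runlength=True):
    """Return index such that old label index[i] becomes new label i."""
    counts = [0] * num_states
    for seq in sequences:
        # With runlength=True each run of identical labels counts once;
        # otherwise every frame counts.
        labels = [k for k, _ in groupby(seq)] if runlength else seq
        for z in labels:
            counts[z] += 1
    # Stable sort by descending count (ties keep the original label order).
    return sorted(range(num_states), key=lambda z: -counts[z])


def relabel(seq, index):
    """Apply the permutation: the syllable formerly labeled index[i] becomes i."""
    new_label = {old: new for new, old in enumerate(index)}
    return [new_label[z] for z in seq]


seq = [2, 2, 2, 2, 2, 0, 1, 0, 1, 0]
print(frequency_index([seq], 3, runlength=True))   # [0, 1, 2]
print(frequency_index([seq], 3, runlength=False))  # [2, 0, 1]
print(relabel(seq, [2, 0, 1]))
```

Syllable 2 occupies the most frames but occurs in only one run, so it ranks first by frame count and last by run count.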
- keypoint_moseq.io.extract_results(model, metadata, project_dir=None, model_name=None, save_results=True, path=None)[source]
Extract model outputs and [optionally] save them to disk.
Model outputs are saved to disk as a .h5 file, either at path if it is specified, or at {project_dir}/{model_name}/results.h5 if it is not. If a .h5 file with the given path already exists, the outputs will be added to it. The results have the following structure:
results.h5
├──recording_name1
│  ├──syllable      # model state sequence (z), shape=(num_timepoints,)
│  ├──latent_state  # model latent state (x), shape=(num_timepoints,latent_dim)
│  ├──centroid      # model centroid (v), shape=(num_timepoints,keypoint_dim)
│  └──heading       # model heading (h), shape=(num_timepoints,)
⋮
- Parameters
model (dict) – Model dictionary containing states, parameters, hyperparameters, noise prior, and random seed.
metadata (tuple (keys, bounds)) – Recordings and start/end frames for the data (see keypoint_moseq.util.format_data()).
save_results (bool, default=True) – If True, the model outputs will be saved to disk.
project_dir (str, default=None) – Path to the project directory. Required if save_results=True and path=None.
model_name (str, default=None) – Name of the model. Required if save_results=True and path=None.
path (str, default=None) – Optional path for saving model outputs.
- Returns
results_dict – Dictionary of model outputs with the same structure as the results .h5 file.
- Return type
dict
- keypoint_moseq.io.load_results(project_dir=None, model_name=None, path=None)[source]
Load the results from a modeled dataset.
The results path can be specified directly via path. Otherwise it is assumed to be {project_dir}/{model_name}/results.h5.
- Parameters
project_dir (str, default=None) –
model_name (str, default=None) –
path (str, default=None) –
- Returns
results – See keypoint_moseq.fitting.apply_model()
- Return type
dict
- keypoint_moseq.io.save_results_as_csv(results, project_dir=None, model_name=None, save_dir=None, path_sep='-')[source]
Save modeling results to csv format.
This function creates a directory and then saves a separate csv file for each recording. The directory is created at save_dir if provided, otherwise at {project_dir}/{model_name}/results.
- Parameters
results (dict) – See keypoint_moseq.io.extract_results().
project_dir (str, default=None) – Project directory; required if save_dir is not provided.
model_name (str, default=None) – Name of the model; required if save_dir is not provided.
save_dir (str, default=None) – Optional path to the directory where the csv files will be saved.
path_sep (str, default='-') – If a path separator ("/" or "\") is present in the recording name, it will be replaced with path_sep when saving the csv file.
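The effect of path_sep on output filenames can be sketched as a simple string replacement. This assumes the two separators in question are the forward slash and the backslash; `csv_filename` is a hypothetical helper and the recording names are made up.

```python
def csv_filename(recording_name, path_sep="-"):
    """Build a flat csv filename from a recording name that may contain a path."""
    # Replace both kinds of path separator so the name is a single file.
    safe = recording_name.replace("/", path_sep).replace("\\", path_sep)
    return safe + ".csv"


print(csv_filename("session1/mouseA"))        # session1-mouseA.csv
print(csv_filename("session1\\mouseB", "_"))  # session1_mouseB.csv
```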
- keypoint_moseq.io.load_keypoints(filepath_pattern, format, extension=None, recursive=True, path_sep='-', path_in_name=False, remove_extension=True)[source]
Load keypoint tracking results from one or more files. Several file formats are supported:
- deeplabcut
.csv and .h5/.hdf5 files generated by deeplabcut. For single-animal tracking, each file yields a single key/value pair in the returned coordinates and confidences dictionaries. For multi-animal tracking, a key/value pair will be generated for each tracked individual. For example, the file two_mice.h5 with individuals "mouseA" and "mouseB" will yield the pair of keys 'two_mice_mouseA', 'two_mice_mouseB'.
- sleap
.slp and .h5/.hdf5 files generated by sleap. For single-animal tracking, each file yields a single key/value pair in the returned coordinates and confidences dictionaries. For multi-animal tracking, a key/value pair will be generated for each track. For example, a single file called two_mice.h5 will yield the pair of keys 'two_mice_track0', 'two_mice_track1'.
- anipose
.csv files generated by anipose. Each file should contain five columns per keypoint (x,y,z,error,score), plus a last column with the frame number. The score column is used as the keypoint confidence.
- sleap-anipose
.h5/.hdf5 files generated by sleap-anipose. Each file should contain a dataset called ‘tracks’ with shape (n_frames, 1, n_keypoints, 3). If there is also a ‘point_scores’ dataset, it will be used as the keypoint confidence. Otherwise, the confidence will be set to 1.
- nwb
.nwb files (Neurodata Without Borders). Each file should contain exactly one PoseEstimation object (for multi-animal tracking, each animal should be stored in its own .nwb file). The PoseEstimation object should contain one PoseEstimationSeries object for each bodypart. Confidence values are optional and will be set to 1 if not present.
- facemap
.h5 files saved by Facemap. See the Facemap documentation for details: https://facemap.readthedocs.io/en/latest/outputs.html#keypoints-processing
The files should have the format:
[filename].h5
└──Facemap
   ├──keypoint1
   │  ├──x
   │  ├──y
   │  └──likelihood
   ⋮
- Parameters
filepath_pattern (str or list of str) –
Filepath pattern for a set of keypoint tracking files (e.g. deeplabcut csv or hdf5 files), or a list of such patterns. Filepath patterns can be:
single file (e.g. /path/to/file.csv)
single directory (e.g. /path/to/dir/)
set of files (e.g. /path/to/fileprefix*)
set of directories (e.g. /path/to/dirprefix*)
format (str) – Format of the files to load. Must be one of the formats listed above: deeplabcut, sleap, anipose, sleap-anipose, nwb, or facemap.
extension (str, default=None) –
File extension to use when searching for files. If None, then the extension will be inferred from the format argument:
sleap: 'h5' or 'slp'
deeplabcut: 'csv' or 'h5'
anipose: 'csv'
sleap-anipose: 'h5'
recursive (bool, default=True) – Whether to search directories recursively for matching files.
path_in_name (bool, default=False) – Whether to name the tracking results from each file by the path to the file (True) or just the filename (False). If True, the path_sep argument is used to separate the path components.
path_sep (str, default='-') – Separator to use when path_in_name is True. For example, if path_sep is '-', then the tracking results from the file /path/to/file.csv will be named path-to-file. Using '/' as the separator is discouraged, as it will cause problems saving/loading the modeling results to/from hdf5 files.
remove_extension (bool, default=True) – Whether to remove the file extension when naming the tracking results from each file.
- Returns
coordinates (dict) – Dictionary mapping filenames to keypoint coordinates as ndarrays of shape (n_frames, n_bodyparts, 2[or 3])
confidences (dict) – Dictionary mapping filenames to likelihood scores as ndarrays of shape (n_frames, n_bodyparts)
bodyparts (list of str) – List of bodypart names. The order of the names matches the order of the bodyparts in coordinates and confidences.
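The interaction of path_in_name, path_sep and remove_extension determines the dictionary keys in the returned coordinates and confidences. The sketch below reproduces the naming rule as described in the parameter list; `recording_name` is a hypothetical helper, and details such as the order of the steps are assumptions rather than the library's code.

```python
import os


def recording_name(filepath, path_in_name=False, path_sep="-", remove_extension=True):
    """Derive a recording name from a file path (illustrative only)."""
    # Use the whole path or just the filename.
    name = filepath if path_in_name else os.path.basename(filepath)
    if remove_extension:
        name = os.path.splitext(name)[0]
    if path_in_name:
        # Join the non-empty path components with path_sep.
        parts = [p for p in name.replace("\\", "/").split("/") if p]
        name = path_sep.join(parts)
    return name


print(recording_name("/path/to/file.csv"))                          # file
print(recording_name("/path/to/file.csv", path_in_name=True))       # path-to-file
print(recording_name("/path/to/file.csv", remove_extension=False))  # file.csv
```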
- keypoint_moseq.io.save_hdf5(filepath, save_dict, datapath=None)[source]
Save a dict of pytrees to an hdf5 file. The leaves of the pytrees must be numpy arrays, scalars, or strings.
- Parameters
filepath (str) – Path of the hdf5 file to create.
save_dict (dict) – Dictionary where the values are pytrees, i.e. recursive collections of tuples, lists, dicts, and numpy arrays.
datapath (str, default=None) – Path within the hdf5 file to save the data. If None, the data is saved at the root of the hdf5 file.
- keypoint_moseq.io.load_hdf5(filepath, datapath=None)[source]
Load a dict of pytrees from an hdf5 file.
- Parameters
filepath (str) – Path of the hdf5 file to load.
datapath (str, default=None) – Path within the hdf5 file to load the data from. If None, the data is loaded from the root of the hdf5 file.
- Returns
save_dict – Dictionary where the values are pytrees, i.e. recursive collections of tuples, lists, dicts, and numpy arrays.
- Return type
dict
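A pytree, as used by save_hdf5 and load_hdf5, is a nested collection whose leaves are the values to store. The sketch below walks such a structure and lists each leaf with a slash-separated path, which is one natural way to map nesting onto hdf5 groups. It is an assumption-laden illustration: `flatten_pytree` is hypothetical, nothing is written to disk, and the actual group layout used by the library is not specified here.

```python
def flatten_pytree(tree, prefix=""):
    """Yield (path, leaf) pairs for a nested dict/list/tuple."""
    if isinstance(tree, dict):
        items = tree.items()
    elif isinstance(tree, (list, tuple)):
        # Sequence elements are addressed by their position.
        items = enumerate(tree)
    else:
        # Anything else (number, string, array) is treated as a leaf.
        yield prefix, tree
        return
    for key, value in items:
        yield from flatten_pytree(value, f"{prefix}/{key}")


# Made-up example data.
save_dict = {"model": {"params": {"kappa": 100}, "seed": (0, 1)}, "name": "demo"}
for path, leaf in flatten_pytree(save_dict):
    print(path, leaf)
```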