Input/Output

Functions:

generate_config(project_dir, **kwargs)

Generate a config.yml file with project settings.

check_config_validity(config)

Check if the config is valid.

load_config(project_dir[, check_if_valid, ...])

Load a project config file.

update_config(project_dir, **kwargs)

Update the config file stored at project_dir/config.yml.

setup_project(project_dir[, ...])

Set up a project directory containing a config.yml file.

save_pca(pca, project_dir[, pca_path])

Save a PCA model to disk.

load_pca(project_dir[, pca_path])

Load a PCA model from disk.

load_checkpoint([project_dir, model_name, ...])

Load data and model snapshot from a saved checkpoint.

reindex_syllables_in_checkpoint([...])

Reindex syllable labels by their frequency in the most recent model snapshot in a checkpoint file.

extract_results(model, metadata[, ...])

Extract model outputs and [optionally] save them to disk.

load_results([project_dir, model_name, path])

Load the results from a modeled dataset.

save_results_as_csv(results[, project_dir, ...])

Save modeling results to csv format.

load_keypoints(filepath_pattern, format[, ...])

Load keypoint tracking results from one or more files.

save_hdf5(filepath, save_dict[, datapath])

Save a dict of pytrees to an hdf5 file.

load_hdf5(filepath[, datapath])

Load a dict of pytrees from an hdf5 file.

keypoint_moseq.io.generate_config(project_dir, **kwargs)[source]

Generate a config.yml file with project settings. Default settings are used unless overridden by a keyword argument.

Parameters
  • project_dir (str) – A file config.yml will be generated in this directory.

  • kwargs – Custom project settings.

keypoint_moseq.io.check_config_validity(config)[source]

Check if the config is valid.

To be valid, the config must satisfy the following criteria:
  • All the elements of config[“use_bodyparts”] are also in config[“bodyparts”]

  • All the elements of config[“anterior_bodyparts”] are also in config[“use_bodyparts”]

  • All the elements of config[“posterior_bodyparts”] are also in config[“use_bodyparts”]

  • For each pair in config[“skeleton”], both elements are also in config[“bodyparts”]

Parameters

config (dict) –

Returns

validity

Return type

bool
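The criteria above can be sketched in plain Python. This is an illustration only, not the library's implementation: the function name sketch_check_config_validity and the example bodypart names are hypothetical, and the sketch assumes the config also carries a posterior_bodyparts list that is checked the same way as anterior_bodyparts.

```python
def sketch_check_config_validity(config):
    """Return True when the listed criteria hold (illustrative only)."""
    bodyparts = set(config["bodyparts"])
    use_bodyparts = set(config["use_bodyparts"])
    checks = [
        # Every used bodypart must be a known bodypart.
        use_bodyparts <= bodyparts,
        # Anterior and posterior bodyparts must both be drawn from use_bodyparts.
        set(config["anterior_bodyparts"]) <= use_bodyparts,
        set(config["posterior_bodyparts"]) <= use_bodyparts,
        # Both ends of every skeleton edge must be known bodyparts.
        all(a in bodyparts and b in bodyparts for a, b in config["skeleton"]),
    ]
    return all(checks)


# Hypothetical example config with made-up bodypart names.
example_config = {
    "bodyparts": ["nose", "left_ear", "spine", "tail_base"],
    "use_bodyparts": ["nose", "spine", "tail_base"],
    "anterior_bodyparts": ["nose"],
    "posterior_bodyparts": ["tail_base"],
    "skeleton": [["nose", "spine"], ["spine", "tail_base"]],
}
valid = sketch_check_config_validity(example_config)

# "left_ear" is a known bodypart but is not in use_bodyparts, so this fails.
broken_config = dict(example_config, anterior_bodyparts=["left_ear"])
invalid = sketch_check_config_validity(broken_config)
```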

keypoint_moseq.io.load_config(project_dir, check_if_valid=True, build_indexes=True)[source]

Load a project config file.

Parameters
  • project_dir (str) – Directory containing the config file

  • check_if_valid (bool, default=True) – Check if the config is valid using keypoint_moseq.io.check_config_validity()

  • build_indexes (bool, default=True) – Add keys “anterior_idxs” and “posterior_idxs” to the config. Each maps to a jax array indexing the elements of config[“anterior_bodyparts”] and config[“posterior_bodyparts”] by their order in config[“use_bodyparts”]

Returns

config

Return type

dict
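To make the build_indexes option concrete, the following sketch computes the two index lists by position in use_bodyparts. It uses plain Python lists as a stand-in for the jax arrays mentioned above; the function name sketch_build_indexes and the bodypart names are hypothetical.

```python
def sketch_build_indexes(config):
    """Index anterior/posterior bodyparts by their order in use_bodyparts."""
    use_bodyparts = list(config["use_bodyparts"])
    return {
        "anterior_idxs": [use_bodyparts.index(bp) for bp in config["anterior_bodyparts"]],
        "posterior_idxs": [use_bodyparts.index(bp) for bp in config["posterior_bodyparts"]],
    }


# Hypothetical config fragment with made-up bodypart names.
config_fragment = {
    "use_bodyparts": ["nose", "spine", "tail_base"],
    "anterior_bodyparts": ["nose"],
    "posterior_bodyparts": ["tail_base"],
}
indexes = sketch_build_indexes(config_fragment)
```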

keypoint_moseq.io.update_config(project_dir, **kwargs)[source]

Update the config file stored at project_dir/config.yml.

Use keyword arguments to update key/value pairs in the config. To update model hyperparameters, just use the name of the hyperparameter as the keyword argument.

Examples

To update video_dir to /path/to/videos:

>>> update_config(project_dir, video_dir='/path/to/videos')
>>> print(load_config(project_dir)['video_dir'])
/path/to/videos

To update trans_hypparams[‘kappa’] to 100:

>>> update_config(project_dir, kappa=100)
>>> print(load_config(project_dir)['trans_hypparams']['kappa'])
100
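
The two examples above suggest that a keyword can match either a top-level key or a key inside a hyperparameter group. A minimal sketch of that lookup on an in-memory dict follows. It is an assumption about the behavior, not the library's code: sketch_update is a hypothetical helper, the starting values are made up, and raising KeyError for unknown keys is a choice made for this sketch.

```python
def sketch_update(config, **kwargs):
    """Update top-level keys directly and nested keys by searching sub-dicts."""
    for key, value in kwargs.items():
        if key in config:
            config[key] = value
            continue
        # Otherwise look for the key inside nested groups such as trans_hypparams.
        groups = [g for g in config.values() if isinstance(g, dict) and key in g]
        if not groups:
            raise KeyError(key)  # behavior chosen for this sketch only
        for group in groups:
            group[key] = value
    return config


# Hypothetical starting values.
config = {"video_dir": "", "trans_hypparams": {"kappa": 1000000.0}}
sketch_update(config, video_dir="/path/to/videos", kappa=100)
```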

keypoint_moseq.io.setup_project(project_dir, deeplabcut_config=None, sleap_file=None, nwb_file=None, overwrite=False, **options)[source]

Set up a project directory with the following structure:

project_dir
└── config.yml

Parameters
  • project_dir (str) – Path to the project directory (relative or absolute)

  • deeplabcut_config (str, default=None) – Path to a deeplabcut config file. Will be used to initialize bodyparts, skeleton, use_bodyparts and video_dir in the keypoint MoSeq config (overridden by kwargs).

  • sleap_file (str, default=None) – Path to a .hdf5 or .slp file containing predictions for one video. Will be used to initialize bodyparts, skeleton, and use_bodyparts in the keypoint MoSeq config (overridden by kwargs).

  • nwb_file (str, default=None) – Path to a .nwb file containing predictions for one video. Will be used to initialize bodyparts, skeleton, and use_bodyparts in the keypoint MoSeq config (overridden by kwargs).

  • overwrite (bool, default=False) – Overwrite any config.yml that already exists at the path {project_dir}/config.yml.

  • options – Used to initialize config file. Overrides default settings.
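
The role of the overwrite flag can be sketched with the standard library alone. This is not the library's implementation: sketch_setup_project is a hypothetical helper, the fps option is a made-up setting, the file is written as simple "key: value" lines rather than through a YAML library, and returning False when an existing file is left untouched is an assumption of this sketch.

```python
import os
import tempfile


def sketch_setup_project(project_dir, overwrite=False, **options):
    """Write {project_dir}/config.yml unless it exists and overwrite is False."""
    config_path = os.path.join(project_dir, "config.yml")
    if os.path.exists(config_path) and not overwrite:
        return False  # existing config is left as-is
    os.makedirs(project_dir, exist_ok=True)
    with open(config_path, "w") as f:
        for key, value in options.items():
            f.write(f"{key}: {value}\n")
    return True


with tempfile.TemporaryDirectory() as tmp:
    demo_dir = os.path.join(tmp, "demo_project")
    first = sketch_setup_project(demo_dir, fps=30)                   # creates the file
    second = sketch_setup_project(demo_dir, fps=60)                  # skipped, file exists
    third = sketch_setup_project(demo_dir, overwrite=True, fps=60)   # replaces the file
    with open(os.path.join(demo_dir, "config.yml")) as f:
        final_text = f.read()
```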

keypoint_moseq.io.save_pca(pca, project_dir, pca_path=None)[source]

Save a PCA model to disk.

The model is saved to pca_path or else to {project_dir}/pca.p.

Parameters
  • pca (sklearn.decomposition.PCA) –

  • project_dir (str) –

  • pca_path (str, default=None) –

keypoint_moseq.io.load_pca(project_dir, pca_path=None)[source]

Load a PCA model from disk.

The model is loaded from pca_path or else from {project_dir}/pca.p.

Parameters
  • project_dir (str) –

  • pca_path (str, default=None) –

Returns

pca

Return type

sklearn.decomposition.PCA

keypoint_moseq.io.load_checkpoint(project_dir=None, model_name=None, path=None, iteration=None)[source]

Load data and model snapshot from a saved checkpoint.

The checkpoint path can be specified directly via path or else it is assumed to be {project_dir}/{model_name}/checkpoint.h5.

Parameters
  • project_dir (str, default=None) – Project directory; used in conjunction with model_name to determine the checkpoint path if path is not specified.

  • model_name (str, default=None) – Model name; used in conjunction with project_dir to determine the checkpoint path if path is not specified.

  • path (str, default=None) – Checkpoint path; if not specified, the checkpoint path is set to {project_dir}/{model_name}/checkpoint.h5.

  • iteration (int, default=None) – Determines which model snapshot to load. If None, the last snapshot is loaded.

Returns

  • model (dict) – Model dictionary containing states, parameters, hyperparameters, noise prior, and random seed.

  • data (dict) – Data dictionary containing observations, confidences, mask and associated metadata (see keypoint_moseq.util.format_data()).

  • metadata (tuple (keys, bounds)) – Recordings and start/end frames for the data (see keypoint_moseq.util.format_data()).

  • iteration (int) – Iteration of model fitting corresponding to the loaded snapshot.
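
The path rule described above (use path if given, otherwise build it from project_dir and model_name) can be sketched as follows. The helper name sketch_checkpoint_path and the example directory names are hypothetical, and raising ValueError when too little information is supplied is an assumption of this sketch.

```python
import os


def sketch_checkpoint_path(project_dir=None, model_name=None, path=None):
    """Resolve the checkpoint path from either `path` or project_dir/model_name."""
    if path is not None:
        return path
    if project_dir is None or model_name is None:
        raise ValueError("Provide `path`, or both `project_dir` and `model_name`.")
    return os.path.join(project_dir, model_name, "checkpoint.h5")


explicit = sketch_checkpoint_path(path="elsewhere/my_checkpoint.h5")
derived = sketch_checkpoint_path(project_dir="demo_project", model_name="my_model")
```

The same pattern applies to results files, with results.h5 in place of checkpoint.h5.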

keypoint_moseq.io.reindex_syllables_in_checkpoint(project_dir=None, model_name=None, path=None, index=None, runlength=True)[source]

Reindex syllable labels by their frequency in the most recent model snapshot in a checkpoint file.

This is an in-place operation: the checkpoint is loaded from disk, modified and saved to disk again. The label permutation is applied to all model snapshots in the checkpoint.

The checkpoint path can be specified directly via path or else it is assumed to be {project_dir}/{model_name}/checkpoint.h5.

Parameters
  • project_dir (str, default=None) –

  • model_name (str, default=None) –

  • path (str, default=None) –

  • index (array of shape (num_states,), default=None) – Permutation for syllable labels, where index[i] is relabeled as i. If None, syllables are relabeled by frequency, with the most frequent syllable relabeled as 0, and so on.

  • runlength (bool, default=True) – If True, frequencies are quantified using the number of non-consecutive occurrences of each syllable. If False, frequency is quantified by total number of frames.

Returns

index – The index used for permuting syllable labels. If index[i] = j, then the syllable formerly labeled j is now labeled i.

Return type

array of shape (num_states,)
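
The difference between the two frequency measures, and the meaning of the returned index, can be illustrated on a short label sequence. This is a plain-Python sketch rather than the library's code: the helper names are hypothetical, lists stand in for arrays, and breaking ties by the smaller original label is a choice made here.

```python
from collections import Counter
from itertools import groupby


def sketch_frequency_index(sequences, num_states, runlength=True):
    """Order syllable labels from most to least frequent."""
    counts = Counter()
    for seq in sequences:
        if runlength:
            # Count each run of consecutive identical labels once.
            counts.update(label for label, _ in groupby(seq))
        else:
            # Count every frame.
            counts.update(seq)
    return sorted(range(num_states), key=lambda s: (-counts[s], s))


def sketch_relabel(seq, index):
    """Apply the permutation: the syllable formerly labeled index[i] becomes i."""
    new_label = {old: new for new, old in enumerate(index)}
    return [new_label[s] for s in seq]


labels = [2, 2, 2, 2, 0, 1, 0, 1, 0]
by_runs = sketch_frequency_index([labels], num_states=3, runlength=True)
by_frames = sketch_frequency_index([labels], num_states=3, runlength=False)
relabeled = sketch_relabel(labels, by_frames)
```

In this sequence syllable 2 occupies the most frames but occurs in a single run, so it ranks first by frame count and last by run count.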

keypoint_moseq.io.extract_results(model, metadata, project_dir=None, model_name=None, save_results=True, path=None)[source]

Extract model outputs and [optionally] save them to disk.

Model outputs are saved to disk as a .h5 file, either at path if it is specified, or at {project_dir}/{model_name}/results.h5 if it is not. If a .h5 file with the given path already exists, the outputs will be added to it. The results have the following structure:

results.h5
├──recording_name1
│  ├──syllable      # model state sequence (z), shape=(num_timepoints,)
│  ├──latent_state  # model latent state (x), shape=(num_timepoints,latent_dim)
│  ├──centroid      # model centroid (v), shape=(num_timepoints,keypoint_dim)
│  └──heading       # model heading (h), shape=(num_timepoints,)
⋮

Parameters
  • model (dict) – Model dictionary containing states, parameters, hyperparameters, noise prior, and random seed.

  • metadata (tuple (keys, bounds)) – Recordings and start/end frames for the data (see keypoint_moseq.util.format_data()).

  • save_results (bool, default=True) – If True, the model outputs will be saved to disk.

  • project_dir (str, default=None) – Path to the project directory. Required if save_results=True and path=None.

  • model_name (str, default=None) – Name of the model. Required if save_results=True and path=None.

  • path (str, default=None) – Optional path for saving model outputs.

Returns

results_dict – Dictionary of model outputs with the same structure as the results .h5 file.

Return type

dict
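
As a rough guide to working with the returned dictionary, the sketch below builds a small stand-in with the structure shown above and counts frames per syllable for each recording. The values are made up, and nested lists stand in for the arrays a real results dictionary would hold.

```python
from collections import Counter

# Hypothetical results dictionary with 5 timepoints, latent_dim=2, keypoint_dim=2.
results = {
    "recording_name1": {
        "syllable": [0, 0, 1, 1, 1],
        "latent_state": [[0.1, 0.2]] * 5,
        "centroid": [[10.0, 20.0]] * 5,
        "heading": [0.0, 0.1, 0.2, 0.3, 0.4],
    },
}

# Count how many frames each syllable occupies in each recording.
frames_per_syllable = {
    name: dict(Counter(recording["syllable"])) for name, recording in results.items()
}
```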

keypoint_moseq.io.load_results(project_dir=None, model_name=None, path=None)[source]

Load the results from a modeled dataset.

The results path can be specified directly via path. Otherwise it is assumed to be {project_dir}/{model_name}/results.h5.

Parameters
  • project_dir (str, default=None) –

  • model_name (str, default=None) –

  • path (str, default=None) –

Returns

results – See keypoint_moseq.fitting.apply_model()

Return type

dict

keypoint_moseq.io.save_results_as_csv(results, project_dir=None, model_name=None, save_dir=None, path_sep='-')[source]

Save modeling results to csv format.

This function creates a directory and then saves a separate csv file for each recording. The directory is created at save_dir if provided, otherwise at {project_dir}/{model_name}/results.

Parameters
  • results (dict) – See keypoint_moseq.io.extract_results().

  • project_dir (str, default=None) – Project directory; required if save_dir is not provided.

  • model_name (str, default=None) – Name of the model; required if save_dir is not provided.

  • save_dir (str, default=None) – Optional path to the directory where the csv files will be saved.

  • path_sep (str, default='-') – If a path separator (“/” or “\”) is present in the recording name, it will be replaced with path_sep when saving the csv file.
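
The effect of path_sep on output filenames can be sketched as below. This is illustrative only: sketch_csv_path is a hypothetical helper, the recording names are made up, and the sketch assumes the two separators being replaced are the forward slash and the backslash.

```python
import os


def sketch_csv_path(save_dir, recording_name, path_sep="-"):
    """Build a csv path, replacing path separators in the recording name."""
    safe_name = recording_name.replace("/", path_sep).replace("\\", path_sep)
    return os.path.join(save_dir, safe_name + ".csv")


forward = os.path.basename(sketch_csv_path("results", "cohort1/mouse_a"))
backward = os.path.basename(sketch_csv_path("results", "cohort1\\mouse_a"))
custom = os.path.basename(sketch_csv_path("results", "cohort1/mouse_a", path_sep="__"))
```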

keypoint_moseq.io.load_keypoints(filepath_pattern, format, extension=None, recursive=True, path_sep='-', path_in_name=False, remove_extension=True)[source]

Load keypoint tracking results from one or more files. Several file formats are supported:

  • deeplabcut

    .csv and .h5/.hdf5 files generated by deeplabcut. For single-animal tracking, each file yields a single key/value pair in the returned coordinates and confidences dictionaries. For multi-animal tracking, a key/value pair will be generated for each tracked individual. For example, the file two_mice.h5 with individuals “mouseA” and “mouseB” will yield the pair of keys ‘two_mice_mouseA’, ‘two_mice_mouseB’.

  • sleap

    .slp and .h5/.hdf5 files generated by sleap. For single-animal tracking, each file yields a single key/value pair in the returned coordinates and confidences dictionaries. For multi-animal tracking, a key/value pair will be generated for each track. For example, a single file called two_mice.h5 will yield the pair of keys ‘two_mice_track0’, ‘two_mice_track1’.

  • anipose

    .csv files generated by anipose. Each file should contain five columns per keypoint (x,y,z,error,score), plus a last column with the frame number. The score column is used as the keypoint confidence.

  • sleap-anipose

    .h5/.hdf5 files generated by sleap-anipose. Each file should contain a dataset called ‘tracks’ with shape (n_frames, 1, n_keypoints, 3). If there is also a ‘point_scores’ dataset, it will be used as the keypoint confidence. Otherwise, the confidence will be set to 1.

  • nwb

    .nwb files (Neurodata Without Borders). Each file should contain exactly one PoseEstimation object (for multi-animal tracking, each animal should be stored in its own .nwb file). The PoseEstimation object should contain one PoseEstimationSeries object for each bodypart. Confidence values are optional and will be set to 1 if not present.

  • facemap

    .h5 files saved by Facemap. See Facemap documentation for details: https://facemap.readthedocs.io/en/latest/outputs.html#keypoints-processing The files should have the format:

    [filename].h5
    └──Facemap
        ├──keypoint1
        │  ├──x
        │  ├──y
        │  └──likelihood
        ⋮
    
Parameters
  • filepath_pattern (str or list of str) –

    Filepath pattern for a set of keypoint tracking files in the chosen format, or a list of such patterns. Filepath patterns can be:

    • single file (e.g. /path/to/file.csv)

    • single directory (e.g. /path/to/dir/)

    • set of files (e.g. /path/to/fileprefix*)

    • set of directories (e.g. /path/to/dirprefix*)

  • format (str) – Format of the files to load. Must be one of the formats listed above: deeplabcut, sleap, anipose, sleap-anipose, nwb, or facemap.

  • extension (str, default=None) –

    File extension to use when searching for files. If None, then the extension will be inferred from the format argument:

    • sleap: ‘h5’ or ‘slp’

    • deeplabcut: ‘csv’ or ‘h5’

    • anipose: ‘csv’

    • sleap-anipose: ‘h5’

  • recursive (bool, default=True) – Whether to search recursively for matching files.

  • path_in_name (bool, default=False) – Whether to name the tracking results from each file by the path to the file (True) or just the filename (False). If True, the path_sep argument is used to separate the path components.

  • path_sep (str, default='-') – Separator to use when path_in_name is True. For example, if path_sep is ‘-’, then the tracking results from the file /path/to/file.csv will be named path-to-file. Using ‘/’ as the separator is discouraged, as it will cause problems saving/loading the modeling results to/from hdf5 files.

  • remove_extension (bool, default=True) – Whether to remove the file extension when naming the tracking results from each file.

Returns

  • coordinates (dict) – Dictionary mapping filenames to keypoint coordinates as ndarrays of shape (n_frames, n_bodyparts, 2[or 3])

  • confidences (dict) – Dictionary mapping filenames to likelihood scores as ndarrays of shape (n_frames, n_bodyparts)

  • bodyparts (list of str) – List of bodypart names. The order of the names matches the order of the bodyparts in coordinates and confidences.
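
How path_in_name, path_sep and remove_extension combine to name each recording can be sketched as follows. This is a minimal illustration built from the descriptions above, not the library's implementation; sketch_recording_name is a hypothetical helper.

```python
import os


def sketch_recording_name(filepath, path_in_name=False, path_sep="-", remove_extension=True):
    """Derive a dictionary key for the tracking results of one file."""
    name = filepath if path_in_name else os.path.basename(filepath)
    if remove_extension:
        name = os.path.splitext(name)[0]
    # Split on either separator and drop empty parts (e.g. from a leading "/").
    parts = [p for p in name.replace("\\", "/").split("/") if p]
    return path_sep.join(parts)


short_name = sketch_recording_name("/path/to/file.csv")
full_name = sketch_recording_name("/path/to/file.csv", path_in_name=True)
with_ext = sketch_recording_name("/path/to/file.csv", remove_extension=False)
```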

keypoint_moseq.io.save_hdf5(filepath, save_dict, datapath=None)[source]

Save a dict of pytrees to an hdf5 file. The leaves of the pytrees must be numpy arrays, scalars, or strings.

Parameters
  • filepath (str) – Path of the hdf5 file to create.

  • save_dict (dict) – Dictionary where the values are pytrees, i.e. recursive collections of tuples, lists, dicts, and numpy arrays.

  • datapath (str, default=None) – Path within the hdf5 file to save the data. If None, the data is saved at the root of the hdf5 file.

keypoint_moseq.io.load_hdf5(filepath, datapath=None)[source]

Load a dict of pytrees from an hdf5 file.

Parameters
  • filepath (str) – Path of the hdf5 file to load.

  • datapath (str, default=None) – Path within the hdf5 file to load the data from. If None, the data is loaded from the root of the hdf5 file.

Returns

save_dict – Dictionary where the values are pytrees, i.e. recursive collections of tuples, lists, dicts, and numpy arrays.

Return type

dict
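
HDF5 addresses groups and datasets with "/"-separated paths, so a nested dictionary maps naturally onto such paths. The sketch below flattens nested dicts into path keys to show the idea and why a "/" inside a key is ambiguous. It is an assumption about the layout, not the library's implementation: sketch_flatten is a hypothetical helper, it handles only dicts (not tuples or lists), and it does not touch any file.

```python
def sketch_flatten(tree, prefix=""):
    """Flatten nested dicts into {"/group/dataset": leaf} entries."""
    flat = {}
    for key, value in tree.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            flat.update(sketch_flatten(value, path))
        else:
            flat[path] = value
    return flat


save_dict = {"recording_name1": {"syllable": [0, 0, 1], "heading": [0.0, 0.1, 0.2]}}
flat = sketch_flatten(save_dict)

# A "/" inside a key is indistinguishable from an extra level of nesting.
slash_in_key = sketch_flatten({"a/b": {"x": 1}})
nested_keys = sketch_flatten({"a": {"b": {"x": 1}}})
```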