Decoding pipeline edits for imported data (LorenFrankLab#782)

* Add non-local detector and remove replay_trajectory_classification
* Reorganize
* Fix formatting and imports
* Update .gitignore
* Remove because of circular import
* Fix name of parameter
* Handle case where there is only one interval
* Fix settings
* Handle single interval
* from_unit_dict does not exist in 0.98.2 of spike interface
* Simplify call
* Update for SpikeSorting merge table and add spyglass mixin
* Fix dependencies
* Fix merge conflict
* Update src/spyglass/decoding/v1/clusterless.py
  Co-authored-by: Chris Brozdowski <[email protected]>
* Update src/spyglass/decoding/v1/clusterless.py
  Co-authored-by: Chris Brozdowski <[email protected]>
* Update src/spyglass/decoding/v1/clusterless.py
  Co-authored-by: Chris Brozdowski <[email protected]>
* Update src/spyglass/decoding/v1/clusterless.py
  Co-authored-by: Chris Brozdowski <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Chris Brozdowski <[email protected]>
* Remove unused imports and format
* Add saving of waveform features
* Don't store electrodes, full waveforms, waveform mean
* Fix spike times and add convenience method
* Add spike location and some formatting
* Remove circular import
* Fix dict expansion
* Initial working clusterless pipeline
* Add position group
* Rename classifier to decoding
* Handle encoding and decoding intervals
* Put old files under v0, try/except for old decoding package
* Rename visualization and remove from v0
  v0 visualization is redundant with visualization
* Place parameters and position group in core.py
* Add sorted spikes decoding
* Add objects to init for convenience
* Remove unused imports
* Fix fetching of spike times
* Insert into merge table
* Update CHANGELOG.md
* Function for removing decoding outputs not in DecodingOutput
* Fix name
* Add draft of tutorials and rearrange notebooks
* Fix config loading
* Add 1D decoding and some notes on estimate_parameters kwarg
* Update 43_Decoding_SortedSpikes.ipynb
* Remove old decoding notebook
* Save initial conditions and discrete transitions
* Apply suggestions from code review
  Co-authored-by: Chris Brozdowski <[email protected]>
* Be more specific with import error
* Remove unneeded comments
* Remove incorrect dimension name
* Project merge_id from SpikeSortingOutput for clarity
* Update src/spyglass/decoding/v0/clusterless.py
  Co-authored-by: Chris Brozdowski <[email protected]>
* Update src/spyglass/decoding/v0/clusterless.py
  Co-authored-by: Chris Brozdowski <[email protected]>
* Update src/spyglass/decoding/v0/clusterless.py
  Co-authored-by: Chris Brozdowski <[email protected]>
* Fix linting
* Update notebooks
* Ignore .pem
* Add session as a primary key for Groups
* Add some helper methods
* Update notebooks
* Update README.md
* Update pyscripts
* Update 42_Decoding_Clusterless.ipynb
* Update CHANGELOG.md
* Add fetch and insert
* Simplify class conversion
* Do the dictionary conversion of class for the user
* Update CHANGELOG.md
* Update .gitignore
* Use methods in populate
* Avoid fetching interval range if not needed
* Generalize finding class from modules
* Use args/kwargs
* Simplify tuple unpacking
* Make decoding kwargs nullable
* Add function for get_recording and get_sorting to the spikesorting merge table
* make decoding waveform features agnostic to spikesorting source
* Fix spelling
* Use fetch1_dataframe for position
* Use self instead of class
* Update src/spyglass/decoding/v1/sorted_spikes.py
  Co-authored-by: Samuel Bray <[email protected]>
* Be more careful about populating select keys
* Make more readable/remove unused imports
* Save classifier
* Clean up saved model paths
* add function load_linear_position_info
* Update src/spyglass/decoding/v1/sorted_spikes.py
  Co-authored-by: Samuel Bray <[email protected]>
* Update 41_Extracting_Clusterless_Waveform_Features.py
* Update docstring
* Apply suggestions from code review
  Co-authored-by: Chris Brozdowski <[email protected]>
* Update src/spyglass/decoding/v1/clusterless.py
  Co-authored-by: Chris Brozdowski <[email protected]>
* Update src/spyglass/decoding/v1/clusterless.py
  Co-authored-by: Chris Brozdowski <[email protected]>
* Fix linting
* Fix syntax
* Rename variable to avoid confusion
* Restrict UnitWaveformFeaturesGroup and SortedSpikesGroup
* Concatenate linear position and position dataframes
* Static methods don't require instantiating class
* Avoid merge restrict
* Add version to defaults
* Remove unused import
* Fix classifier path
* Add dry run
* Remove non-default
* Handle permissions and file not found
* Keep position info within encoding/decoding interval
* Add methods to get the spike_times, spike_indicators, firing rate
* Fix docstring to match default
* Implement function rather than import
* Remove unused broken imports
* Add decoding cleanup
* Fix import
* Put old vis code back
* Fix import
* Add draft helper functions
* Limit options on input
* Fix logic
* Fix where the key is passed
* Update notebooks
* Host main visualizations in non_local_detector repo
* Update notebooks/py_scripts/41_Extracting_Clusterless_Waveform_Features.py
  Co-authored-by: Chris Brozdowski <[email protected]>
* Update src/spyglass/spikesorting/merge.py
  Co-authored-by: Chris Brozdowski <[email protected]>
* Update src/spyglass/decoding/decoding_merge.py
  Co-authored-by: Chris Brozdowski <[email protected]>
* Revert "Limit options on input"
  This reverts commit 386714c.
* Use f-string for version
* Add useful imports to the top level
  This would have to change a bit if there were multiple versions of the pipeline.
* Make source class a hidden attribute
* Update CHANGELOG.md
* Centralize get_class logic in Merge (LorenFrankLab#749)
* get_class logic -> dj_merge
* blackify
---------
Co-authored-by: Eric Denovellis <[email protected]>
* Add _nwb_table for fetch_nwb
* Method is static method
* Add merge insert
* string split is brittle, use defaults if it didn't work
* WIP: Mixin resolves _nwb_table attr for Merge (LorenFrankLab#783)
* Change import
* Handle single position 3D case
* Fix getting source from key
* Use merge_restrict_class for fetch_nwb on merge tables
* Move this back
* Remove for now
* Temp patch for tests
* Revert "Temp patch for tests"
  This reverts commit 281bf36.
* Temp patch for tests
* Handle None decoding kwargs
* fetch_nwb is a method not a class method now
* Fix _merge_repr for numeric data types (LorenFrankLab#786)
* Easily calculate firing rate
* Add sorting spike times by place field and ahead behind distance
* Account for differently named position variables
* Handle orientation name and fix linear position fetch
* Fix 2D ahead/behind
* Add `UnitSelection` table (LorenFrankLab#788)
* Add UnitSelection
* Rename table
* Addressing LorenFrankLab#789, failing tests (LorenFrankLab#795)
* Addressing LorenFrankLab#789. See details.
  - Edit settings.py to permit fail to load config on startup
  - Simplify instructions to use func to generate config, negating need for config.py script
  - Edit notebook 00 and installation doc to point to save-config func
  - Expand example config to demonstrate all possible values
  - Lint dj_merge_table.py to remove unused import and `var is True`
* LorenFrankLab#794
* Fix failing tests
* Update Changelog
* Typo: chared -> shared
* Apply suggestions from code review
  Co-authored-by: Chris Brozdowski <[email protected]>
* Fix formatting
* Update src/spyglass/common/common_behav.py
  Co-authored-by: Chris Brozdowski <[email protected]>
* Update src/spyglass/settings.py
  Co-authored-by: Chris Brozdowski <[email protected]>
* Fix syntax
* Update merge.py
---------
Co-authored-by: Chris Brozdowski <[email protected]>
Co-authored-by: Sam Bray <[email protected]>
Co-authored-by: Kyu Hyun Lee <[email protected]>
CBroz1 · Jan 25, 2024 · c5bf75b
1 parent 1cc4b32
Showing 31 changed files with 1,017 additions and 718 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -20,7 +20,7 @@
 ### Pipelines
 
 - Spike sorting: Add SpikeSorting V1 pipeline. #651
-- LFP: Minor fixes to LFPBandV1 populator. #706
+- LFP: Minor fixes to LFPBandV1 populator and `make`. #706, #795
 - Linearization:
     - Minor fixes to LinearizedPositionV1 pipeline #695
     - Rename `position_linearization` -> `linearization`. #717
@@ -34,6 +34,7 @@
     - Use the new `non_local_detector` package for decoding #731
     - Allow multiple spike waveform features for clusterless decoding #731
     - Reorder notebooks #731
+    - Add fetch class functionality to `Merge` table. #783, #786
 
 ## [0.4.3] (November 7, 2023)
 

diff --git a/dj_local_conf_example.json b/dj_local_conf_example.json
@@ -14,7 +14,7 @@
   "display.show_tuple_count": true,
   "database.use_tls": null,
   "enable_python_native_blobs": true,
-  "filepath_checksum_size_limit": null,
+  "filepath_checksum_size_limit": 1073741824,
   "stores": {
     "raw": {
       "protocol": "file",
@@ -28,14 +28,28 @@
     }
   },
   "custom": {
+    "debug_mode": "false",
+    "test_mode": "false",
     "spyglass_dirs": {
-      "base": "/your/path/like/stelmo/nwb/"
+      "base": "/your/base/path",
+      "raw": "/your/base/path/raw",
+      "analysis": "/your/base/path/analysis",
+      "recording": "/your/base/path/recording",
+      "sorting": "/your/base/path/spikesorting",
+      "waveforms": "/your/base/path/waveforms",
+      "temp": "/your/base/path/tmp",
+      "video": "/your/base/path/video"
     },
     "kachery_dirs": {
-      "cloud": "/your/path/.kachery-cloud"
+      "cloud": "/your/base/path/kachery_storage",
+      "storage": "/your/base/path/kachery_storage",
+      "temp": "/your/base/path/tmp"
     },
     "dlc_dirs": {
-      "base": "/your/path/like/nimbus/deeplabcut/"
+      "base": "/your/base/path/deeplabcut",
+      "project": "/your/base/path/deeplabcut/projects",
+      "video": "/your/base/path/deeplabcut/video",
+      "output": "/your/base/path/deeplabcut/output"
     },
     "kachery_zone": "franklab.default"
   }
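The expanded example config above spells out every subfolder, while the installation notes say that only `base` is required and the rest are assumed from defaults. A minimal sketch of that defaulting, assuming the subfolder names shown in the example file; `fill_spyglass_dirs` and `DEFAULT_SUBDIRS` are hypothetical names for illustration and not part of the Spyglass API:

```python
from pathlib import PurePosixPath

# Subfolder names copied from the example config in this diff. Whether
# Spyglass uses exactly this mapping internally is an assumption.
DEFAULT_SUBDIRS = {
    "raw": "raw",
    "analysis": "analysis",
    "recording": "recording",
    "sorting": "spikesorting",
    "waveforms": "waveforms",
    "temp": "tmp",
    "video": "video",
}


def fill_spyglass_dirs(spyglass_dirs: dict) -> dict:
    """Hypothetical helper: derive any missing subfolder from 'base'."""
    base = PurePosixPath(spyglass_dirs["base"])
    filled = dict(spyglass_dirs)  # copy so the caller's dict is untouched
    for key, subdir in DEFAULT_SUBDIRS.items():
        # Explicit entries win; only absent keys get a default.
        filled.setdefault(key, str(base / subdir))
    return filled


dirs = fill_spyglass_dirs({"base": "/your/base/path", "temp": "/scratch/tmp"})
print(dirs["sorting"])  # /your/base/path/spikesorting
print(dirs["temp"])     # /scratch/tmp
```

Explicit entries take precedence here, so a user can redirect a single folder (for example a fast scratch disk for `temp`) without listing the others.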

diff --git a/docs/src/index.md b/docs/src/index.md
@@ -2,46 +2,49 @@
 
 ![Figure 1](./images/fig1.png)
 
-**Spyglass** is an open-source software framework designed to offer reliable
-and reproducible analysis of neuroscience data and sharing of the results
-with collaborators and the broader community.
+**Spyglass** is an open-source software framework designed to offer reliable and
+reproducible analysis of neuroscience data and sharing of the results with
+collaborators and the broader community.
 
 Features of Spyglass include:
 
-+ **Standardized data storage** - Spyglass uses the open-source
-  [Neurodata Without Borders: Neurophysiology (NWB:N)](https://www.nwb.org/)
-  format to ingest and store processed data. NWB:N is a standard set by the BRAIN
-  Initiative for neurophysiological data ([Rübel et al., 2022](https://doi.org/10.7554/elife.78362)).
-+ **Reproducible analysis** - Spyglass uses [DataJoint](https://datajoint.com/)
-  to ensure that all analysis is reproducible. DataJoint is a data management
-  system that automatically tracks dependencies between data and analysis code. This
-  ensures that all analysis is reproducible and that the results are
-  automatically updated when the data or analysis code changes.
-+ **Common analysis tools** - Spyglass provides easy usage of the open-source packages
-  [SpikeInterface](https://github.com/SpikeInterface/spikeinterface),
-  [Ghostipy](https://github.com/kemerelab/ghostipy), and [DeepLabCut](https://github.com/DeepLabCut/DeepLabCut)
-  for common analysis tasks. These packages are well-documented and have active
-  developer communities.
-+ **Interactive data visualization** - Spyglass uses [figurl](https://github.com/flatironinstitute/figurl)
-  to create interactive data visualizations that can be shared with collaborators
-  and the broader community. These visualizations are hosted on the web
-  and can be viewed in any modern web browser. The interactivity allows users to
-  explore the data and analysis results in detail.
-+ **Sharing results** - Spyglass enables sharing of data and analysis results via
-  [Kachery](https://github.com/flatironinstitute/kachery-cloud), a
-  decentralized content addressable data sharing platform. Kachery Cloud allows
-  users to access the database and pull data and analysis results directly
-  to their local machine.
-+ **Pipeline versioning** - Processing and analysis of data in neuroscience is
-  often dynamic, requiring new features. Spyglass uses *Merge tables* to ensure that
-  analysis pipelines can be versioned. This allows users to easily use and compare
-  results from different versions of the analysis pipeline while retaining
-  the ability to access previously generated results.
-+ **Cautious Delete** - Spyglass uses a `cautious delete` feature to ensure
-  that data is not accidentally deleted by other users. When a user deletes data,
-  Spyglass will first check to see if the data belongs to another team of users.
-  This enables teams of users to work collaboratively on the same database without
-  worrying about accidentally deleting each other's data.
+- **Standardized data storage** - Spyglass uses the open-source
+    [Neurodata Without Borders: Neurophysiology (NWB:N)](https://www.nwb.org/)
+    format to ingest and store processed data. NWB:N is a standard set by the
+    BRAIN Initiative for neurophysiological data
+    ([Rübel et al., 2022](https://doi.org/10.7554/elife.78362)).
+- **Reproducible analysis** - Spyglass uses [DataJoint](https://datajoint.com/)
+    to ensure that all analysis is reproducible. DataJoint is a data management
+    system that automatically tracks dependencies between data and analysis
+    code. This ensures that all analysis is reproducible and that the results
+    are automatically updated when the data or analysis code changes.
+- **Common analysis tools** - Spyglass provides easy usage of the open-source
+    packages [SpikeInterface](https://github.com/SpikeInterface/spikeinterface),
+    [Ghostipy](https://github.com/kemerelab/ghostipy), and
+    [DeepLabCut](https://github.com/DeepLabCut/DeepLabCut) for common analysis
+    tasks. These packages are well-documented and have active developer
+    communities.
+- **Interactive data visualization** - Spyglass uses
+    [figurl](https://github.com/flatironinstitute/figurl) to create interactive
+    data visualizations that can be shared with collaborators and the broader
+    community. These visualizations are hosted on the web and can be viewed in
+    any modern web browser. The interactivity allows users to explore the data
+    and analysis results in detail.
+- **Sharing results** - Spyglass enables sharing of data and analysis results
+    via [Kachery](https://github.com/flatironinstitute/kachery-cloud), a
+    decentralized content addressable data sharing platform. Kachery Cloud
+    allows users to access the database and pull data and analysis results
+    directly to their local machine.
+- **Pipeline versioning** - Processing and analysis of data in neuroscience is
+    often dynamic, requiring new features. Spyglass uses *Merge tables* to
+    ensure that analysis pipelines can be versioned. This allows users to easily
+    use and compare results from different versions of the analysis pipeline
+    while retaining the ability to access previously generated results.
+- **Cautious Delete** - Spyglass uses a `cautious delete` feature to ensure that
+    data is not accidentally deleted by other users. When a user deletes data,
+    Spyglass will first check to see if the data belongs to another team of
+    users. This enables teams of users to work collaboratively on the same
+    database without worrying about accidentally deleting each other's data.
 
 ## Getting Started
 

diff --git a/docs/src/installation.md b/docs/src/installation.md
@@ -25,7 +25,7 @@ pip install spikeinterface[full,widgets]
 pip install mountainsort4
 ```
 
-WARNING: If you are on an M1 Mac, you need to install `pyfftw` via `conda`
+__WARNING:__ If you are on an M1 Mac, you need to install `pyfftw` via `conda`
 BEFORE installing `ghostipy`:
 
 ```bash
@@ -49,40 +49,30 @@ additional details, see the
 
 #### Via File (Recommended)
 
-A `dj_local_conf.json` file in your Spyglass directory (or wherever python is
-launched) can hold all the specifics needed to connect to a database. This can
-include different directories for different pipelines. If only the `base` is
-specified, the subfolder names below are included as defaults.
-
-```json
-{
-  "custom": {
-    "database.prefix": "username_",
-    "spyglass_dirs": {
-      "base": "/your/base/path",
-      "raw": "/your/base/path/raw",
-      "analysis": "/your/base/path/analysis",
-      "recording": "/your/base/path/recording",
-      "spike_sorting_storage": "/your/base/path/spikesorting",
-      "waveforms": "/your/base/path/waveforms",
-      "temp": "/your/base/path/tmp"
-    }
-  }
-}
-```
-
-`dj_local_conf_example.json` can be copied and saved as `dj_local_conf.json` to
-set the configuration for a given folder. Alternatively, it can be saved as
-`.datajoint_config.json` in a user's home directory to be accessed globally. See
+A `dj_local_conf.json` file in your current directory when launching python can
+hold all the specifics needed to connect to a database. This can include
+different directories for different pipelines. If only the Spyglass `base` is
+specified, other subfolder names are assumed from defaults. See
+`dj_local_conf_example.json` for the full set of options. This example can be
+copied and saved as `dj_local_conf.json` to set the configuration for a given
+folder. Alternatively, it can be saved as `.datajoint_config.json` in a user's
+home directory to be accessed globally. See
 [DataJoint docs](https://datajoint.com/docs/core/datajoint-python/0.14/quick-start/#connection)
 for more details.
 
+Note that raw and analysis folder locations should be specified under both
+`stores` and `custom` sections of the config file. The `stores` section is used
+by DataJoint to store the location of files referenced in the database, while the
+`custom` section is used by Spyglass. Spyglass will check that these sections
+match on startup.
+
 #### Via Environment Variables
 
 Older versions of Spyglass relied exclusively on environment variables for
 config. If
 `spyglass_dirs` is not found in the config file, Spyglass will look for
 environment variables. These can be set either once in a terminal session, or
-permanently in a `.bashrc` file.
+permanently in a unix settings file (e.g., `.bashrc` or `.bash_profile`) in your
+home directory.
 
 ```bash
 export SPYGLASS_BASE_DIR="/stelmo/nwb"
@@ -102,14 +92,21 @@ A temporary directory will speed up spike sorting. If unspecified by either
 method above, it will be assumed as a `tmp` subfolder relative to the base path.
 Be sure it has enough free space (ideally at least 500GB).
 
+#### Subfolders
+
+If subfolders do not exist, they will be created automatically. If unspecified
+by either method above, they will be assumed as `recording`, `sorting`, `video`,
+etc. subfolders relative to the base path.
+
 ## File manager
 
 [`kachery-cloud`](https://github.com/flatironinstitute/kachery-cloud) is a file
 manager for Frank Lab collaborators who do not have access to the lab's
 production database.
 
-To customize `kachery` file paths, the following can similarly be pasted into
-your `.bashrc`. If unspecified, the defaults below are assumed.
+To customize `kachery` file paths, see `dj_local_conf_example.json` or set the
+following variables in your unix settings file (e.g., `.bashrc`). If
+unspecified, the defaults below are assumed.
 
 ```bash
 export KACHERY_CLOUD_DIR="$SPYGLASS_BASE_DIR/.kachery-cloud"
@@ -122,3 +119,9 @@ Be sure to load these with `source ~/.bashrc` to persist changes.
 
 Finally, open up a python console (e.g., run `ipython` from terminal) and import
 `spyglass` to check that the installation has worked.
+
+```python
+from spyglass.common import Nwbfile
+
+Nwbfile()
+```
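
The installation notes added in this diff state that the raw and analysis locations must agree between the `stores` and `custom` sections of the config. A rough sketch of such a consistency check on a config dictionary, assuming each store entry carries its path under the standard DataJoint `location` key; `find_mismatched_dirs` is a hypothetical helper, not the check Spyglass itself runs on startup:

```python
def find_mismatched_dirs(config: dict, keys=("raw", "analysis")) -> list:
    """Hypothetical helper: list keys whose paths differ between sections."""
    stores = config.get("stores", {})
    custom_dirs = config.get("custom", {}).get("spyglass_dirs", {})
    mismatched = []
    for key in keys:
        store_loc = stores.get(key, {}).get("location")
        custom_loc = custom_dirs.get(key)
        # A missing entry on either side is reported as a mismatch too.
        if store_loc is None or custom_loc is None:
            mismatched.append(key)
        elif store_loc.rstrip("/") != custom_loc.rstrip("/"):
            mismatched.append(key)
    return mismatched


example = {
    "stores": {
        "raw": {"protocol": "file", "location": "/your/base/path/raw"},
        "analysis": {"protocol": "file", "location": "/your/base/path/analysis"},
    },
    "custom": {
        "spyglass_dirs": {
            "base": "/your/base/path",
            "raw": "/your/base/path/raw/",
            "analysis": "/somewhere/else",
        }
    },
}
print(find_mismatched_dirs(example))  # ['analysis']
```

Trailing slashes are ignored in the comparison, which is a design choice of this sketch rather than documented Spyglass behavior.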