Commit
Incorporate end user documentation (#50)
* incorporate end-user documentation sources
Showing 15 changed files with 689 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,5 @@ | ||
.env | ||
pbp-doc/site/ | ||
output*/ | ||
cloud_tmp*/ | ||
NRS11/ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
{ | ||
"destination": "pbp", | ||
"docdir": "pbp-doc", | ||
"public": true | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
# README | ||
|
||
This directory contains the sources for documenting the use of | ||
[`mbari-org/pbp`](https://pypi.org/project/mbari-pbp/). | ||
|
||
Merging changes in this directory into the main branch in the remote repo | ||
will automatically trigger the update of the generated site at | ||
<https://docs.mbari.org/pbp/>. | ||
|
||
### Local doc development | ||
|
||
The following commands assume `pbp-doc` is the current directory. | ||
|
||
One-off setup: | ||
```bash | ||
just setup | ||
``` | ||
|
||
Then: | ||
```bash | ||
just serve | ||
``` | ||
and open the indicated URL in your browser. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
@import url(https://fonts.googleapis.com/css?family=Merriweather:400,300); | ||
@import url(https://docs.mbari.org/css/iosevka-custom/iosevka-custom.css); | ||
|
||
body { | ||
font-family: 'Merriweather', serif; | ||
font-weight: 300; | ||
} | ||
|
||
code, tt { | ||
font-family: 'Iosevka', 'Roboto Mono', monospace; | ||
font-weight: 400; | ||
font-variant-ligatures: none; | ||
} | ||
|
||
[data-md-color-scheme=slate] { | ||
/* more legible links: */ | ||
--md-typeset-a-color: #a6c1f1; | ||
} | ||
|
||
.md-content { | ||
/* so the content expands a bit, mainly for `program --help` outputs */ | ||
min-width: unset; | ||
} | ||
|
||
/* restrict block width to that of container */ | ||
.md-typeset pre code { | ||
max-width: 100%; | ||
display: inline-block; | ||
white-space: pre-wrap; | ||
overflow-x: scroll; | ||
word-wrap: break-word; | ||
padding: 1rem; | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
var paletteSwitcher1 = document.getElementById("__palette_1"); | ||
var paletteSwitcher2 = document.getElementById("__palette_2"); | ||
|
||
paletteSwitcher1.addEventListener("change", function () { | ||
console.debug('change paletteSwitcher1=', paletteSwitcher1) | ||
location.reload(); | ||
}); | ||
|
||
paletteSwitcher2.addEventListener("change", function () { | ||
console.debug('change paletteSwitcher2=', paletteSwitcher2) | ||
location.reload(); | ||
}); |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,93 @@ | ||
--- | ||
description: Process ocean audio data archives to daily analysis products of hybrid millidecade spectra using PyPAM. | ||
--- | ||
|
||
!!! note "WIP" | ||
Thanks for your interest in PBP. This documentation is still a work in progress :construction:. | ||
Please get in touch if you have any questions or suggestions. | ||
|
||
# MBARI PBP | ||
|
||
The [`mbari-pbp`](https://pypi.org/project/mbari-pbp/) package lets you | ||
process ocean audio data archives to daily analysis products of hybrid millidecade spectra using | ||
[PyPAM](https://github.com/lifewatch/pypam/). | ||
|
||
You can use PBP by directly running the included CLI programs, | ||
or as a dependency in your own Python code. | ||
|
||
**Features**: | ||
|
||
- [x] Audio metadata extraction for managed timekeeping | ||
- [x] Start and duration of recognized wav and flac sound files either locally or in cloud (JSON) | ||
- [x] Coverage plot of sound recordings | ||
- [x] Audio file processing | ||
- [x] Frequency and PSD array output | ||
- [x] Concatenation of processed 1-minute segments for daily product | ||
- [x] Calibration with given sensitivity file (NetCDF), or flat sensitivity value | ||
- [x] Data products | ||
- [x] NetCDF with metadata | ||
- [x] Summary plot | ||
- [x] Cloud processing | ||
- [x] Inputs can be downloaded from and uploaded to S3 | ||
- [x] Inputs can be downloaded from public GCS bucket | ||
- [ ] Outputs can be uploaded to GCS | ||
|
||
## Installation | ||
|
||
The only requirement for your environment is Python 3.9, 3.10, or 3.11.[^1] | ||
Make sure your Python installation includes the `pip` and `venv` modules, | ||
or install them separately as needed. | ||
|
||
You can run `python3 --version` to check the version of Python installed. | ||
|
||
[^1]: As currently [required by PyPAM](https://github.com/lifewatch/pypam/blob/29e82f0c5c6ce43b457d76963cb9d82392740654/pyproject.toml#L16). | ||
|
||
As a general practice, it is recommended to use a virtual environment for the installation. | ||
```shell | ||
python3.9 -m venv virtenv | ||
source virtenv/bin/activate | ||
``` | ||
|
||
Install the package: | ||
```shell | ||
pip install mbari-pbp | ||
``` | ||
|
||
!!! note "" | ||
If you are upgrading from a previous version, you can use the following command: | ||
```shell | ||
pip install --upgrade mbari-pbp | ||
``` | ||
|
||
## Advanced Installation | ||
|
||
If you have already installed the package with the `pip install mbari-pbp` command and want to switch to a source install, | ||
use the following command. This will get the latest version :construction: from the main branch. | ||
|
||
```shell | ||
pip install --no-cache-dir --force-reinstall git+https://github.com/mbari-org/pbp.git | ||
``` | ||
|
||
## Programs | ||
|
||
The package includes the following CLI programs: | ||
|
||
| Program | Description | | ||
|---------------------------------|------------------------------------------------| | ||
| [`pbp-meta-gen`](pbp-meta-gen/) | Generate JSON files with audio metadata. | | ||
| [`pbp-hmb-gen`](pbp-hmb-gen/) | Main HMB generation program. | | ||
| [`pbp-cloud`](pbp-cloud/) | Program for cloud-based processing. | | ||
| [`pbp-hmb-plot`](pbp-hmb-plot/) | Utility program to plot resulting HMB product. | | ||
|
||
|
||
## References | ||
|
||
- PyPAM - Python tool for Passive Acoustic Monitoring – | ||
<https://doi.org/10.5281/zenodo.6044593> | ||
- Computation of single-sided mean-square sound pressure spectral density with 1 Hz resolution follows | ||
ISO 18405 3.1.3.13 (International Standard ISO 18405:2017(E), Underwater Acoustics – Terminology. Geneva: ISO) | ||
– <https://www.iso.org/standard/62406.html> | ||
- Hybrid millidecade spectra: A practical format for exchange of long-term ambient sound data – | ||
<https://asa.scitation.org/doi/10.1121/10.0003324> | ||
- Erratum: Hybrid millidecade spectra – | ||
<https://asa.scitation.org/doi/10.1121/10.0005818> |
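
The Installation section of the documentation page above lists Python 3.9, 3.10, or 3.11 as the only requirement. As a minimal sketch, that check can be done from Python before installing. The helper name `is_supported_python` is hypothetical and is not part of `mbari-pbp`; the version set is copied from the text above and may change as PyPAM's requirements change.

```python
import sys

# Versions named in the Installation section (as required by PyPAM at the time).
SUPPORTED_MINOR_VERSIONS = {(3, 9), (3, 10), (3, 11)}


def is_supported_python(version_info=None):
    """Return True if the (major, minor) version is one the docs list as supported.

    Hypothetical helper for illustration; it is not provided by mbari-pbp.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED_MINOR_VERSIONS


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "not listed as supported"
    print(f"Python {current} is {status} for mbari-pbp")
```

This is equivalent to reading the output of `python3 --version` by eye; it is only convenient when the check has to run inside a setup script.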
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
!!! note | ||
This is a placeholder for documenting the use of PBP in notebooks. | ||
|
||
# Notebooks | ||
|
||
- [PBP-NRS11-batch.ipynb](https://colab.research.google.com/drive/1RaFVZzdRt88gY1SR_J34XMdRLgBjEdI-) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
!!! note | ||
This is a placeholder for the documentation of the `pbp-cloud` command-line program. | ||
|
||
# Processing in the cloud | ||
|
||
|
||
## The `pbp-cloud` program | ||
TODO: proper description of the `pbp-cloud` program. | ||
|
||
For now, the following is directly adapted from the source code: | ||
|
||
---- | ||
|
||
TODO: Adjustments for GCS, as the program is still focused only on S3. | ||
|
||
|
||
By cloud-based processing we mean the ability | ||
to get input files (JSON and WAV) from S3 and write output files to S3. | ||
|
||
All program parameters are to be passed via environment variables: | ||
|
||
- `DATE`: (Required) | ||
The date to process. Format: "YYYYMMDD". | ||
- `S3_JSON_BUCKET_PREFIX`: (Optional) | ||
Bucket prefix to be used to locate the YYYYMMDD.json file. | ||
By default, `s3://pacific-sound-metadata/256khz`. | ||
- `S3_OUTPUT_BUCKET`: (Optional) | ||
The bucket to write the generated output to. | ||
Typically, this is to be provided, but it is optional to facilitate testing. | ||
- `OUTPUT_PREFIX`: (Optional) | ||
Output filename prefix. By default, `milli_psd_`. | ||
The resulting file will be named as `<OUTPUT_PREFIX><DATE>.nc`. | ||
- `GLOBAL_ATTRS_URI`: (Optional) | ||
URI of JSON file with global attributes to be added to the NetCDF file. | ||
- `VARIABLE_ATTRS_URI`: (Optional) | ||
URI of JSON file with attributes to associate with the variables in the NetCDF file. | ||
- `VOLTAGE_MULTIPLIER`: (Optional) | ||
Applied on the loaded signal. | ||
- `SENSITIVITY_NETCDF_URI`: (Optional) | ||
URI of sensitivity NetCDF file that should be used to calibrate the result. | ||
- `SENSITIVITY_FLAT_VALUE`: (Optional) | ||
Flat sensitivity value to be used for calibration | ||
if `SENSITIVITY_NETCDF_URI` is not given. | ||
- `SUBSET_TO`: (Required) Format: `lower,upper`. | ||
Subset the resulting PSD to `[lower, upper)`, in terms of central frequency. | ||
|
||
TODO: retrieve sensitivity information using PyHydrophone when none | ||
of the `SENSITIVITY_*` environment variables above are given. | ||
|
||
Mainly for testing purposes, these environment variables are also inspected: | ||
|
||
- `CLOUD_TMP_DIR`: (Optional) | ||
Local workspace for downloads and for generated files to be uploaded. | ||
By default, `cloud_tmp`. | ||
|
||
- `MAX_SEGMENTS`: (Optional) | ||
0, the default, means no restriction, that is, all segments for each day | ||
will be processed. | ||
|
||
- `ASSUME_DOWNLOADED_FILES`: (Optional) | ||
If "yes", then if any destination file for a download exists, | ||
it is assumed downloaded already. | ||
The default is that downloads are always performed. | ||
|
||
- `RETAIN_DOWNLOADED_FILES`: (Optional) | ||
If "yes", do not remove any downloaded files after use. | ||
The default is that any downloaded file is removed after use. | ||
|
||
|
||
## Running on AWS | ||
|
||
TODO: Describe how to run the program on AWS. | ||
|
||
## Running on GCP | ||
|
||
TODO: Describe how to run the program on GCP. |
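
Since `pbp-cloud` takes all of its parameters from environment variables, it can help to assemble and sanity-check them before launching the program. The following is a minimal sketch under stated assumptions: the helpers `build_pbp_cloud_env` and `expected_output_name` are hypothetical and not part of `mbari-pbp`, the bucket name and frequency bounds are placeholders, and the `lower < upper` check is an added sanity check rather than documented behavior. Only the variable names, the `YYYYMMDD` and `lower,upper` formats, and the `milli_psd_` default come from the description above.

```python
from datetime import datetime

DEFAULT_OUTPUT_PREFIX = "milli_psd_"  # default documented for OUTPUT_PREFIX


def build_pbp_cloud_env(date, subset_to, **optional):
    """Assemble the environment variables documented for pbp-cloud.

    Hypothetical helper: `date` is "YYYYMMDD", `subset_to` is (lower, upper),
    and `optional` holds any other documented variables, e.g. S3_OUTPUT_BUCKET.
    """
    if len(date) != 8 or not date.isdigit():
        raise ValueError(f"DATE must be in YYYYMMDD format: {date!r}")
    datetime.strptime(date, "%Y%m%d")  # also rejects impossible dates such as 20221301
    lower, upper = subset_to
    if lower >= upper:  # assumption: an empty [lower, upper) range is a mistake
        raise ValueError("SUBSET_TO requires lower < upper")
    env = {"DATE": date, "SUBSET_TO": f"{lower},{upper}"}
    # Environment variable values must be strings.
    env.update({name: str(value) for name, value in optional.items()})
    return env


def expected_output_name(env):
    """Name of the resulting file: <OUTPUT_PREFIX><DATE>.nc."""
    return f"{env.get('OUTPUT_PREFIX', DEFAULT_OUTPUT_PREFIX)}{env['DATE']}.nc"


if __name__ == "__main__":
    env = build_pbp_cloud_env(
        "20220902",
        (10, 100000),  # illustrative bounds only
        S3_OUTPUT_BUCKET="my-output-bucket",  # placeholder; exact value format not documented here
        MAX_SEGMENTS=5,
    )
    print(expected_output_name(env))
    # To launch the program with these settings (requires mbari-pbp installed
    # and suitable cloud credentials), one option would be:
    #   import os, subprocess
    #   subprocess.run(["pbp-cloud"], env={**os.environ, **env}, check=True)
```

Keeping the settings in a plain dictionary makes it easy to log exactly what a run was given, which is useful while the AWS and GCP instructions are still to be written.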
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,82 @@ | ||
!!! danger "WIP" | ||
|
||
# HMB Generation | ||
|
||
`pbp-hmb-gen` is the main program for generating the HMB product. | ||
It processes ocean audio data archives to daily analysis products of hybrid millidecade spectra using PyPAM. | ||
|
||
The program accepts several options. | ||
A typical use mainly involves the following: | ||
|
||
| Option | To indicate | | ||
| ----------------- |--------------- | | ||
| `--json-base-dir` | base directory for JSON files | | ||
| `--date` | date to be processed | | ||
| `--global-attrs` | URI of a YAML file with global attributes to be added to the NetCDF file | | ||
| `--variable-attrs`| URI of a YAML file with attributes to associate with the variables in the NetCDF file | | ||
| `--output-dir` | output directory | | ||
| `--output-prefix` | output filename prefix | | ||
| `--subset-to` | subset of the resulting PSD in terms of central frequency | | ||
|
||
The following also apply, depending on the recorder: | ||
|
||
| Option | To indicate | | ||
| ------------------------ |--------------- | | ||
| `--voltage-multiplier` | applied on the loaded signal | | ||
| `--sensitivity-uri` | URI of sensitivity NetCDF for calibration of result | | ||
| `--sensitivity-flat-value`| flat sensitivity value to be used for calibration | | ||
|
||
|
||
## Usage | ||
|
||
```shell | ||
$ pbp-hmb-gen --help | ||
``` | ||
```text | ||
usage: pbp-hmb-gen [-h] [--version] --json-base-dir dir [--audio-base-dir dir] [--global-attrs uri] [--set-global-attr key value] [--variable-attrs uri] | ||
[--audio-path-map-prefix from~to] [--audio-path-prefix dir] --date YYYYMMDD [--voltage-multiplier value] [--sensitivity-uri file] | ||
[--sensitivity-flat-value value] --output-dir dir [--output-prefix prefix] [--s3] [--s3-unsigned] [--gs] [--download-dir dir] [--assume-downloaded-files] | ||
[--retain-downloaded-files] [--max-segments num] [--subset-to lower upper] | ||
Process ocean audio data archives to daily analysis products of hybrid millidecade spectra using PyPAM. | ||
optional arguments: | ||
-h, --help show this help message and exit | ||
--version show program's version number and exit | ||
--json-base-dir dir JSON base directory | ||
--audio-base-dir dir Audio base directory. By default, none | ||
--global-attrs uri URI of JSON file with global attributes to be added to the NetCDF file. | ||
--set-global-attr key value | ||
Replace {{key}} with the given value for every occurrence of {{key}} in the global attrs file. | ||
--variable-attrs uri URI of JSON file with attributes to associate to the variables in the NetCDF file. | ||
--audio-path-map-prefix from~to | ||
Prefix mapping to get actual audio uri to be used. Example: 's3://pacific-sound-256khz-2022~file:///PAM_Archive/2022'. | ||
--audio-path-prefix dir | ||
Ad hoc path prefix for sound file location, for example, /Volumes. By default, no prefix applied. | ||
--date YYYYMMDD The date to be processed. | ||
--voltage-multiplier value | ||
Applied on the loaded signal. | ||
--sensitivity-uri file | ||
URI of sensitivity NetCDF for calibration of result. Has precedence over --sensitivity-flat-value. | ||
--sensitivity-flat-value value | ||
Flat sensitivity value to be used for calibration. | ||
--output-dir dir Output directory | ||
--output-prefix prefix | ||
Output filename prefix | ||
--s3 s3 access is involved, possibly with required credentials. | ||
--s3-unsigned s3 access is involved, not requiring credentials. | ||
--download-dir dir Directory for any downloads (e.g., when s3 or gs is involved). | ||
--assume-downloaded-files | ||
If any destination file for a download exists, assume it was downloaded already. | ||
--retain-downloaded-files | ||
Do not remove any downloaded files after use. | ||
--max-segments num Test convenience: limit number of segments to process. By default, 0 (no limit). | ||
--subset-to lower upper | ||
Subset the resulting PSD to [lower, upper), in terms of central frequency. | ||
Examples: | ||
pbp-hmb-gen --json-base-dir=tests/json \ | ||
--audio-base-dir=tests/wav \ | ||
--date=20220902 \ | ||
--output-dir=output | ||
``` |
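
Two of the options in the help text above describe small transformations that are easier to see in code: `--audio-path-map-prefix from~to` rewrites the start of an audio URI, and `--subset-to lower upper` keeps the half-open range `[lower, upper)` of central frequencies. The sketch below re-implements both ideas in plain Python for illustration only. The functions `map_audio_uri` and `subset_to` are hypothetical, are not the code used by `pbp-hmb-gen`, and the actual program may differ in details; the sample file path and PSD values are made up.

```python
def map_audio_uri(uri, prefix_map):
    """Apply a `from~to` prefix mapping, as in --audio-path-map-prefix.

    Hypothetical re-implementation for illustration only.
    """
    old_prefix, new_prefix = prefix_map.split("~", 1)
    if uri.startswith(old_prefix):
        return new_prefix + uri[len(old_prefix):]
    return uri  # URIs that do not match the prefix are left unchanged


def subset_to(center_freqs, psd_values, lower, upper):
    """Keep the (frequency, PSD) pairs whose central frequency is in [lower, upper)."""
    return [
        (freq, value)
        for freq, value in zip(center_freqs, psd_values)
        if lower <= freq < upper
    ]


if __name__ == "__main__":
    # Mapping string taken from the example in the help text above.
    mapping = "s3://pacific-sound-256khz-2022~file:///PAM_Archive/2022"
    # The file path below is a made-up example.
    print(map_audio_uri("s3://pacific-sound-256khz-2022/09/example.wav", mapping))
    # Made-up frequencies and PSD values; the upper bound 100 is excluded.
    print(subset_to([5, 10, 50, 100], [0.1, 0.2, 0.3, 0.4], lower=10, upper=100))
```

Note that the interval is closed on the left and open on the right, so a band centered exactly on `upper` is dropped while one centered exactly on `lower` is kept.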