The `Eobs_Data_Reader` function is an R-based tool that automates the pre-processing of Acceleration, Magnetometry, and Orientation data collected from Eobs devices. Built on data.table, it speeds up the extraction and processing of large datasets and is designed to handle individual animal datasets downloaded directly from Movebank.
[This function is currently in development. Your feedback and testing are highly appreciated! If you encounter any issues or have suggestions for improvement, please feel free to share them.]
- Subset Data: Enables users to subset the dataset to a specified timestamp interval.
- Data Conversion: Converts row-wise, space-separated sensor readings (Acceleration, Magnetometry, Quaternions) into long format, where each row represents a single axis/component reading.
- Dynamic Handling of Sampling Rates:
  - Provides an option to standardize acceleration data to a user-specified sampling frequency (Hz), ensuring consistency when multiple sampling rates exist in the dataset.
  - Handles cases where fewer than three acceleration axes are recorded.
- Burst Duration Standardization:
  - Option to standardize acceleration burst durations across sensor types (Legacy and IMU ACC sensors) via the new `standardise_burst_duration` parameter.
  - If `standardise_burst_duration` is set to TRUE, the function automatically identifies the smallest typical acceleration burst duration representative of recording behavior (excluding anomalies or outliers) by:
    - Calculating statistics to identify burst duration variability.
    - Filtering durations using the Interquartile Range (IQR) to exclude extreme outliers.
    - Selecting the smallest typical median burst duration for standardization.
  - Subdivides bursts into uniform durations and adds additional standardized metrics.
- Timestamp Interpolation: Automatically interpolates timestamps within data bursts, ensuring uniformly spaced time intervals.
- Automatically recognizes the sensor type (Legacy or IMU Accelerometer) and applies the appropriate transformation to convert raw acceleration data into standardized units of g (1 g ≈ 9.81 m/s²).
- Provides columns and visualizations for:
  - Acceleration burst durations and intervals over time.
  - Sensor type classification (Legacy vs IMU).
  - Burst IDs for continuous acceleration recordings (but see also `standardise_burst_duration`).
- Computes per-burst acceleration metrics (across `burst_id` and, if `standardise_burst_duration` is set to TRUE, `standardized_burst_id`):
  - Static & Dynamic Acceleration: Calculated using user-defined rolling averages.
  - VeDBA: Computes Vectorial Dynamic Body Acceleration for each burst, accounting for continuous sampling.
- Converts quaternions (if 20 Hz Orientation data is present) into mathematically meaningful normalised components (W, X, Y, Z) for downstream analysis.
- Processes magnetometry data (if present) into long format.
- Validates required columns for each data type, skipping sections with missing data while providing clear console messages.
- Flags potential duplicate timestamps in a dedicated `duplicate_times` column.
- Includes progress bars and detailed console messages to keep users informed about the current processing step.
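
The long-format conversion listed above can be illustrated with a minimal base R sketch. It is not the function's own code (which uses data.table). The input column names (`axes`, `raw`), the helper `to_long()`, and the assumption that readings are interleaved as X Y Z X Y Z within each row are all hypothetical and chosen for illustration only.

```r
# Hypothetical input: one row per burst, readings assumed interleaved X Y Z X Y Z ...
bursts <- data.frame(
  row_id = 1:2,
  axes   = c("XYZ", "XYZ"),
  raw    = c("2048 2050 2300 2049 2051 2298", "2047 2052 2301"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: expand one space-separated burst into one row per axis reading
to_long <- function(row_id, axes, raw) {
  values     <- as.numeric(strsplit(trimws(raw), "\\s+")[[1]])
  axis_names <- strsplit(axes, "")[[1]]
  n_axes     <- length(axis_names)
  # Keep only complete samples (drops an incomplete trailing sample, if any)
  n_samples  <- length(values) %/% n_axes
  values     <- values[seq_len(n_samples * n_axes)]
  data.frame(
    row_id = row_id,
    sample = rep(seq_len(n_samples), each = n_axes),
    axis   = rep(axis_names, times = n_samples),
    value  = values,
    stringsAsFactors = FALSE
  )
}

# Apply the helper to every burst and stack the results
long <- do.call(rbind, Map(to_long, bursts$row_id, bursts$axes, bursts$raw))
```

Because the axis names are read from the `axes` string, the same sketch also covers bursts where fewer than three axes were recorded (e.g. `"XZ"`).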
The output dynamically adjusts based on available data types:
- Acceleration Data Only: Returns a data frame named `"Acceleration Data"`.
- All Data Types Available:
  - If Magnetometry and Quaternion Data timestamps match, these are combined into a data frame named `"Orientation Data"` and returned alongside `"Acceleration Data"`.
  - If timestamps do not match, a list is returned with relevant names: `"Magnetometry Data"`, `"Quaternion Data"`, and `"Acceleration Data"`.
- Magnetometry and Quaternion Data Only:
  - If timestamps match, the datasets are combined into a single data frame named `"Orientation Data"`.
  - If timestamps do not match, a list with `"Magnetometry Data"` and `"Quaternion Data"` is returned.
- Combination of Acceleration Data with Magnetometry or Quaternion Data: Returns a list of the available datasets with appropriate names.
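
Since the return value may be either a single data frame or a named list, downstream code may want to treat both cases uniformly. The sketch below is one possible way to do that; `as_named_list()` is a hypothetical helper, not part of `Eobs_Data_Reader`, and the mock objects stand in for real output.

```r
# Hypothetical helper: always return a named list of data frames
as_named_list <- function(result, default_name = "Acceleration Data") {
  if (is.data.frame(result)) {
    # A lone data frame is wrapped in a list under the supplied name
    out <- list(result)
    names(out) <- default_name
    return(out)
  }
  result
}

# Mock outputs for illustration only
acc_only <- data.frame(acc_x = c(2048, 2050))
mixed <- list(
  "Magnetometry Data" = data.frame(mag_x = 1),
  "Quaternion Data"   = data.frame(quat_w = 1)
)

names(as_named_list(acc_only))
names(as_named_list(mixed))
```

If a single data frame holds orientation rather than acceleration data, pass `default_name = "Orientation Data"` instead.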
- Download Data from Movebank:
  - Ensure the dataset includes relevant columns for Acceleration, Magnetometry, and/or Orientation data.
- Load Data in R:
  - The file must be read in using `read.csv()` and NOT `fread()`:

        df <- read.csv("example_movebank_data.csv")

- Run the Function:

      processed_data <- Eobs_Data_Reader(
        data = df,
        rolling_mean_width = 40, # If 20 Hz frequency, this corresponds to 2 s running mean
        standardised_freq_rate = 20,
        standardised_burst_duration = TRUE,
        start_timestamp = "2024-05-01 00:00:00",
        end_timestamp = "2024-06-01 23:59:59",
        plot = TRUE
      )

The function relies on the following R packages. If any are not installed, the function will automatically install them before proceeding.
- data.table
- pbapply
- tidyr
- dplyr
- ggplot2
- cowplot
- viridis
- knitr
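
To see in advance which of these packages would be installed automatically, a check along the following lines can be run first. This is a small sketch using base R's `requireNamespace()`; it only reports missing packages and does not install anything.

```r
required <- c("data.table", "pbapply", "tidyr", "dplyr",
              "ggplot2", "cowplot", "viridis", "knitr")

# TRUE for each package that can be loaded from the current library paths
is_installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

not_installed <- required[!is_installed]
if (length(not_installed) > 0) {
  message("Not yet installed: ", paste(not_installed, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```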
Primary Metadata Columns:
- `burst_id`: Unique identifier for each burst, based on time and sensor continuity.
- `row_id`: Row index from the original dataset, used to map back to the input data.
- `timestamp`: Original timestamp for each sample, as provided in the raw dataset.
- `individual.taxon.canonical.name`: Scientific name of the tracked species.
- `tag.local.identifier`: Identifier for the tracking tag.
- `individual.local.identifier`: Identifier for the tracked individual.

Sensor-Specific Metadata:
- `eobs.acceleration.axes`: Axes recorded for acceleration (e.g., XYZ or subsets like XZ).
- `eobs.acceleration.sampling.frequency.per.axis`: Sampling rate (in Hz) for acceleration data. If `standardized_freq_rate` is set, this column will reflect that value.

Processed Acceleration Data:
- `acc_x`, `acc_y`, `acc_z`: Raw acceleration values (digital sensor counts) for the respective axes after long-format conversion.

Timestamps and Durations:
- `interpolated_timestamp`: Interpolated timestamps within bursts for uniform spacing.
- `time_diff`: Time difference (in seconds) between consecutive samples.
- `duplicate_times`: Boolean flag indicating duplicate `interpolated_timestamp` values.
- `sensor_sequence`: Identifier for uninterrupted sequences of the same sensor type.
- `sampling_interval`: Time interval (in seconds) between consecutive samples within uninterrupted sequences.
- `burst_duration`: Total duration of each burst (in seconds).

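
As an illustration of how `interpolated_timestamp`, `time_diff`, and `duplicate_times` relate to one another, the sketch below spaces the samples of a single burst evenly from its start time. It assumes the burst timestamp marks the first sample and that spacing is exactly 1 / sampling frequency; `interpolate_burst()` is a hypothetical helper, not the function's internal code.

```r
# Hypothetical helper: evenly spaced timestamps for one burst
interpolate_burst <- function(start_time, n_samples, freq_hz) {
  # Sample i is offset by (i - 1) / freq_hz seconds from the burst start
  start_time + (seq_len(n_samples) - 1) / freq_hz
}

burst_start <- as.POSIXct("2024-05-01 00:00:00", tz = "UTC")
interpolated_timestamp <- interpolate_burst(burst_start, n_samples = 5, freq_hz = 20)

# Seconds between consecutive samples (NA for the first sample)
time_diff <- c(NA, diff(as.numeric(interpolated_timestamp)))

# TRUE where an interpolated timestamp repeats an earlier one
duplicate_times <- duplicated(interpolated_timestamp)
```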
Standardized Burst Data (if `standardise_burst_duration = TRUE`):
- `standardized_burst_id`: Unique identifier for standardized bursts, subdividing original bursts into smaller, uniform durations.
- `standardized_burst_duration`: Duration (in seconds) of each standardized burst, typically equal to the standardized duration unless it is a tail-end segment.
- `is_standardized`: Boolean flag indicating whether the segment matches the standardized duration (TRUE) or is shorter (e.g., tail-end bursts).

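
The following base R sketch shows one possible reading of the burst standardization logic: filter burst durations with a 1.5 × IQR rule, take the median per sensor type, use the smallest of those medians as the target, then cut a burst into segments of that length. The exact statistics used by the function are not documented here, so the 1.5 × IQR fences, the helper names (`typical_duration()`, `subdivide_burst()`), and the example data are all assumptions.

```r
# Hypothetical burst durations (seconds) with one outlier per sensor type
burst_info <- data.frame(
  sensor         = c(rep("Legacy", 6), rep("IMU", 6)),
  burst_duration = c(10, 10, 10, 10, 10, 95, 2, 2, 2, 2, 2, 0.1)
)

# Median duration after dropping values outside the 1.5 * IQR fences (assumed rule)
typical_duration <- function(d) {
  q    <- quantile(d, c(0.25, 0.75), names = FALSE)
  iqr  <- q[2] - q[1]
  keep <- d >= q[1] - 1.5 * iqr & d <= q[2] + 1.5 * iqr
  median(d[keep])
}

per_sensor <- tapply(burst_info$burst_duration, burst_info$sensor, typical_duration)
target_s   <- min(per_sensor)  # smallest typical duration across sensor types

# Hypothetical helper: split one burst into segments of target_s seconds
subdivide_burst <- function(n_samples, freq_hz, target_s) {
  per_segment <- round(target_s * freq_hz)            # samples in a full segment
  segment     <- (seq_len(n_samples) - 1) %/% per_segment + 1
  seg_n       <- tabulate(segment)[segment]           # samples in each row's segment
  data.frame(
    standardized_burst_id       = segment,
    standardized_burst_duration = seg_n / freq_hz,
    is_standardized             = seg_n == per_segment
  )
}

# A 5 s burst sampled at 2 Hz: two full segments plus a shorter tail-end segment
segments <- subdivide_burst(n_samples = 10, freq_hz = 2, target_s = target_s)
```

Working on sample counts rather than on time offsets avoids floating-point edge cases when deciding which segment a sample belongs to.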
Transformed Acceleration Data:
- `acc_x_g`, `acc_y_g`, `acc_z_g`: Raw acceleration values converted to units of g, where 1 g ≈ 9.81 m/s².

Static and Dynamic Acceleration (per `burst_id`):
- `acc_x_static`, `acc_y_static`, `acc_z_static`: Rolling mean (static component) of acceleration for each axis.
- `acc_x_dynamic`, `acc_y_dynamic`, `acc_z_dynamic`: Dynamic component of acceleration (raw - static) for each axis.
- `VeDBA`: Vectorial Dynamic Body Acceleration, computed as the Euclidean norm of dynamic acceleration values (indicative of movement intensity).

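
The relationship between the static, dynamic, and VeDBA columns can be sketched for a single burst as follows. The rolling mean here uses a centred `stats::filter()` window as a stand-in; the function's own rolling-mean implementation and its edge handling may differ, and the example values are made up. With `rolling_mean_width = 40` at 20 Hz the window would span 2 s; a width of 3 is used below to keep the example small.

```r
# Made-up acceleration values in g for one burst
acc <- data.frame(
  acc_x_g = rep(0, 7),
  acc_y_g = rep(0, 7),
  acc_z_g = c(1, 1, 1, 1.3, 1, 1, 1)
)
width <- 3  # rolling mean window, in samples

# Centred rolling mean; positions without a full window are NA
rolling_mean <- function(x, width) {
  as.numeric(stats::filter(x, rep(1 / width, width), sides = 2))
}

for (axis in c("x", "y", "z")) {
  raw_col <- paste0("acc_", axis, "_g")
  static  <- rolling_mean(acc[[raw_col]], width)
  acc[[paste0("acc_", axis, "_static")]]  <- static
  acc[[paste0("acc_", axis, "_dynamic")]] <- acc[[raw_col]] - static  # raw - static
}

# VeDBA: Euclidean norm of the three dynamic components
acc$VeDBA <- sqrt(acc$acc_x_dynamic^2 + acc$acc_y_dynamic^2 + acc$acc_z_dynamic^2)
```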
Standardized Metrics (if standardised_burst_duration = TRUE):
- `standardized_acc_x_static`, `standardized_acc_y_static`, `standardized_acc_z_static`: Static acceleration components for standardized bursts.
- `standardized_acc_x_dynamic`, `standardized_acc_y_dynamic`, `standardized_acc_z_dynamic`: Dynamic acceleration components for standardized bursts.
- `standardized_VeDBA`: VeDBA calculated for each standardized burst.

Magnetometry and Quaternion Data Columns:
- `row_id`: Row index from the input data.
- `timestamp`: Original timestamps.
- `individual.taxon.canonical.name`: Taxonomic name of the species associated with the data.
- `tag.local.identifier`: Identifier for the individual tag used to collect the data, typically unique within a dataset.
- `individual.local.identifier`: Identifier for the individual animal associated with the tag, used to distinguish between animals in the dataset.
- `eobs.magnetometery.axes`: Indicates the axes (X, Y, Z) recorded for magnetometry data. Always set to "XYZ" in this dataset.
- `mag.magnetic.field.sampling.frequency.per.axis`: Sampling frequency (in Hz) of the magnetometry sensor, indicating how frequently data points are recorded per axis.
- `mag_x`, `mag_y`, `mag_z`: Magnetic field data (XYZ axes).
- `burst_length`: Number of data points within a burst, representing the duration of a continuous recording session.
- `interpolated_timestamp`: Interpolated timestamps within bursts for uniform spacing.
- `time_diff`: Time difference (in seconds) between consecutive samples.
- `duplicate_times`: Boolean flag indicating duplicate `interpolated_timestamp` values.
- `burst_id`: Unique identifier for each burst, based on time and sensor continuity.
- `quat_w`, `quat_x`, `quat_y`, `quat_z`: Quaternion data components, normalized.

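
For reference, normalizing a quaternion means scaling its four components so that their Euclidean norm equals 1. The sketch below shows that generic operation only; how `Eobs_Data_Reader` decodes the raw Eobs quaternion values before normalizing is not covered here, and `normalise_quat()` is a hypothetical helper.

```r
# Hypothetical helper: scale each quaternion (one per row) to unit length
normalise_quat <- function(w, x, y, z) {
  norm <- sqrt(w^2 + x^2 + y^2 + z^2)
  data.frame(
    quat_w = w / norm,
    quat_x = x / norm,
    quat_y = y / norm,
    quat_z = z / norm
  )
}

# Two made-up quaternions with norms 2 and 5
q <- normalise_quat(w = c(2, 0), x = c(0, 3), y = c(0, 4), z = c(0, 0))

# After normalization every row has norm 1
row_norms <- sqrt(rowSums(q^2))
```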
This project is licensed under the MIT License.
For questions, bug reports, suggestions, or contributions, please contact:
- Richard Gunner
- Email: [email protected]
- GitHub: Richard6195