-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathindex.qmd
89 lines (61 loc) · 3.66 KB
/
index.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
---
title: ""
author:
- name: Piotr Tompalski
email: [email protected]
format: html
engine: knitr
lightbox: true
date: 08/01/2024
date-modified: today
---
```{r, echo = FALSE, warning=FALSE, message=FALSE}
options(digits = 2)
knitr::opts_chunk$set(
collapse = TRUE,
fig.path = "graphics/",
fig.width = 7,
fig.height = 5,
comment = "#>",
dev = "svg",
dpi = 300
)
```
# LidarSpeedTests
::: callout-warning
This is an ongoing work and the results may contain errors.
:::
This repository is dedicated to evaluating the performance of various point cloud processing tools in handling typical airborne laser scanning (ALS) data processing tasks. The goal is to provide users with comparative insights into processing time, CPU, RAM, and disk usage across different tools and configurations. More importantly, benchmark results could help decide which processing parameters are most optimal given the dataset and workstation characteristics.
## Project Overview
Point cloud processing tools offer a range of functionalities, often with overlapping capabilities. Users face decisions about which tool to use and how to configure settings for optimal performance. This project benchmarks several tools to guide these decisions.
## Benchmarking Methodology
- **Uniform Task Performance**: Each tool is tested against standardized tasks to ensure comparable results.
- **Resource Utilization**: We measure and record processing time, CPU and RAM usage.
- **Iterative Testing**: Tools are tested under various conditions to assess performance impacts:
- Different numbers of workers (CPU cores/threads)
- Various storage devices including local (SSD, HDD) and network drives
- Diverse input data characteristics
## Input Data
The benchmark uses a subset of ALS data acquired over the Petawawa research forest located in in Ontario, Canada. The dataset is divided into 1 x 1 km tiles (100 tiles in total). This dataset is freely available and can be downloaded from https://opendata.nfis.org/mapserver/PRF.html.
![](graphics/data.png)
The average point density of the dataset is 13.6 pts/m². To facilitate a comprehensive analysis, the original dataset was systematically reduced to create variants with different densities: 1, 2, 5, 10. To simulate dataset with higher point density, the data was artificially densified by duplicating and then introducing a random noise to a subset of points. This procedure was used to generate datasets with densities of 20, and 50 pts/m². Benchmarks were conducted across these modified datasets to assess performance at varying point densities.
![](graphics/data_density_crossections.png)
## Included Software Packages
So far the benchmark focuses on lidar processing tools available in R (`lidR`, `lasR`), as well as `lastools`.
- `lidR`: [github.com/r-lidar/lidR](https://github.com/r-lidar/lidR)
- `lasR`: [github.com/r-lidar/lasR](https://github.com/r-lidar/lasR)
- `lastools`: [rapidlasso.de/product-overview](https://rapidlasso.de/product-overview/)
## Benchmark Tasks
Tasks are categorized into 'Simple' and 'Complex' to differentiate between single-step operations and multi-step processing pipelines, respectively.
### Simple Tasks
- Ground classification
- Digital Terrain Model (DTM) generation
- Canaopy Height model (CHM) generation
- Height normalization
- Point cloud metrics calculation (two scenarios: single metric, multiple metrics)
### Complex Tasks (in progress)
- a workflow consisting of the following tasks: Generate DTM, normalize heights, generate CHM
- ...
### Additional benchmarks:
- Influence of tile size on processing time
- Influence of drive configuration on processing time