Moving forward, some notes #4

khufkens · 2024-02-28T09:14:06Z

The premise of the manuscript would be that the physiological basis of fLUE can be used as a target for a new drought index through machine learning. Two caveats remain.

1. Potential data leakage and circularity in the use of the data.

   - i.e. MODIS is used as input in both analyses (fLUE, this work)
   - This has been addressed by showing sensitivity to fLUE using Landsat data, thereby decoupling the calculated index from the input data (while keeping all other things static). It must be noted that the model structure does change when using Landsat data.

2. How does this more complex model compare to known indices? The idea in this work has been that an ML model trained on fLUE should outperform existing indices when it comes to its relation to fLUE. Is this the case?

   - It seems that the model generally outperforms the bulk of the indices, but there are exceptions. One needs to check whether the same indices return at the top of the list across clusters and sites. The parsimonious solution of the ML model might be a benefit in comparison to a tailored site- or vegetation-specific index.

Point 1 has been answered through the use of Landsat data, with results that hold up. Point 2 has been shown by a cross-comparison against a zoo of indices, but needs nuance with respect to the individual indices.

See vignettes
https://geco-bern.github.io/index_based_drought_monitoring/

A third caveat remains, but it is part of any simple index, namely the fact that this metric/model is diagnostic only (calculated for each time step) and not prognostic.

khufkens · 2024-02-28T12:51:12Z

A check of the VIs which rank better than or as well as the ML model does not show consistency across clusters. ML seems to be the parsimonious way of dealing with large-scale drought assessments.
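For reference, the kind of consistency check described here could be sketched as follows. This is only an illustration under stated assumptions, not the code used in the vignettes: the function names are hypothetical, the index names (NDVI, EVI, NIRv) are placeholders, and the scores are made-up numbers standing in for whatever skill metric relates each index to fLUE.

```python
from typing import Dict, List


def indices_matching_ml(
    scores: Dict[str, Dict[str, float]], ml_key: str = "ML"
) -> Dict[str, List[str]]:
    """For each cluster, list the indices scoring at least as well as the ML model."""
    winners = {}
    for cluster, cluster_scores in scores.items():
        ml_score = cluster_scores[ml_key]
        winners[cluster] = sorted(
            name
            for name, value in cluster_scores.items()
            if name != ml_key and value >= ml_score
        )
    return winners


def consistent_winners(winners: Dict[str, List[str]]) -> List[str]:
    """Return the indices that match or beat ML in every single cluster."""
    sets = [set(names) for names in winners.values()]
    return sorted(set.intersection(*sets)) if sets else []


# Made-up scores (higher is better); NOT results from this analysis.
scores = {
    "cluster_1": {"ML": 0.60, "NDVI": 0.62, "EVI": 0.55, "NIRv": 0.50},
    "cluster_2": {"ML": 0.58, "NDVI": 0.40, "EVI": 0.59, "NIRv": 0.45},
    "cluster_3": {"ML": 0.65, "NDVI": 0.50, "EVI": 0.52, "NIRv": 0.66},
}

winners = indices_matching_ml(scores)
consistent = consistent_winners(winners)
print(winners)
print(consistent)
```

In this toy example a different index edges out ML in each cluster, so the intersection is empty, which is the pattern described above: individual exceptions exist, but no single VI holds up everywhere.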

khufkens · 2024-02-28T13:32:24Z

@stineb

No VI consistently tops the ML indices.
https://geco-bern.github.io/index_based_drought_monitoring/articles/model_evaluation_VI.html

One thing to consider is to further simplify the model to avoid overfitting. Currently only a limited set of hyperparameters is tuned, but this can be expanded to prune the trees severely (to decrease model size).

khufkens · 2024-02-28T13:34:59Z

By and large I would consider this done, conceptually. There is a need to clean things up and tighten things up a bit more: write code for nicer graphs and output all relevant stats. However, this should be a solid basis for a small analysis / manuscript (addressing the most pressing methodological issues through cross-validation and independent datasets, and illustrating impact through a scaling exercise).
