Consider timeseries for building the surrogate model #108

Open

SorooshMani-NOAA opened this issue Aug 10, 2023 · 6 comments
SorooshMani-NOAA commented Aug 10, 2023

Currently, only the maximum water elevation is used to train the surrogate model. We'd like to consider the whole timeseries to see how it affects the surrogate output.
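For concreteness, a rough sketch of the data shapes involved (file, variable, and dimension names here are illustrative, not the actual ensemble output):

import xarray as xr

# hypothetical ensemble output; names are illustrative
ds = xr.open_dataset('ensemble_output.nc')

# current approach: train on one value per node per run
max_elev = ds['elevation'].max(dim='time')   # dims: (run, node)

# proposed: keep the full timeseries as training data
elev_timeseries = ds['elevation']            # dims: (run, time, node)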

Tasks:

@saeed-moghimi-noaa
@WPringle
@SorooshMani-NOAA

SorooshMani-NOAA self-assigned this Aug 14, 2023

@SorooshMani-NOAA (Collaborator, Author) commented:

@FariborzDaneshvar-NOAA since you started exploring this item, can you please either link an existing ticket or use this ticket to document your progress and impediments (like #128)?

@FariborzDaneshvar-NOAA (Collaborator) commented:

With the stacking suggestion in #129 (comment), I was able to execute the subset_dataset() function with stacked time & node! But the conversion of the KL surrogate model to the overall surrogate for each node (the surrogate_from_karhunen_loeve() function) failed with a MemoryError.

One suggestion was to use a chunk of time steps; I will post updates on that here.
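For example, a rough sketch of the chunking idea (chunk size illustrative), iterating over blocks of time steps so each KL decomposition and surrogate fit sees a smaller array:

n_steps = elev_timeseries.sizes['time']
chunk_size = 100  # illustrative

for start in range(0, n_steps, chunk_size):
    time_chunk = elev_timeseries.isel(time=slice(start, start + chunk_size))
    # stack time & node and build the surrogate per block, as in #129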

@FariborzDaneshvar-NOAA (Collaborator) commented:

Building the surrogate model for the first 100 time steps:

# select the first 100 time steps (Aug 30 13:00 through Sep 3 16:00)
time_chunk = elev_timeseries.sel(
    time=slice("2018-08-30T13:00:00.000000000", "2018-09-03T16:00:00.000000000")
)
# flatten (time, node) into one dimension, then expose it as 'node' so
# downstream functions that expect a single node dimension still work
time_chunk_stack = time_chunk.rename(
    nSCHISM_hgrid_node='node'
).stack(
    stacked=('time', 'node'), create_index=False
).swap_dims(
    stacked='node'
)
subset = subset_dataset(ds=time_chunk_stack, ...)

It went through, and here are the plots:

[Plots: KL eigenvalues; KL fit; KL-surrogate fit; validation boxplots; sensitivities; model vs. surrogate validation (validation_vortex_4_variable_korobov_1)]

These results look weird, and to me the KL fit didn't work correctly! One possibility is that the first 100 time steps used here are long before landfall, so there may be minimal variation among them. It also reveals the plotting-function issue I mentioned earlier in #132.

Results aside, I couldn't make the percentile and probability plots due to MemoryError: Unable to allocate 1.15 TiB for an array with shape (15772912, 10000) and data type float64
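For reference, the requested allocation is exactly what a dense float64 array of that shape requires:

# 15,772,912 rows x 10,000 columns x 8 bytes per float64
15_772_912 * 10_000 * 8 / 2**40   # ≈ 1.15 TiB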

@FariborzDaneshvar-NOAA (Collaborator) commented:

I also tried opening subset.nc with dask (chunks='auto'), but it didn't change the outcome: I still get the same MemoryError message for the percentile and probability plots.
But interestingly, the along-track sensitivity plots were different (see below)! @SorooshMani-NOAA how might that be possible?

[Image: along-track sensitivity plots]
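For reference, this is roughly what the dask-backed open looks like; note that any step that converts the lazy array to a plain numpy array still materializes everything and hits the same MemoryError:

import xarray as xr

# lazy, chunked load; nothing is read into memory yet
subset = xr.open_dataset('subset.nc', chunks='auto')

# but a direct numpy conversion, e.g. np.asarray(subset['elevation']),
# still computes the full array and can exhaust memory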

SorooshMani-NOAA commented Jan 19, 2024

@FariborzDaneshvar-NOAA about the memory issue: the problem is that the function you showed me the other day calls the numpy function directly, which (as far as I understand) pulls all values into memory before executing. So you also need to change the function where the numpy method is called.
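One way to keep the computation lazy would be wrapping the numpy call with xarray.apply_ufunc so it runs chunk-by-chunk; a minimal sketch, assuming the offending routine can be applied per chunk (expand_chunk is a hypothetical stand-in, not the actual expansion function):

import numpy as np
import xarray as xr

def expand_chunk(values):
    # hypothetical stand-in for the numpy routine in the surrogate expansion
    return np.square(values)

result = xr.apply_ufunc(
    expand_chunk,
    subset['elevation'],     # dask-backed DataArray
    dask='parallelized',     # apply per chunk instead of loading all values
    output_dtypes=[float],
)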

I'm not sure what is happening in the plots. Are you sure the mapping back to physical space is done correctly? We have a combined time-node dimension where neither times nor nodes are necessarily aligned, so we have to be very careful when reshaping.
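On the reshaping point, a minimal sketch of the round trip, assuming surrogate_values is the flat per-stacked-node output and its order follows the stack(stacked=('time', 'node')) call above (time outermost, node varying fastest):

import numpy as np

n_time = time_chunk.sizes['time']
n_node = time_chunk_stack.sizes['node'] // n_time

# C-order reshape recovers (time, node) only because 'time' was listed
# first in the stack() call; a different stacking order would transpose this
surrogate_2d = np.reshape(surrogate_values, (n_time, n_node))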

I'm not sure if the plots we get are actually meaningful!

@FariborzDaneshvar-NOAA (Collaborator) commented:

@SorooshMani-NOAA thanks for your comment; you brought up a good point about the results! I didn't reshape back to time/node, which might explain these plots, but it's not clear to me at which step the reshaping should happen.

This new memory issue is different from what I mentioned before (the numpy function in the surrogate expansion, when I used all time steps), but you are right, it should be addressed separately.
