
Add sum_across_studies to kda #859

Merged: 18 commits into optimize_kda on Jan 9, 2024

Conversation

@adelavega (Member) commented Jan 9, 2024

sum_across_studies returns an array in which study counts are summed sequentially into a single dense matrix, rather than compiling an exhaustive matrix containing all study-level count data.

This is much more efficient in both memory and compute, enabling MKDA Chi-squared with Neurosynth as the reference dataset using less than 1 GB of memory.

@jdkent (Member) commented Jan 9, 2024

Oh shoot, do we actually not need the individual matrices for FWE correction? Looking over the code it seems like we don't, and the first thing done for MKDAChi2 is summing across all the studies.

@adelavega (Member, Author)

Alright @JulioAPeraza @jdkent @tsalo, this is really exciting.

For MKDAChi2 we don't actually need study-level modeled activation. This realization means we can save a massive amount of memory and substantial time (a 2-3x speedup over an already optimized PR I was working on).

When looping over studies, instead of saving every study-level (M)KDA modeled activation (MA) map, we can sum them in place into a single MNI-sized dense volume.

This means there is no need to run np.unique, which is slow, and memory usage is O(1) with respect to the number of studies.

I initially enabled this for all MKDA kernels, but it seems to result in incorrect statistics for MKDADensity. We could potentially implement it there as well, but we need to look at the stats more carefully; we might be able to save intermediary statistics to get correct results.
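
To make this concrete, here is a minimal sketch of the accumulation scheme (illustrative only, not NiMARE's actual implementation; the function name sum_ma_maps and the per-study input format are made up for this example):

import numpy as np

def sum_ma_maps(per_study_ijk, volume_shape):
    # Accumulate per-study MKDA counts into one dense, MNI-sized volume.
    # per_study_ijk: list of (n_voxels_i, 3) integer arrays, one per study,
    # giving the voxel indices covered by that study's spheres (toy format).
    summed = np.zeros(volume_shape, dtype=np.int32)
    for ijk in per_study_ijk:
        # Binary MA map for this study: a voxel counts at most once per study.
        study_map = np.zeros(volume_shape, dtype=np.int32)
        study_map[ijk[:, 0], ijk[:, 1], ijk[:, 2]] = 1
        # Add in place; only one study-sized scratch array is ever alive,
        # so memory stays O(1) with respect to the number of studies.
        summed += study_map
    return summed

Because the per-study maps are never kept around, the voxel-wise study counts needed by MKDAChi2 come straight out of the running sum.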

@adelavega (Member, Author)

On a measly MacBook Air with 8 GB of RAM, I was able to run MKDAChi2 comparing the example pain dataset against all of Neurosynth in 32 seconds, using only a little over 1 GB of RAM.

[Screenshot: timing output from the run described above, Jan 8, 2024]

Someone check my logic in case I'm missing something!

nimare/meta/utils.py (outdated review thread, resolved)
@adelavega (Member, Author)

I was able to simplify the code further by just returning a dense matrix. Saves ~2 more seconds.

I'm open to other stylistic suggestions though.

all_values += study_values

# Set voxels outside the mask to zero.
all_values[~mask_data] = 0
Review comment (Member):

and this step functionally replaces this:

sphere_idx_inside_mask = np.where(mask_data[tuple(all_spheres.T)])[0]
all_spheres = all_spheres[sphere_idx_inside_mask, :]
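
A quick way to convince yourself the two approaches agree, on toy data (illustrative names only; this is not NiMARE code):

import numpy as np

rng = np.random.default_rng(0)
shape = (10, 10, 10)
mask_data = rng.random(shape) > 0.3                # toy boolean brain mask
all_spheres = rng.integers(0, 10, size=(500, 3))   # toy sphere voxel coordinates

# Old approach: drop out-of-mask coordinates before accumulating.
inside = np.where(mask_data[tuple(all_spheres.T)])[0]
values_old = np.zeros(shape, dtype=np.int32)
np.add.at(values_old, tuple(all_spheres[inside].T), 1)

# New approach: accumulate everything, then zero voxels outside the mask.
values_new = np.zeros(shape, dtype=np.int32)
np.add.at(values_new, tuple(all_spheres.T), 1)
values_new[~mask_data] = 0

assert np.array_equal(values_old, values_new)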


del ma_maps1
n_selected = self.dataset1.coordinates["id"].unique().shape[0]
Review comment (Member):

We should probably have _collect_ma_maps return the n_selected value: all of the coordinates for a particular experiment could fall outside the mask, in which case that experiment would not actually be included in the count.
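
As a hypothetical illustration of that concern (the in_mask column and the tiny table below are made up for this example, not NiMARE data structures):

import pandas as pd

# Two experiments; study-2's only focus falls outside the mask, so it
# contributes no MA map, yet it still has a row in the coordinates table.
coords = pd.DataFrame(
    {
        "id": ["study-1", "study-1", "study-2"],
        "in_mask": [True, True, False],
    }
)

n_from_ids = coords["id"].unique().shape[0]                          # 2
n_actually_selected = coords.loc[coords["in_mask"], "id"].nunique()  # 1
print(n_from_ids, n_actually_selected)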

@jdkent (Member) left a review comment:

This line:

https://github.com/neurostuff/NiMARE/pull/859/files#diff-953de5154c5b38a0a853444e8525f6d8edc0eef147cb69c54130eb389fc92cf5L187

needs an explicit dtype added to the image creation:

img = nib.Nifti1Image(kernel_data, mask.affine, dtype=kernel_data.dtype)

This should fix some of the failing tests.
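
For reference, a self-contained example of the suggested call, assuming a nibabel version that accepts the dtype keyword (the toy array and affine below stand in for NiMARE's kernel output and mask.affine):

import numpy as np
import nibabel as nib

kernel_data = np.zeros((4, 4, 4), dtype=np.int32)  # stand-in for the kernel output
affine = np.eye(4)                                 # stand-in for mask.affine

# Passing dtype explicitly keeps the NIfTI header dtype in sync with the array.
img = nib.Nifti1Image(kernel_data, affine, dtype=kernel_data.dtype)
print(img.get_data_dtype())  # int32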

@adelavega (Member, Author)

Your changes look good to me, James.

@jdkent merged commit a4f434e into optimize_kda on Jan 9, 2024 (15 of 17 checks passed)
@jdkent (Member) commented Jan 9, 2024

Fantastic work! Merging it in.

jdkent added a commit that referenced this pull request Jan 10, 2024
* Simplify stacking

* Fix typo

* Remove vstack

* Fix stacking

* Remove @Profile

* Use jit for _convolve_sphere

* switch arrays to int32 where possible

* reduce memory consumption

* fix style

* Simplify numba

* Run black

* Mask outside space in numba

* Add indicator later

* Set value to input

* Add `sum_across_studies` to kda (#859)

* Resolve merge

* Add sum across studies

* Remove @Profile

* Only enable sum across studies for MKDA Chi Squared

* Run black

* Return dense for MKDAChiSquared

* Update nimare/meta/utils.py

Co-authored-by: James Kent <[email protected]>

* Run black

* Update nimare/meta/utils.py

Co-authored-by: James Kent <[email protected]>

* Format suggestion

* change how number of studies and active voxels are found

* add explicit dtype when creating image

* make the comment clearer

* add the kernel argument to the dictionary

* bump minimum required versions

* alternative way to sum across studies in a general way

* fix arguments and style

* pin minimum seaborn version

---------

Co-authored-by: Alejandro de la Vega <[email protected]>
Co-authored-by: James Kent <[email protected]>

* manage minimum dependencies

* Index within numba

* Only allow sum_overlap if not sum_across_studies

* Add unique index back

* Remove @Profile

* run black

* ensure the methods for creating the kernel are equivalent

---------

Co-authored-by: James Kent <[email protected]>
Co-authored-by: Alejandro de la Vega <[email protected]>