JP-3708: Adding the CRMAG Array to the Optional Results Product #304

kmacdonald-stsci · 2024-10-10T12:20:41Z

This PR fixes a crash that occurs when ramp fitting runs with multiprocessing and the C extension while saving the optional results product. The CRMAG array was not implemented in the C extension, so the extension returned None for it. When the data slices were reassembled, an exception was thrown because there was no CRMAG array to reassemble.

The CRMAG array is a 4-D array with dimensions (nints, max_num_crs, nrows, ncols). Each pixel ramp in each integration contains some number of cosmic ray (CR) jumps, possibly zero. The dimension max_num_crs is the maximum number of jumps found in any single ramp over all integrations and all pixels. For ramps with fewer than the maximum number of jumps, the trailing elements for that pixel ramp are 0. For example, if the maximum number of jumps found was 5 but a ramp has only two jumps, the first two elements of the CRMAG array for that integration, row, and column are non-zero, while the last three are 0. Each non-zero element of the CRMAG array is the magnitude of a jump, which is the difference between the two groups where the jump was detected.
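The zero-padded layout described above can be sketched as follows. This is only an illustration under stated assumptions: a flat, row-major float buffer indexed as (integration, CR index, row, column). The helper names crmag_index, crmag_alloc, and crmag_fill_pixel are hypothetical and are not taken from the stcal source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: flat offset of element (integ, cr, row, col) in a
 * row-major array with dimensions (nints, max_num_crs, nrows, ncols). */
static size_t crmag_index(size_t integ, size_t cr, size_t row, size_t col,
                          size_t max_num_crs, size_t nrows, size_t ncols)
{
    return ((integ * max_num_crs + cr) * nrows + row) * ncols + col;
}

/* Hypothetical helper: allocate the whole array initialized to zeros, so
 * ramps with fewer than max_num_crs jumps are padded automatically.
 * The float element type is an assumption. */
static float *crmag_alloc(size_t nints, size_t max_num_crs,
                          size_t nrows, size_t ncols)
{
    return calloc(nints * max_num_crs * nrows * ncols, sizeof(float));
}

/* Hypothetical helper: copy the ncrs magnitudes found for one pixel ramp
 * into the leading CR slots; the trailing slots keep their initial 0. */
static void crmag_fill_pixel(float *crmag, const float *mags, size_t ncrs,
                             size_t integ, size_t row, size_t col,
                             size_t max_num_crs, size_t nrows, size_t ncols)
{
    assert(ncrs <= max_num_crs);
    for (size_t k = 0; k < ncrs; ++k) {
        crmag[crmag_index(integ, k, row, col, max_num_crs, nrows, ncols)] = mags[k];
    }
}
```

With max_num_crs = 5 and a ramp containing two jumps, crmag_fill_pixel writes CR slots 0 and 1 for that pixel and leaves slots 2 through 4 at 0.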

The CRMAG is now implemented in the C extension and functions properly with multiprocessing.

Tasks

- update or add relevant tests
- update relevant docstrings and / or docs/ page
- Does this PR change any API used downstream? (if not, label with no-changelog-entry-needed)
- write news fragment(s) in changes/: echo "changed something" > changes/<PR#>.<changetype>.rst (see below for change types)
- run regression tests with this branch installed ("git+https://github.com/<fork>/stcal@<branch>")
  - jwst regression test
  - romancal regression test

news fragment change types...

changes/<PR#>.apichange.rst: change to public API
changes/<PR#>.bugfix.rst: fixes an issue
changes/<PR#>.general.rst: infrastructure or miscellaneous change

kmacdonald-stsci · 2024-10-10T12:20:58Z

A regression test for this is here:

https://plwishmaster.stsci.edu:8081/job/RT/job/JWST-Developers-Pull-Requests/1780

codecov · 2024-10-10T12:28:50Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.34%. Comparing base (60bd3b8) to head (38bad7d).
Report is 6 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #304      +/-   ##
==========================================
+ Coverage   86.21%   86.34%   +0.13%     
==========================================
  Files          47       49       +2     
  Lines        8812     8896      +84     
==========================================
+ Hits         7597     7681      +84     
  Misses       1215     1215

jemorrison · 2024-10-23T18:31:02Z

src/stcal/ramp_fitting/src/slope_fitter.c

+        crs->tail->flink = new_node;
+        crs->tail = new_node;
+        crs->size++;
+    }


@kmacdonald-stsci I have not used linked lists in C before, so I need a quick overview. I googled it; are these doubly linked lists? I could use a little more detail so I can figure out what is going on.
Is 'head' the pointer for the VERY FIRST CR segment, and 'tail' the location of the very last one? So are the segments' beginnings and ends stored, or just the first and last locations?
In cr_list_add on line 1248, why is tail = new_node? I did not think head and tail would equal each other. Is that just for the first CR found?
Also, I am unclear on what flink is supposed to represent.

kmacdonald-stsci · 2024-10-23

The linked list I am using for this is a singly linked list. There are a few different ways to implement one. My implementation uses two structures, the linked list container and the nodes being listed, with only the forward link defined in each node. The forward link is the flink member, which points to the next node in the list. When flink is NULL, that node is the tail of the list:

stcal/src/stcal/ramp_fitting/src/slope_fitter.c

Line 283 in 0f3a23a

struct cr_node {

The linked list data structure contains useful metadata for the list: the head, the tail, and the total number of nodes in the list. The node counts are used to determine the second dimension of the 4-D CRMAG array, which has dimensions (nints, max_num_crs, nrows, ncols). The tail member is a convenience pointer for going directly to the tail to add the next node, rather than having to traverse the list each time a new node is added.

stcal/src/stcal/ramp_fitting/src/slope_fitter.c

Line 289 in 0f3a23a

struct cr_list {

So the linked list with n CRs detected looks like this:

cr_list->head = cr_node->flink = pointer to the next node
                cr_node->flink = pointer to the next node
                cr_node->flink = pointer to the next node
                cr_node->flink = pointer to the next node
                ...
cr_list->tail = cr_node->flink = NULL
cr_list->size = n
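As a rough sketch of that picture, the two structures and a head-to-tail traversal might look like the code below. The member names flink, crmag, head, tail, and size come from this discussion; the member types and the helper cr_list_count are assumptions made for illustration and may differ from the actual definitions in slope_fitter.c.

```c
#include <assert.h>
#include <stddef.h>

/* One detected CR; member types are assumed for this sketch. */
struct cr_node {
    struct cr_node *flink; /* forward link to the next node; NULL at the tail */
    float crmag;           /* magnitude of this CR jump */
};

/* Container holding the list metadata. */
struct cr_list {
    struct cr_node *head; /* first CR found in the ramp */
    struct cr_node *tail; /* last CR found so far */
    size_t size;          /* number of nodes in the list */
};

/* Hypothetical helper: follow flink from the head until NULL is reached,
 * counting the nodes visited. For a consistent list this equals size. */
static size_t cr_list_count(const struct cr_list *crs)
{
    assert(crs != NULL);
    size_t n = 0;
    for (const struct cr_node *cur = crs->head; cur != NULL; cur = cur->flink) {
        ++n;
    }
    return n;
}
```

Storing size in the container means the node count is available without a traversal like this one; the walk is shown only to make the role of flink concrete.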

So, if some ramp looks like this:

data = [1., 2., 13., 14., 15., 16., 37., 38., 39., 40., 41., 102., 103., 104., 105.]
groupdq = [GOOD, GOOD, JUMP_DET, GOOD, GOOD, GOOD, JUMP_DET, GOOD, GOOD, GOOD, GOOD, JUMP_DET, GOOD, GOOD, GOOD]

Then the list of CR magnitudes for this ramp will look like this:

cr_list->head = {crmag = 11., flink = pointer to next node}
                {crmag = 21., flink = pointer to next node}
cr_list->tail = {crmag = 61., flink = NULL}
cr_list->size = 3

So, yes, cr_list->head is the very first CR magnitude encountered in the ramp and is the difference between that group's data and the previous group's data. The cr_list->tail is the last CR magnitude computed so far, so if another one is found, a new node is allocated and added as the new tail. The result is an ordered linked list starting with the first detected CR magnitude and ending with the last detected CR magnitude.
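The arithmetic behind those magnitudes can be sketched on its own, without the linked list. The helper compute_crmags below is hypothetical, and the flag values in the enum are placeholders for illustration; the real DQ flag definitions are not shown in this thread. The sketch also assumes the first group is skipped because it has no previous group to difference against.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder flag values for this sketch only. */
enum { GOOD = 0, JUMP_DET = 4 };

/* Hypothetical helper: for each group flagged JUMP_DET, store the
 * difference between that group and the previous group. Returns the
 * number of magnitudes written, which is at most max_mags. */
static size_t compute_crmags(const float *data, const int *groupdq,
                             size_t ngroups, float *mags, size_t max_mags)
{
    assert(data != NULL && groupdq != NULL && mags != NULL);
    size_t n = 0;
    /* Start at group 1: group 0 has no previous group. */
    for (size_t g = 1; g < ngroups && n < max_mags; ++g) {
        if (groupdq[g] & JUMP_DET) {
            mags[n++] = data[g] - data[g - 1];
        }
    }
    return n;
}
```

For the example ramp, the flagged groups give 13 - 2 = 11, 37 - 16 = 21, and 102 - 41 = 61, matching the three nodes listed above.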

At line

stcal/src/stcal/ramp_fitting/src/slope_fitter.c

Line 1246 in 0f3a23a

if (0==crs->size) {

This conditional checks whether the current list is empty; if it is, the size will be 0. When a CR is detected, the CR magnitude is computed and a new cr_node is allocated to store that value. The cr_node is then added to the empty list, so the head and tail both point to that node and the size is set to 1. This list will look like this:

cr_list->head = cr_list->tail = cr_node->flink = NULL
cr_list->size = 1

The "else" branch of this conditional means the list is not empty, so the newly computed CR magnitude is placed at the tail of the list and the size is incremented by 1.
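Putting the two branches together, an append along these lines could be sketched as shown below. This is not the actual cr_list_add from slope_fitter.c: the function names, member types, and return convention are assumptions, and only the three statements in the non-empty branch are taken from the diff quoted in this review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Member types are assumed for this sketch. */
struct cr_node { struct cr_node *flink; float crmag; };
struct cr_list { struct cr_node *head; struct cr_node *tail; size_t size; };

/* Sketch of an append; returns 0 on success, 1 on allocation failure. */
static int cr_list_add_sketch(struct cr_list *crs, float crmag)
{
    assert(crs != NULL);
    struct cr_node *new_node = calloc(1, sizeof(*new_node));
    if (NULL == new_node) {
        return 1;
    }
    new_node->crmag = crmag;
    new_node->flink = NULL; /* a new node is always the tail */

    if (0 == crs->size) {
        /* Empty list: the single node is both head and tail. */
        crs->head = new_node;
        crs->tail = new_node;
        crs->size = 1;
    } else {
        /* Non-empty list: link after the current tail, then advance it. */
        crs->tail->flink = new_node;
        crs->tail = new_node;
        crs->size++;
    }
    return 0;
}

/* Sketch of clean-up: free every node and reset the container. */
static void cr_list_free_sketch(struct cr_list *crs)
{
    struct cr_node *cur = crs->head;
    while (cur != NULL) {
        struct cr_node *next = cur->flink;
        free(cur);
        cur = next;
    }
    crs->head = NULL;
    crs->tail = NULL;
    crs->size = 0;
}
```

Keeping the tail pointer makes each append constant time, at the cost of one extra pointer per list; without it, every append would have to walk the list from the head.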

jemorrison

I have gone through the code a few times. To the best of my ability, it looks correct to me and frees the necessary memory when it is done. The comments you gave me were essential for me to understand the flow. It might be helpful if you could add some of those details to the code, so that in the future, when we need to look at the code, it is easier to understand what is going on.

kmacdonald-stsci · 2024-10-24T23:39:55Z

> I have gone through the code a few times. To the best of my ability, it looks correct to me and frees the necessary memory when it is done. The comments you gave me were essential for me to understand the flow. It might be helpful if you could add some of those details to the code, so that in the future, when we need to look at the code, it is easier to understand what is going on.

I added comments to the linked lists used to more thoroughly describe the implementations. Please take a look at them to let me know if they help clarify how they are implemented.

melanieclarke · 2024-10-29T15:44:49Z

I'm testing these changes on some sample data I have on disk. This command, run without multiprocessing, completes in seconds:
strun ramp_fit jw01247667001_02101_00001_mirifulong_jump.fits --save_opt=True --opt_name=opt_serial.fits

This one, with multiprocessing on, I killed after 5 minutes because it didn't look like it was ever going to complete:
strun ramp_fit jw01247667001_02101_00001_mirifulong_jump.fits --save_opt=True --opt_name=opt_parallel.fits --maximum_cores=half
After killing it, it reports 20 leaked semaphores.

This one, with multiprocessing on but optional output off, completes in seconds:
strun ramp_fit jw01247667001_02101_00001_mirifulong_jump.fits --maximum_cores=half
After completion, it reports 10 leaked semaphores.

Can you please double check the interaction between multiprocessing and optional output results? I'm wondering if there's a deadlock somewhere.

kmacdonald-stsci · 2024-10-31T23:37:19Z

Closing to fix multiprocessing bug in the optional results product.

kmacdonald-stsci requested review from melanieclarke, tapastro, penaguerrero and stscirij October 10, 2024 12:20

kmacdonald-stsci requested a review from a team as a code owner October 10, 2024 12:20

github-actions bot added ramp_fitting testing labels Oct 10, 2024

jemorrison self-requested a review October 23, 2024 18:08

jemorrison reviewed Oct 23, 2024

jemorrison approved these changes Oct 24, 2024

kmacdonald-stsci added 6 commits October 24, 2024 19:19

Added initial CR magnitude computation.

935dcab

Creating the crmag portion of the optional results product in the ramp fitting C-extension.

7ec53ff

Creating array with initialization to zeros.

6f28c45

Adding test for the crmag element in the optional results product.

7997ec6

Adding change log fragment.

4ab1743

Adding comment to the CRMAG test.

7140c7d

kmacdonald-stsci force-pushed the jp_3708 branch from 0f3a23a to 7140c7d Compare October 24, 2024 23:19

Adding comments to clarify the implementation of the linked lists.

38bad7d

kmacdonald-stsci closed this Oct 31, 2024

melanieclarke mentioned this pull request Nov 15, 2024

JP-3708, JP-3771, JP-3791 #318

Merged

7 tasks
