JP-3708: Adding the CRMAG Array to the Optional Results Product #304
A regression test for this is here: https://plwishmaster.stsci.edu:8081/job/RT/job/JWST-Developers-Pull-Requests/1780
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #304 +/- ##
==========================================
+ Coverage 86.21% 86.34% +0.13%
==========================================
Files 47 49 +2
Lines 8812 8896 +84
==========================================
+ Hits 7597 7681 +84
Misses 1215 1215
crs->tail->flink = new_node;
crs->tail = new_node;
crs->size++;
}
@kmacdonald-stsci I have not used linked lists in C before and need a quick overview. I googled it; are these doubly linked lists? I could use a little more detail so I can figure out what is going on.
Is 'head' the pointer to the VERY FIRST CR segment, and 'tail' the location of the very last one? So are the beginnings and ends of the segments stored, or just the first and last locations?
In cr_list_add on line 1248, why is tail = new_node? I did not think head and tail would equal each other. Is that just for the first CR found?
Also, I am unclear on what flink is supposed to represent.
The linked list I am using for this is a singly linked list. There are a few different ways to implement this. My implementation uses two structures, the linked list container and the nodes being listed, with only the forward link being defined in each node. The forward link is the flink parameter, which points to the next node in the list. When flink is NULL, that node is the tail of the list:
struct cr_node { |
The linked list data structure contains useful metadata for the list: the head, the tail, and the total number of nodes in each list. The node count is used to create the second dimension of the 4-D CRMAG array, which has dimensions (nints, max_num_crs, nrows, ncols). The tail parameter in the linked list is a convenience variable to go directly to the tail and add the next node, rather than having to traverse the list to the tail each time a new node is added.
struct cr_list { |
So the linked list with n CRs detected looks like this:
cr_list->head = cr_node->flink = pointer to the next node
cr_node->flink = pointer to the next node
cr_node->flink = pointer to the next node
cr_node->flink = pointer to the next node
...
cr_list->tail = cr_node->flink = NULL
cr_list->size = n
So, if some ramp looks like this:
data = [1., 2., 13., 14., 15., 16., 37., 38., 39., 40., 41., 102., 103., 104., 105.]
groupdq = [GOOD, GOOD, JUMP_DET, GOOD, GOOD, GOOD, JUMP_DET, GOOD, GOOD, GOOD, GOOD, JUMP_DET, GOOD, GOOD, GOOD]
Then the list of CR magnitudes for this ramp will look like this:
cr_list->head = {crmag = 11., flink = pointer to next node}
{crmag = 21., flink = pointer to next node}
cr_list->tail = {crmag = 61., flink = NULL}
cr_list->size = 3
So, yes, cr_list->head is the very first CR magnitude encountered in the ramp and is the difference between that group's data and the previous group's data. The cr_list->tail is the last known CR magnitude computed, so if another one is found, a new node is allocated and added as the tail. The result is an ordered linked list starting with the first detected CR magnitude and ending with the last detected CR magnitude.
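To make the arithmetic in the ramp example concrete, here is a minimal sketch of computing CR magnitudes from a ramp. It is not the code in slope_fitter.c. The function name cr_magnitudes, the flag values GOOD = 0 and JUMP_DET = 4, the flat output buffer, and the choice to skip group 0 (it has no previous group) are all assumptions made for illustration.

```c
/* Assumed flag values for this sketch only. */
enum { GOOD = 0, JUMP_DET = 4 };

/*
 * Hypothetical helper: scan one ramp and record a magnitude for each group
 * flagged JUMP_DET. The magnitude is the group's data minus the previous
 * group's data. At most max_crs values are written to crmag.
 * Returns the number of magnitudes written.
 */
static int cr_magnitudes(const float *data, const int *groupdq, int ngroups,
                         float *crmag, int max_crs)
{
    int ncrs = 0;

    /* Start at 1: group 0 has no previous group to difference against. */
    for (int k = 1; k < ngroups; ++k) {
        if ((groupdq[k] & JUMP_DET) && ncrs < max_crs) {
            crmag[ncrs++] = data[k] - data[k - 1];
        }
    }
    return ncrs;
}
```

Called on the data and groupdq arrays from the example, this yields three magnitudes: 13 - 2 = 11, 37 - 16 = 21, and 102 - 41 = 61, in the order the jumps occur in the ramp.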
At line 1246 of stcal/src/stcal/ramp_fitting/src/slope_fitter.c (commit 0f3a23a):
if (0==crs->size) {
This conditional checks whether the current list is empty. If it's empty, the size will be 0. When a CR is detected, the CR magnitude is computed and a new cr_node is allocated to store that value. The cr_node is then added to the empty list, resulting in a list of size 1, so the head and tail are the same node and the size is set to 1. This list will look like this:
cr_list->head = cr_list->tail = cr_node->flink = NULL
cr_list->size = 1
The "else" in this conditional means the list is not empty, so the newly computed CR magnitude is put at the tail of the list and the size is incremented by 1.
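Putting the two branches together, a minimal sketch of the list and its add function might look like the following. The names cr_node, cr_list, cr_list_add, head, tail, size, flink, and crmag come from the discussion above; the field types, the return convention, and the cr_list_clear helper are assumptions for illustration and may not match the actual code in slope_fitter.c.

```c
#include <stdlib.h>

/* One detected CR magnitude; flink is the forward link (NULL at the tail). */
struct cr_node {
    struct cr_node *flink;
    float crmag;            /* assumed type */
};

/* List container holding the head, the tail, and the node count. */
struct cr_list {
    struct cr_node *head;
    struct cr_node *tail;
    int size;               /* assumed type */
};

/* Append a magnitude at the tail. Returns 0 on success, 1 if allocation fails. */
static int cr_list_add(struct cr_list *crs, float crmag)
{
    struct cr_node *new_node = calloc(1, sizeof(*new_node));
    if (new_node == NULL) {
        return 1;
    }
    new_node->crmag = crmag;
    new_node->flink = NULL;     /* the new node is always the last one */

    if (0 == crs->size) {
        /* Empty list: the single node is both head and tail. */
        crs->head = new_node;
        crs->tail = new_node;
        crs->size = 1;
    } else {
        /* Non-empty list: link after the current tail, then move the tail. */
        crs->tail->flink = new_node;
        crs->tail = new_node;
        crs->size++;
    }
    return 0;
}

/* Hypothetical cleanup helper: free every node and reset the container. */
static void cr_list_clear(struct cr_list *crs)
{
    struct cr_node *cur = crs->head;
    while (cur != NULL) {
        struct cr_node *next = cur->flink;
        free(cur);
        cur = next;
    }
    crs->head = NULL;
    crs->tail = NULL;
    crs->size = 0;
}
```

Keeping the tail pointer in the container makes each append constant time, at the cost of one extra pointer per list; without it, every add would have to walk the list from the head.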
I have gone through the code a few times. To the best of my ability, it looks correct to me and frees the necessary memory when it is done. The comments you gave me were essential for me to understand the flow. If you can somehow add some of those details to the code, it would be helpful: in the future, when we need to look at the code, it will be easier to understand what is going on.
…p fitting C-extension.
0f3a23a to 7140c7d
I added comments to the linked lists to describe the implementations more thoroughly. Please take a look and let me know if they help clarify how the lists are implemented.
I'm testing these changes on some sample data I have on disk. This command, running without multiprocessing, completes in seconds:
This one, with multiprocessing on, I killed after 5 minutes because it didn't look like it was ever going to complete:
This one, with multiprocessing on but optional output off, completes in seconds:
Can you please double-check the interaction between multiprocessing and optional output results? I'm wondering if there's a deadlock somewhere.
Closing to fix multiprocessing bug in the optional results product.
Resolves JP-3708
This PR addresses a crash that occurs when running multiprocessing for ramp fitting with the C extension while choosing to save the optional results product. The CRMAG array was not implemented in the C extension, so the C extension returned a NoneType; when the data slices were reassembled, an exception was thrown because there was no CRMAG to reassemble.
The CRMAG array is a 4-D array with dimensions (nints, max_num_crs, nrows, ncols). For each integration ramp there is some number of cosmic ray jumps, including 0. The dimension max_num_crs is the maximum number of jumps found over all integrations and all pixels. For ramps that have fewer than the maximum number of jumps, the last elements in that pixel ramp are 0. For example, if the maximum number of jumps found was 5, but a ramp has only two jumps, the first two elements of the CRMAG array for that integration, row, and column are non-zero, while the last three are 0. The non-zero elements are the magnitudes of the jumps, each of which is the difference between the two groups where a jump was detected.
The CRMAG is now implemented in the C extension and functions properly with multiprocessing.
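As an illustration of the zero padding described above, here is a minimal sketch of writing one ramp's magnitudes into a flat CRMAG buffer. It assumes a C-contiguous (row-major) layout for the (nints, max_num_crs, nrows, ncols) array, which the PR text does not state; the helper names crmag_index and crmag_store are hypothetical and are not taken from the C extension.

```c
#include <stddef.h>

/* Flat offset of element (integ, cr, row, col), assuming row-major order. */
static size_t crmag_index(size_t integ, size_t cr, size_t row, size_t col,
                          size_t max_num_crs, size_t nrows, size_t ncols)
{
    return ((integ * max_num_crs + cr) * nrows + row) * ncols + col;
}

/*
 * Hypothetical helper: store the ncrs magnitudes found for one ramp
 * (one integration, row, and column). Slots from ncrs up to max_num_crs
 * are set to 0, matching the padding rule in the description.
 */
static void crmag_store(float *crmag, const float *mags, size_t ncrs,
                        size_t integ, size_t row, size_t col,
                        size_t max_num_crs, size_t nrows, size_t ncols)
{
    for (size_t cr = 0; cr < max_num_crs; ++cr) {
        size_t idx = crmag_index(integ, cr, row, col,
                                 max_num_crs, nrows, ncols);
        crmag[idx] = (cr < ncrs) ? mags[cr] : 0.0f;
    }
}
```

With max_num_crs = 5 and a ramp that has two jumps, the first two slots for that pixel hold the magnitudes and the remaining three hold 0, as in the example in the description.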
Tasks
docs/ page
no-changelog-entry-needed
changes/: echo "changed something" > changes/<PR#>.<changetype>.rst (see below for change types)
"git+https://github.com/<fork>/stcal@<branch>"
jwst regression test
romancal regression test
news fragment change types...
changes/<PR#>.apichange.rst: change to public API
changes/<PR#>.bugfix.rst: fixes an issue
changes/<PR#>.general.rst: infrastructure or miscellaneous change