Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

problems with aligning channles for double ended calibration #59

Open
jmhuss opened this issue Jan 28, 2021 · 2 comments
Open

problems with aligning channles for double ended calibration #59

jmhuss opened this issue Jan 28, 2021 · 2 comments

Comments

@jmhuss
Copy link
Contributor

jmhuss commented Jan 28, 2021

Since the automated shift to align foreward and backward channel suggested largely varying shifts, I used the notebook 'prep_doubleended'.
I ran into some pitfalls executing the script:

It was unclear to me, that I could (and had to) just drop the ref_data keyword in assign_ref_data(), when I used no external temperature data. Might be solved by a short note.

The addition to drop one observation in one channel might come with a check, wether a (and which) channel has more observations than the other, apllying the correction only if needed and hence avoiding a tripping wire when time dimesnions are already of equal length.
ds_ch1 = ds_ch1.isel(time=slice(None, -1))

Highlighting the argument shift_lims in suggest_cable_shift_double_ended() as something that has to be expanded when no good match is achieved might point out, that only a local maximum of correlation was reached with the default limits and hence save some troubleshooting when the lag is large (>60 bins or 15 meters in my case).
Might too narrow shift limits also be the reason, why the automated function when executing PyFOX gave wrong values?

@klapo
Copy link
Owner

klapo commented Feb 1, 2021

Thanks for raising these issues. A best practice is to only raise one issue per, well, issue.

Ok, I am going to parse this into a couple issues and raise those separately, so we can track each individually. Let me know if I dissected this correctly.

Issue 1: The non-existent documentation on the assign_ref_data() led to confusion. At a bare minimum a function description should be provided.

Issue 2: The new prep_doubleended example notebook needs to explain what the assign_ref_data() step is doing, so that new users can adopt the example to their own use cases.

Issue 3: I am not quite sure I follow the discussion regarding the dropping of channels. The pyfocs logic attempts to always pair the "forward" channel with a "backwards" channel. That means that sometimes the pyfocs logic drops an observation in some edge cases. This logic was designed to never double use an observation and to handle occasions when observations may span between archived time intervals when processing long blocks of data. Does this logic not work? If so, I need you to raise a separate issue demonstrating the issue.

Issue 4: The example notebook makes this point, no? The automation script is not meant to run without providing the fixed_shift keyword, anyway. As a result, I think the best course of action is to raise an error if a user tries to do this.

@jmhuss
Copy link
Contributor Author

jmhuss commented Feb 2, 2021

Thanks for the hint, I'll seperate issues next time.
Your first two issues cover what I was intending to communicate :)

Issue 3: Within the section 1.2 Further Data preparation: Prune to similar time period the notebook prep_doubleended has the following code:

# Channel 1 has one more observation, so we drop the last one to be 
# consistent with the decision making in pyfocs
ds_ch1 = ds_ch1.isel(time=slice(None, -1))

As far is I understand, this snippet referrs to a specific data set and breaks the process for data sets where both channels already come with the same amount of observations (or channel 2 is longer).
At a minimum, the description could be 'If a channel has one more observation, you should drop the last one...', so one is aware to check whether this line should be executed or not.
Alternatively I would suggest to add an if-query, checking if a channel (and which) is longer and act accordingly - something like:

if len(ds_ch1.time) - len(ds_ch2.time) == 1:
    ds_ch1 = ds_ch1.isel(time=slice(None, -1))
elif len(ds_ch1.time) - len(ds_ch2.time) == -1:
    ds_ch2 = ds_ch2.isel(time=slice(None, -1))
elif not len(ds_ch1.time) - len(ds_ch2.time) == 0:
    raise NameError('Length of channels differs by more than 1.')

Issue 4: I am referring to the notebook section 2. Align the Forwards and Backwards channels.. Here the variable shift_window is defined, determining, how much of a lag range shall be tested to find the best match.
It was set to shift_window = 50. In my case, the actual shift was 62 bins and hence was only detected when expanding the shift_window parameter to a larger value.
A documentation should be sufficient in my eyes, so that a new user is aware of this constraint.
About the automation script outside the notebook:
Before beeing aware of the notebook I ran the calibration on a subset without fixed_shift to see, if the console output suggests a consistent shift that I could then implement as a fixed_shift for recalibration.
So yes: if that's not the way you want users to go, an error message (maybe pointing to the notebook?) should be printed instead an automatically applied shift.

Hope I could make my points a little more clear.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants