Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of XGM data recording the wrong number of pulses #161

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

JamesWrigley
Copy link
Member

In particular, by:

  • Explicitly checking for a wrong number of pulses by comparing fast/slow data and emitting a warning if they differ.
  • Add more obvious warnings to the docs.

Follow up to #153.

@JamesWrigley JamesWrigley added the bug Something isn't working label Apr 16, 2024
@JamesWrigley JamesWrigley self-assigned this Apr 16, 2024
for information on retrieving the true number of pulses.
Warning:
This can be unreliable, see the docs for
[XGM.npulses()][extra.components.XGM.npulses] for information on
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For someone in a hurry to get something done, and maybe not a native English speaker, I think it would be easy to read this as 'this method is bad, npulses() is better'.

I think it might be worth repeating the pointer to XrayPulses in each of these warnings, even if that's a bit more verbose, rather than pointing people to the docs of another method to find that information.

Comment on lines +431 to +484
if not counts_match:
warn(f"Slow data pulse counts ({key}) don't match the counts from fast data (data.intensityTD), data may be invalid!")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we know the slow data counts aren't always reliable and the difference from the fast data is a useful signal of that, does that imply the fast data are more reliable? Should we be using that instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in my experience the fast data is more reliable and in some ways more useful than what we get from the pulses components because it will show the real number of pulses that are getting delivered. I don't know why but sometimes the number of pulses will change for... reasons... and for whatever reason that doesn't seem to be reflected in the bunch pattern table.

As for whether we should be using that instead of the slow data, I'm not sure. I am a little hesitant to make it do something other than the obvious 'just return whatever was saved', but as you say that's probably not very useful.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will show the real number of pulses that are getting delivered. I don't know why but sometimes the number of pulses will change for... reasons... and for whatever reason that doesn't seem to be reflected in the bunch pattern table.

Hu, do you have a concrete example? That should not be possible, as they both are looking at exactly the same data - the XGM DOOCS server takes the bunch pattern to slice out the correct values from the train signal for data.intensityTD. That's the entire reason I don't trust it, it's an interpretation of the same signal through a 3rd party I don't control.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so weirdly enough the XrayPulses component does report the right number but not PumpProbePulses 🐙 Example from p6156, r185 with the XGM:
image

And the pulses components:
image

My code was using PumpProbePulses so it was giving the wrong number of pulses. Is that a bug in PumpProbePulses?

Copy link
Collaborator

@philsmt philsmt Apr 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's doing exactly what it is supposed to do:

image

PumpProbePulses creates a unified pattern from both, i.e. it looks at every possible pulse the machine could have, and counts it as a filled pulse if either FEL, PPL or both are present.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, makes sense.

Copy link
Collaborator

@philsmt philsmt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

We could have actually offered a PulsePattern object based on XGM data, this would have automatically given all the different methods like constant pattern, pulse counts etc.

if not np.allclose(pulse_counts[0], pulse_counts):
raise ValueError("Number of pulses is changing, there is no nominal number.")

self._npulses[pg] = int(pulse_counts[0])

return self._npulses[pg]

def pulse_counts(self, sase=None):
"""Return a 1D [DataArray][xarray.DataArray] of the number of pulses in each train.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry I didn't notice that earlier, but it is unfortunate we're using xarray.DataArray here and pd.Series for PulsePattern-derived components. We should avoid mixing these types arbitrarily between components in the future.

Out of curiosity, any particular reason or just preference?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason, it's just a preference since I tend to use DataArrays/Datasets a lot.

In particular, by:
- Explicitly checking for a wrong number of pulses by comparing fast/slow data
  and emitting a warning if they differ.
- Add more obvious warnings to the docs.
@JamesWrigley
Copy link
Member Author

I think I'm happy with the state of this now, in 53d15d4 I changed the pulse_counts() to return the fast data counts by default if a mismatch was detected. I'll leave this open for another round of review, but feel free to rebase and merge it if everyone agrees.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants