Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a PPU component #71

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Add a PPU component #71

wants to merge 7 commits into from

Conversation

tmichela
Copy link
Member

@tmichela tmichela commented Oct 9, 2023

I have ~mirrored the device lookup to what is done in the XGM component, but realistically, there should not ever be more than 1 PPU per run.~~
In the extreme case we could hard code device names for this component for HED (and potentially SPB, where a PPU is also installed, but never used).

There also a device with similar interface for the laser trigger only, so either:

  • we keep a single device to handle both, with the drawback of having a slower initialization if we do not give in the full device name
  • we have a separate class for each device (PPU vs DPU?) we can then infer the single static device name (per instrument)

Copy link

@dallanto dallanto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, as this is well thought including the tests. I don't spot anything obvious, my comments are minor, and partly for my interest.

elif device is not None:
raise KeyError(f"ppu must be a SourceData or str, not {type(device).__name__}")

# Then we list all PPU device in the run
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the 'then' is a bit misleading, it sounds like this is logically done after the code above, but this is the default entry point if no device is given by argument (but if it is given, the function will have returned at this point).
It is veeeery minor, but maybe "# Default: we get all PPU devices in the run as checked against hardcoded device classes, respectively substrings"


Args:
offset (int, optional):
offset to add to the selected trains. Defaults to 0.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For interest: why would the user want to add an offset? I'm sure there are use cases, just asking

reduced_run = ppu.trains(split_sequence=True)
assert isinstance(reduced_run, list)
assert len(reduced_run) == 3
assert len(reduced_run[0].train_ids) == 1
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure this is overly pedantic / redundant, but maybe:
assert isinstance(reduced_run[0], DataCollection) in analogy to the split_sequence == False case

block the XFEL beam when triggered. The rotor is contained within a UHV
chamber. In terms of temporal structure, the beam pipe is blocked by an
absorbing rotor for up to 9/10ths of a second or vice versa,
synchronized to the facility clock/trigger.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very useful information for the docs, I only wonder (naive question) how the 9/10 blocking scheme, i.e. one train per second picked, fits to an overall run where 1 / 309 trains are selected - like run 3379, 50 for instance, shouln't it then be 30 trains with non-consecutive IDs?

split_sequence (bool, optional): Split data per PPU trigger sequence. Defaults to False.
offset (int, optional): offset to apply to train IDs to be selected. Defaults to 0.

Returns:
Union[DataCollection, List[DataCollection]]:
DataCollection(s) containing only trains triggered by the PPU
"""
data = data or self.data
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to make this generic. Strictly speaking, the EXtra component is thus not a PPU component any more, but a "trainSelector" component. I wonder if there should be such class per se. I see that this addition to the trains() method makes the PPU more flexible, but it also seems a bit awkward to initialize by PPU device information, in order to then feed it with something else, except for comparison. In other words, I am not sure if a generic class with specific methods would be better than a specific class with a generic method by argument 🙃

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean here. The line you highlighted here is only to make it convenient to use with KeyData or SourceData. Otherwise one would need to first filter the DataCollection and then make source/keydata object. Here you can instantiate all your object at the start of your script and then filter trains for any object as required.

Copy link
Member Author

@tmichela tmichela Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ppu = PPU(run)
filtered_run = ppu.trains()
source = filtered_run['awesome_source']

vs

ppu = PPU(run)
source = run['awesome_source']
filtered_source = ppu.trains(source)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. My comment was not about that particular line in the first place, more about the way this method accepts other source data i.e. allowing to do train picking based on a pre-selected source. This may be more convenient indeed. The default filtering i.e. trains() method without argument is using the PPU device source, if one is actually interested in another source as in your first code block, it is more "natural" to do it the 2nd way.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still, on a much higher level, I thought whether it would be good have a very generic "trainPicker" or "trainSelector" component in EXtra, that is, a class that could take PPU or any other source for filtering, already at instantiation level.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this comment is not affecting the PR, it is more an outlook to the future. (Maybe it does not make sense, because it lacks a real use case?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not, but do we have other filtering sources/methods at this moment?

train_ids = pd.Series(train_ids, index=sequences)
return train_ids

def trains(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you consider a different name for this method? Both in EXtra-data as well as across existing components, we use .trains() to return an iterator over trains. This method however takes something that has train IDs and changes it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Do you have something in mind?

  • filter? filter_trains?
  • train_selection? select? select_trains?
  • pick? pick_trains?
  • take? take_trains?

I'd probably go for pick_trains.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

select and select_trains also have different meanings already (though in the long run, I would love to pass this component to DataCollection.select_trains, i.e. have some kind of TrainSelector protocol)

From your list, I would probably also go for pick_trains, but it's unfortunate one way or the other that it's the "picking-thing" calling on the "thing-to-be-picked" 🤔
Maybe @takluyver can comment on what sounds most natural.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to pass this component to DataCollection.select_trains, i.e. have some kind of TrainSelector protocol

yes, me too!
But do we have an idea of what/how we want to select trains or how this protocol can look like? I can easily think on how to do that specifically for the PPU device, but I can't imagine other cases for now (maybe you have more use cases as part of the data reduction).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we invert the problem: Instead of having a method that selects trains on some object, have a property or method that returns the correspondig extra_data.by_id slice? Then selection becomes: run.select_trains(ppu.train_sel) (name obviously preliminary)

Whenever we decide on a universal TrainSelector protocol, we can simply add it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to something like ppu.train_sel. For the split-sequence option, we could make PPU iterable, giving some sort of TrainSequence or TrainGroup object so you could do [run.select_trains(seq.train_sel) for seq in ppu].

Copy link
Member

@JamesWrigley JamesWrigley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(also needs a changelog entry)

Comment on lines -25 to -26
- components/scans.md
- components/pulse-patterns.md
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if all pages were listed in the index. In general I dislike having disembodied pages because otherwise navigating to them requires that someone knows the right link to click on the right page (e.g. I always get confused when trying to find the xwiz docs).

@JamesWrigley JamesWrigley mentioned this pull request Feb 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants