Core Module
ShowerData - A Python library for shower data storage.
- showerdata.add_target_dataset(path, shape, key='target', exists_ok=False)[source]
Add an empty target dataset to an existing HDF5 file.
- Parameters:
path (str | os.PathLike[str]) – Path to the HDF5 file.
key (str) – Name of the target dataset in the HDF5 file. Defaults to “target”.
exists_ok (bool) – If True, do not raise an error if the target dataset already exists. Defaults to False.
- Return type:
- showerdata.load_target(path, key='target', start=0, stop=None, max_points=-1)[source]
Load latent space target data for a specific generative model from an HDF5 file. The target usually has the same shape as the shower points.
- Parameters:
path (str | os.PathLike[str]) – Path to the HDF5 file.
key (str) – Name of the target dataset in the HDF5 file. Defaults to “target”.
start (int) – Start index for loading target data. Defaults to 0.
stop (Optional[int]) – Stop index for loading target data. If None, load until end of file. Defaults to None.
max_points (Optional[int]) – Maximum number of points to load per shower. If -1, load all points. Defaults to -1.
- Returns:
Loaded target data and corresponding number of points.
- Return type:
tuple[NDArray[np.float32], NDArray[np.int32]]
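The returned pair can be combined to ignore padded entries. The following is a minimal sketch, assuming the target array has the layout (num_showers, max_points, features) and that the first `num_points[i]` entries of shower `i` are the valid ones; this padding layout is an assumption and is not confirmed by the description above. The arrays are synthetic stand-ins for the output of `load_target`.

```python
import numpy as np

# Synthetic stand-ins for the (data, num_points) pair returned by load_target.
# Assumed layout: (num_showers, max_points, features), padded at the end.
target = np.arange(2 * 4 * 3, dtype=np.float32).reshape(2, 4, 3)
num_points = np.array([2, 4], dtype=np.int32)

# Boolean mask that is True for valid points and False for padding.
mask = np.arange(target.shape[1])[None, :] < num_points[:, None]

# Per-shower list of valid target points with the padding stripped.
valid = [target[i, :n] for i, n in enumerate(num_points)]
```

The mask form is convenient for vectorized losses, while the list form is convenient when showers are processed one at a time.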
- showerdata.save_target(data, path, num_points=None, name='target', overwrite=False)[source]
Save latent space target data for a specific generative model to an HDF5 file. The target usually has the same shape as the shower points.
- Parameters:
data (NDArray[np.float32]) – Target data to save.
num_points (NDArray[np.int32] | None) – Number of points for each shower in the target data.
path (str | os.PathLike[str]) – Path to the HDF5 file.
name (str) – Name of the target dataset in the HDF5 file. Defaults to “target”.
overwrite (bool) – If True, overwrite existing dataset in file. Defaults to False.
- Return type:
- showerdata.save_target_batch(data, path, num_points=None, start=0, key='target')[source]
Save a batch of latent space target data for a specific generative model to an HDF5 file. The target usually has the same shape as the shower points. The file must already exist and have the correct shape. Use add_target_dataset to create the target dataset first.
Example
>>> import numpy as np
>>> showerdata.create_empty_file("showers.h5", shape=(1000, 500, 5))
>>> showerdata.add_target_dataset("showers.h5", shape=(1000, 500, 3), key="target")
>>> # Now you can use save_target_batch
>>> target_data = np.random.rand(100, 500, 3).astype(np.float32)  # Example target data
>>> num_points = np.random.randint(1, 501, size=(100,), dtype=np.int32)  # Example num_points
>>> showerdata.save_target_batch(target_data, "showers.h5", num_points=num_points, start=0, key="target")
- Parameters:
data (NDArray[np.float32]) – Target data to save.
path (str | os.PathLike[str]) – Path to the HDF5 file.
num_points (NDArray[np.int32] | None) – Number of points for each shower in the target data.
start (int) – Start index in the file. Defaults to 0.
key (str) – Name of the target dataset in the HDF5 file. Defaults to “target”.
- Return type:
- showerdata.cluster(showers, random_shift=True, detector_config=Geometry(ILD), processes=1)[source]
Cluster hits into readout cells using a regular grid.
- Parameters:
- Return type:
- Returns:
Clustered showers.
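To illustrate the general idea of clustering hits on a regular grid, here is a minimal sketch in NumPy. It is a generic technique and not the implementation used by `showerdata.cluster`: the helper `cluster_to_grid`, its 2D grid, and the `cell_size` argument are hypothetical, and the `random_shift`, `detector_config`, and `processes` options are not modeled.

```python
import numpy as np

def cluster_to_grid(xy, energy, cell_size):
    """Sum hit energies per cell of a regular 2D grid (illustrative only)."""
    # Integer cell index of every hit along x and y.
    cells = np.floor(np.asarray(xy) / cell_size).astype(np.int64)
    # Collapse identical cell indices; `inverse` maps each hit to its cell.
    unique_cells, inverse = np.unique(cells, axis=0, return_inverse=True)
    # Accumulate the energy of all hits that fall into the same cell.
    cell_energy = np.bincount(
        inverse.ravel(), weights=energy, minlength=len(unique_cells)
    )
    # Report each cell by the position of its center.
    centers = (unique_cells + 0.5) * cell_size
    return centers, cell_energy

# Three hits: the first two share a cell, the third sits in the neighboring cell.
xy = np.array([[0.1, 0.2], [0.4, 0.3], [1.2, 0.1]])
energy = np.array([1.0, 2.0, 0.5])
centers, cell_energy = cluster_to_grid(xy, energy, cell_size=1.0)
```

Hits that share a cell are merged into a single entry located at the cell center, and their energies are summed.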
- showerdata.concatenate(data)[source]
Concatenate an iterable of Showers instances into a single instance.
- showerdata.create_empty_file(path, shape, overwrite=True)[source]
Create an empty HDF5 file with the specified dataset shape. To be used before calling save_batch.
- showerdata.filter_showers(shower_data, radius=inf, ecal_threshold=0.0, hcal_threshold=0.0, num_layers_ecal=30)[source]
Filter hits in the shower data based on the specified criteria.
- Parameters:
shower_data (Showers) – The shower data to filter.
radius (float) – Radius (in millimeters) for the cylindrical cut filter.
ecal_threshold (float) – Energy threshold (in GeV) for the ECAL hit filter.
hcal_threshold (float) – Energy threshold (in GeV) for the HCAL hit filter.
num_layers_ecal (int) – Number of layers in the ECAL detector.
- Return type:
- Returns:
The filtered shower data.
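The criteria above can be sketched as a boolean mask over hits. This is an illustration under stated assumptions, not the library's implementation: the helper `filter_hits` is hypothetical, the cylinder is assumed to be centered on the z axis, layers with index below `num_layers_ecal` are assumed to belong to the ECAL, and a hit exactly at the threshold is assumed to pass.

```python
import numpy as np

def filter_hits(x, y, energy, layer, radius=np.inf,
                ecal_threshold=0.0, hcal_threshold=0.0, num_layers_ecal=30):
    """Return a boolean mask of hits that pass all cuts (illustrative only)."""
    # Cylindrical cut: keep hits within `radius` (mm) of the assumed z axis.
    inside = np.hypot(x, y) <= radius
    # Assumption: layers [0, num_layers_ecal) are ECAL, the rest are HCAL.
    is_ecal = layer < num_layers_ecal
    # Apply the threshold (GeV) of the sub-detector each hit belongs to.
    threshold = np.where(is_ecal, ecal_threshold, hcal_threshold)
    return inside & (energy >= threshold)

x = np.array([0.0, 10.0, 0.0, 0.0])
y = np.zeros(4)
energy = np.array([1.0, 1.0, 0.01, 0.01])
layer = np.array([0, 0, 5, 40])
keep = filter_hits(x, y, energy, layer, radius=5.0,
                   ecal_threshold=0.05, hcal_threshold=0.005)
```

In this example the second hit fails the radius cut and the third hit falls below the ECAL threshold, while the fourth hit has the same energy but passes the lower HCAL threshold.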
- showerdata.get_file_length(path)[source]
Get the number of samples in an HDF5 shower data file. Unlike get_file_shape, this function also works for files that contain only incident particle data.
- Parameters:
path (str | os.PathLike[str]) – Path to the HDF5 file.
- Returns:
Number of samples in the file.
- Return type:
- showerdata.get_file_shape(path)[source]
Get the shape of the showers dataset in an HDF5 file. Only works for files containing shower data.
- class showerdata.IncidentParticles(energies, pdg, directions=None)[source]
Bases: object
Data structure for incident particle data.
- Parameters:
energies (ArrayLike) – Energies of the incident particles.
pdg (ArrayLike or int) – Particle Data Group identifier(s).
directions (Optional[ArrayLike]) – Directions of the incident particles as a unit vector. Defaults to (0, 0, 1).
- energies
Energies of the incident particles.
- Type:
NDArray
- directions
Directions of the incident particles given as a unit vector.
- Type:
NDArray
- pdg
Particle Data Group identifiers for the incident particles.
- Type:
NDArray
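Because directions are described as unit vectors, it can help to normalize raw direction vectors before constructing an instance. A minimal sketch follows; the helper `as_unit_vectors` is hypothetical and not part of showerdata, and whether the class normalizes its input itself is not stated above.

```python
import numpy as np

def as_unit_vectors(directions, num_particles):
    """Return an array of shape (num_particles, 3) holding unit vectors."""
    if directions is None:
        # Mirror the documented default direction (0, 0, 1).
        return np.tile(np.array([0.0, 0.0, 1.0]), (num_particles, 1))
    vectors = np.asarray(directions, dtype=np.float64).reshape(-1, 3)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    if np.any(norms == 0):
        raise ValueError("direction vectors must be non-zero")
    # Divide each row by its length so that every vector has norm 1.
    return vectors / norms

default_dirs = as_unit_vectors(None, num_particles=2)
scaled_dirs = as_unit_vectors([[3.0, 0.0, 4.0]], num_particles=1)
```

The resulting array could then be passed as the `directions` argument.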
- showerdata.load(path, start=0, stop=None, max_points=-1)[source]
Load shower data from an HDF5 file.
- Parameters:
path (str | os.PathLike[str]) – Path to the HDF5 file.
start (int) – Start index for loading showers. Defaults to 0.
stop (Optional[int]) – Stop index for loading showers. If None, load until end of file. Defaults to None.
max_points (int) – Maximum number of points to load per shower. If -1, load all points. Defaults to -1.
- Returns:
Loaded shower data.
- Return type:
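The `start` and `stop` parameters make it possible to read a large file in chunks. The sketch below only computes the index pairs; the helper `chunk_ranges` is hypothetical, and it assumes that `stop` is exclusive, following the usual Python slicing convention, which is not stated explicitly above.

```python
def chunk_ranges(total, chunk_size):
    """Yield (start, stop) index pairs covering range(total) in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total, chunk_size):
        # The last chunk is shortened so that it never runs past `total`.
        yield start, min(start + chunk_size, total)

ranges = list(chunk_ranges(total=250, chunk_size=100))
# ranges == [(0, 100), (100, 200), (200, 250)]
```

Under that assumption, `total` could be obtained from `showerdata.get_file_length(path)` and each pair passed on as `showerdata.load(path, start=start, stop=stop)`.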
- showerdata.load_inc_particles(path, start=0, stop=None)[source]
Load incident particle data from an HDF5 file.
- Parameters:
path (str | os.PathLike[str]) – Path to the HDF5 file.
start (int) – Start index for loading incident particles. Defaults to 0.
stop (Optional[int]) – Stop index for loading incident particles. If None, load until end of file. Defaults to None.
- Returns:
Loaded incident particle data.
- Return type:
- showerdata.save(data, path, overwrite=False)[source]
Save data to an HDF5 file.
- Parameters:
data (Showers | IncidentParticles) – Data to save.
path (str | os.PathLike[str]) – Path to the HDF5 file.
overwrite (bool) – If True, overwrite existing file. Defaults to False.
- Return type:
- showerdata.save_batch(data, path, start=0)[source]
Save a batch of shower data to an HDF5 file. The file must already exist and have the correct shape. Use create_empty_file to create the file first.
Example
>>> showerdata.create_empty_file("showers.h5", shape=(1000, 500, 5))
>>> # Now you can use save_batch to fill the file with data.
>>> showers = showerdata.Showers(...)  # Create or load some showers
>>> showerdata.save_batch(showers, "showers.h5", start=0)
- class showerdata.ShowerDataFile(path, mode='r', shape=None)[source]
Bases: object
Context manager for handling shower data in HDF5 files.
Example
>>> # read showers from a file
>>> with showerdata.ShowerDataFile("showers.h5") as file:
...     print(file.shape)
...     print(file.num_showers)
...     shower = file[0]  # Get the first shower
>>> # create a new file and write showers
>>> with showerdata.ShowerDataFile(
...     path="new_showers.h5",
...     mode="w",
...     shape=(1000, 500, 5),
... ) as file:
...     new_showers = showerdata.Showers(...)  # Create or load some showers
...     file[0:100] = new_showers  # Write first 100 showers
...     file[100:200] = new_showers  # Write next 100 showers
- Parameters:
- class showerdata.Showers(points=(), energies=(), pdg=(), directions=None, shower_ids=None, num_points=None, copy=None)[source]
Bases: object
Data structure for shower data.
- Parameters:
points (ArrayLike) – Shower point cloud.
energies (ArrayLike) – Energies of the incident particles.
pdg (ArrayLike or int) – Particle Data Group identifier(s).
directions (Optional[ArrayLike]) – Directions of the incident particles as a unit vector. Defaults to (0, 0, 1).
shower_ids (Optional[ArrayLike]) – Unique identifiers for each shower. Defaults to sequential IDs.
copy (bool | None) – If True, data will be copied to ensure immutability. Defaults to None.
- points
Array of shower points. Format: (num_showers, max_points, 4 or 5).
- Type:
NDArray
- energies
Energies of the incident particles.
- Type:
NDArray
- directions
Directions of the incident particles given as a unit vector.
- Type:
NDArray
- pdg
Particle Data Group identifiers for the incident particles.
- Type:
NDArray
- shower_ids
Unique identifiers for each shower.
- Type:
NDArray
- copy()[source]
Create a copy of the Showers instance.
- Returns:
A new Showers instance with copied data.
- Return type:
Showers
- save(path, overwrite=False)[source]
Save data to an HDF5 file.
- Parameters:
path (str | os.PathLike[str]) – Path to the HDF5 file.
overwrite (bool) – If True, overwrite existing file. Defaults to False.
- Return type:
- property inc_particles: IncidentParticles
Get the incident particles associated with the showers.