SimulateMixin
class cait.mixins.SimulateMixin
Bases: object
A mixin class for the DataHandler class with methods to simulate data sets.
simulate_pulses(path_sim, size_events=0, size_tp=0, size_noise=0, ev_ph_intervals=[[0, 1], [0, 1]], ev_discrete_phs=None, name_appendix='', exceptional_sev_naming=None, channels_exceptional_sev=[0], tp_ph_intervals=[[0, 1], [0, 1]], tp_discrete_phs=None, t0_interval=[-20, 20], fake_noise=False, store_of=True, rms_thresholds=[1, 1], lamb=0.01, sample_length=None, assign_labels=[1], start_from_bl_idx=0, saturation=False, reuse_bl=False, pulses_per_bl=1, ps_dev=False, dtype='float32')
Simulates a data set of pulses by superposing the fitted standard event (SEV) with fake or real noise.
This method was used to simulate events in “F. Wagner, Machine Learning Methods for the Raw Data Analysis of cryogenic Dark Matter Experiments”, available via https://doi.org/10.34726/hss.2020.77322 (accessed on 9.7.2021).
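The superposition described above can be illustrated with a minimal, standard-library sketch. It is not cait's implementation: the pulse shape (a difference of exponentials), the time constants, the Gaussian white noise, the nominal onset at one quarter of the record window, and the millisecond unit for the onset shift are all assumptions made for illustration. Only the idea of scaling a template by a sampled pulse height, shifting it by a sampled onset, and adding noise comes from the description.

```python
import math
import random


def template(t_ms, onset_ms, tau_rise_ms=1.0, tau_decay_ms=10.0):
    """Hypothetical pulse template with its peak normalised to 1.

    Zero before the onset, a difference of two exponentials afterwards.
    This stands in for the fitted SEV; the real SEV model may differ.
    """
    dt = t_ms - onset_ms
    if dt <= 0:
        return 0.0
    # Time and value of the maximum of exp(-dt/tau_d) - exp(-dt/tau_r).
    t_peak = (tau_rise_ms * tau_decay_ms / (tau_decay_ms - tau_rise_ms)
              * math.log(tau_decay_ms / tau_rise_ms))
    peak = math.exp(-t_peak / tau_decay_ms) - math.exp(-t_peak / tau_rise_ms)
    return (math.exp(-dt / tau_decay_ms) - math.exp(-dt / tau_rise_ms)) / peak


def simulate_event(rng, n_samples=4096, sample_length_ms=0.04,
                   ph_interval=(0.0, 1.0), t0_interval=(-20.0, 20.0),
                   noise_sigma=0.01):
    """Return (pulse_height, t0, samples) for one simulated event."""
    # Pulse height and onset shift are drawn uniformly from their intervals.
    ph = rng.uniform(*ph_interval)
    t0 = rng.uniform(*t0_interval)
    # Assumption: the unshifted onset sits at 1/4 of the record window.
    onset_ms = 0.25 * n_samples * sample_length_ms + t0
    samples = []
    for i in range(n_samples):
        t_ms = i * sample_length_ms
        noise = rng.gauss(0.0, noise_sigma)  # placeholder for a real baseline
        samples.append(ph * template(t_ms, onset_ms) + noise)
    return ph, t0, samples


rng = random.Random(0)
ph, t0, event = simulate_event(rng)
print(f"pulse height {ph:.3f}, onset shift {t0:.2f} ms, {len(event)} samples")
```

Replacing the Gaussian noise with a measured baseline would correspond to the real-noise case; the template and sampling steps stay the same.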
Parameters
path_sim (string) – The full path where to store the simulated data set.
size_events (int) – The number of events to simulate; if > 0, a standard event (sev) must be present in the HDF5 file.
size_tp (int) – The number of testpulses to simulate; if > 0, a testpulse standard event (tp-sev) must be present in the HDF5 file.
size_noise (int) – The number of noise baselines to simulate.
ev_ph_intervals (list of NMBR_CHANNELS 2-tuples or lists) – The interval in which the pulse heights are continuously distributed.
ev_discrete_phs (list of NMBR_CHANNELS lists) – The discrete values from which the pulse heights are uniformly sampled. If the ev_ph_intervals argument is set, this option is ignored.
name_appendix (string) – A string that is appended to the group name stdevent, which contains the standard event that is used for simulation. This concerns only the simulation of event pulses and has no effect on the test pulses.
exceptional_sev_naming (string or None) – If set, this is the full group name in the HDF5 set of the sev used for the simulation of events. By setting this, e.g. carrier events can be simulated. Attention! As of version 1.0, the exceptional standard events are no longer maintained. Please use the name_appendix argument instead!
channels_exceptional_sev (list of ints) – The channels for which the exceptional sev is used, e.g. choose [0] for the phonon channel only, or [0,1] for both phonon and light.
tp_ph_intervals (list of NMBR_CHANNELS 2-tuples or lists) – Analogous to ev_ph_intervals, but for the testpulses.
tp_discrete_phs (list of NMBR_CHANNELS lists) – Analogous to ev_discrete_phs, but for the testpulses.
t0_interval (2-tuple or list) – The interval from which the pulse onsets are continuously sampled.
fake_noise (bool) – If True, the noise is simulated instead of being taken from the measured baselines in the HDF5 set.
store_of (bool) – If True the optimum filter will be saved to the simulated datasets.
rms_thresholds (list of two floats) – The values above which noise baselines are excluded from the distribution of polynomial coefficients (i.e. a parameter for the fake noise simulation). They also serve as a cut on the noise baselines from the HDF5 set if no fake ones are used.
lamb (float) – A parameter for the fake baseline simulation, decrease if calculation time is too long.
sample_length (float) – The length of one sample in milliseconds (if None, it is calculated from the sample frequency).
assign_labels (list of ints) – Pre-assign a label to all the simulated events; tp and noise are labeled automatically. The length of the list must match that of channels_exceptional_sev.
start_from_bl_idx (int) – The index of the first baseline that is taken for the simulation.
saturation (bool) – If True, apply the logistic curve to the simulated pulses.
reuse_bl (bool) – If True the same baselines are used multiple times to have enough of them (use this with care to not have identical copies of events).
pulses_per_bl (int) – The number of pulses to simulate per baseline. Note that the size arguments are multiplied by this number.
ps_dev (bool) – If True, the pulse shape parameters are modelled with deviations. Attention! This will always model TUM40-like phonon pulse shapes! The light channel is not affected by this feature. Generally, it is not clear how well the deviations model the actual deviations in measured data, so please handle this feature with care.
dtype (string) – The data format of the simulated raw data events array.
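As a usage illustration, the sketch below passes a subset of the parameters listed above as keyword arguments. The keyword names and default values are taken from the signature; everything else is an assumption: `run_simulation` is a hypothetical helper, `RecordingHandler` is a stand-in that only records the call so the example runs without cait or any data files, and `sim_data.h5` is a made-up output path. With a real DataHandler, the object would first have to point at an HDF5 file that contains a fitted sev.

```python
def run_simulation(dh, path_sim, n_events=1000, n_noise=1000):
    """Call simulate_pulses on an already configured DataHandler-like object.

    Hypothetical helper: it only bundles keyword arguments that appear in
    the documented signature and forwards them.
    """
    kwargs = dict(
        path_sim=path_sim,                  # where the simulated set is stored
        size_events=n_events,               # needs a sev in the source HDF5
        size_tp=0,                          # no testpulses in this sketch
        size_noise=n_noise,
        ev_ph_intervals=[[0, 1], [0, 1]],   # one interval per channel
        t0_interval=[-20, 20],
        fake_noise=False,                   # use measured baselines
        rms_thresholds=[1, 1],
        assign_labels=[1],                  # same length as channels_exceptional_sev
        reuse_bl=False,
        pulses_per_bl=1,
        dtype='float32',
    )
    dh.simulate_pulses(**kwargs)
    return kwargs


class RecordingHandler:
    """Stand-in for a DataHandler; it records calls instead of simulating."""

    def __init__(self):
        self.calls = []

    def simulate_pulses(self, **kwargs):
        self.calls.append(kwargs)


stub = RecordingHandler()
used = run_simulation(stub, "sim_data.h5")  # hypothetical output path
print(sorted(used))
```

Passing everything by keyword keeps the call readable given the long signature and avoids depending on the positional order of the arguments.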