Data Sources
Data sources are objects that make it as easy as possible to interface with the various sources of data. Examples are RDTFile for hardware-triggered data files and Stream for continuous stream files (where it is irrelevant which hardware was used to record the data). The object that should be most familiar to cait users, the DataHandler, is a data source as well. Data sources do not have to be files; they can also provide simulated data on the fly, like MockData, which returns mock voltage traces for quickly testing functions.
The crucial thing is that the technical details on how to access the data behind a data source are hidden from the user. The objects of most interest are voltage traces (“events”) and data sources make it easy to access those voltage traces, no matter how they are stored. A data source provides event iterators.
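As an illustration of this idea, the following minimal sketch shows how storage details can be hidden behind an event iterator. The class ToyDataSource and the helper mean_per_event are hypothetical and not part of cait; only the method name get_event_iterator mirrors the API documented on this page.

```python
from typing import Iterator, List


class ToyDataSource:
    """Hypothetical stand-in for a data source (not a cait class)."""

    def __init__(self, n_events: int = 3, record_length: int = 8):
        # The storage detail (here: an in-memory list) stays private.
        self._events = [[float(i)] * record_length for i in range(n_events)]

    def get_event_iterator(self) -> Iterator[List[float]]:
        # Consumers only ever see an iterator over voltage traces.
        return iter(self._events)


def mean_per_event(source) -> List[float]:
    # Works with any object that exposes get_event_iterator().
    return [sum(ev) / len(ev) for ev in source.get_event_iterator()]


means = mean_per_event(ToyDataSource())
print(means)
```

Because mean_per_event only relies on the iterator, the same function would work unchanged if the traces came from a file or a simulation instead.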
Top level classes
- class cait.versatile.RDTFile(path: str, path_par: Optional[str] = None)[source]
Class for interfacing hardware-triggered files (file extension .rdt). This class automatically infers the available channels and the available correlated channels. Those can be retrieved by indexing the RDTFile object with channel indices/names or tuples thereof. The result of the indexing is an RDTChannel object, which provides testpulse amplitudes, timestamps, and event iterators for the selected channel(s) (see the documentation for RDTChannel).
- Parameters
path (str) – The full path (including the file extension .rdt) to the file of interest.
path_par (str, optional) – The full path (including the file extension .par) to the file which contains the necessary parameters to read the .rdt file. If None is given, it is assumed that a .par file with identical name/path as path is available. Defaults to None.
- Returns
Object interfacing an .rdt file.
- Return type
RDTFile
Example:
    import cait.versatile as vai

    f = vai.RDTFile('path/to/file.rdt')

    # Check available channels
    print(f.keys)

    # Choose channel(s) to iterate over, get testpulse amplitudes, ..., by slicing RDTFile
    channels = f[(0,1)]
    # if interested in only one channel: channel0 = f[0]
    it = channels.get_event_iterator()

    # You can now further slice this iterator (like any other iterator in cait.versatile):
    it_testpulses = it[:, channels.tpas > 0]
    it_events = it[:, channels.tpas == 0]
    it_noise = it[:, channels.tpas == -1]

    # Have a look (after removing the baseline):
    vai.Preview(it_testpulses.with_processing(vai.RemoveBaseline()))
- property record_length
The record length (number of samples per event) of the events in the corresponding *.rdt file.
- property dt_us
The time base in microseconds (time between two samples) of the events in the corresponding *.rdt file.
- property sample_frequency
The sample frequency in Hz of the events in the corresponding *.rdt file.
- property measuring_time_h
The total measuring time in hours of the corresponding *.rdt file.
- property keys
The channel keys that can be used to index this RDTFile instance. If available, the channel names (corresponding to the indices) are shown as well.
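The timing properties above are related by simple arithmetic. The sketch below spells this out with two hypothetical helper functions (they are not part of cait); the numbers used are the MockData defaults documented on this page.

```python
def sample_frequency_hz(dt_us: float) -> float:
    # One sample every dt_us microseconds -> 1e6 / dt_us samples per second.
    return 1e6 / dt_us


def record_window_ms(record_length: int, dt_us: float) -> float:
    # Duration of one event (record_length samples) in milliseconds.
    return record_length * dt_us / 1e3


freq = sample_frequency_hz(10)        # time base of 10 microseconds
window = record_window_ms(16384, 10)  # 16384 samples per event
print(freq, window)
```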
- class cait.versatile.Stream(hardware: str, src: Union[str, List[str]])[source]
Factory class for providing a common access point to stream data. Currently, only vdaq2 and cresst stream files are supported, but an extension can be implemented straightforwardly by subclassing StreamBaseClass and adding it for selection in the constructor of Stream.
The data is accessed by means of slicing (see below). The time property is an object of StreamTime and offers a convenient time interface as well (see below).
- Parameters
hardware (str) – The hardware which was used to record the stream file. Valid options are [‘cresst’, ‘vdaq2’]
src (Union[str, List[str]]) – The source for the stream. Depending on how the data is taken, this can either be the path to one file or a list of paths to multiple files. This input is handled by the specific implementation of the Stream Object. See below for examples.
Usage for different hardware:
CRESST: Files are .csmpl files which contain one channel each. Additionally, we need a .par file to read the start timestamp of the stream data from.
s = Stream(hardware='cresst', src=['par_file.par', 'stream_Ch0.csmpl', 'stream_Ch1.csmpl'])
VDAQ2: Files are .bin files which contain all information necessary to construct the Stream object. It can be input as a single argument.
s = Stream(hardware='vdaq2', src='file.bin')
Usage slicing:
Valid options for slicing streams are the following:
    # Get data for one channel
    s['ADC1']

    # Get data for one channel and slice it (two equivalent ways)
    s['ADC1', 10:20]
    s['ADC1'][10:20]

    # Get data for one channel, slice it, and return the voltage
    # values instead of the ADC values
    s['ADC1', 10:20, 'as_voltage']
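To make the indexing forms listed above more concrete, here is a rough sketch of how such an interface could be built on top of `__getitem__`. ToyStream is a hypothetical class, not the cait implementation, and the linear ADC-to-voltage factor volts_per_adc is an assumption made purely for illustration.

```python
class ToyStream:
    """Hypothetical stream (not a cait class) mimicking the indexing forms."""

    def __init__(self, data: dict, volts_per_adc: float):
        self._data = data                    # channel name -> list of ADC values
        self._volts_per_adc = volts_per_adc  # assumed linear conversion factor

    def __getitem__(self, val):
        # Normalize s['ADC1'] and s['ADC1', 10:20, 'as_voltage'] to a tuple.
        if not isinstance(val, tuple):
            val = (val,)
        key = val[0]
        where = val[1] if len(val) > 1 else slice(None)
        as_voltage = len(val) > 2 and val[2] == 'as_voltage'

        samples = self._data[key][where]
        if as_voltage:
            return [x * self._volts_per_adc for x in samples]
        return samples


s = ToyStream({'ADC1': list(range(100))}, volts_per_adc=0.5)
adc = s['ADC1', 10:13]
same = s['ADC1'][10:13]
volts = s['ADC1', 10:13, 'as_voltage']
print(adc, same, volts)
```

Handling the key, the slice, and the optional flag in a single `__getitem__` keeps all three access patterns on one code path.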
- get_voltage_trace(key: str, where: slice)[source]
Get the voltage trace for a given channel ‘key’ and slice ‘where’.
- Returns
Voltage trace.
- Return type
np.ndarray
- property keys
Available keys (channel names) in the stream.
- Returns
List of keys.
- Return type
list
- property start_us
The microsecond timestamp at which the stream starts.
- Returns
Microsecond timestamp
- Return type
int
- property dt_us
The length of a sample in the stream in microseconds.
- Returns
Microsecond time-delta
- Return type
int
- property tpas
Dictionary of testpulse amplitudes in the stream. For hardware ‘cresst’ this is read from a ‘.test_stamps’ file. For hardware ‘vdaq2’ this is obtained from triggering the DAC channels first.
- Returns
Testpulse amplitudes
- Return type
dict of np.ndarray
- property tp_timestamps
Dictionary of testpulse timestamps (microseconds) in the stream. For hardware ‘cresst’ this is read from a ‘.test_stamps’ file. For hardware ‘vdaq2’ this is obtained from triggering the DAC channels first.
- Returns
Testpulse microsecond timestamps.
- Return type
dict of np.ndarray
- class cait.versatile.MockData(n_events: int = 100, record_length: int = 16384, dt_us: int = 10)[source]
Class to generate quick mock pulse traces (2 channels).
- Parameters
n_events (int, optional) – Number of events to simulate. Defaults to 100.
record_length (int, optional) – Record length of the pulse traces to simulate. Defaults to 16384.
dt_us (int, optional) – Microsecond time base of the pulse traces to simulate. Defaults to 10.
- Returns
Object providing mock data.
- Return type
MockData
- get_event_iterator(batch_size: Optional[int] = None)[source]
Return an event iterator over the events in this mock data instance.
- Parameters
batch_size (int, optional) – The number of events to be returned at once (these are all read together). There is a trade-off: a larger batch_size gives faster reads but uses more memory.
- Returns
Event iterator
- Return type
MockIterator
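The batch_size trade-off can be pictured with a small generator. This is only a sketch under the assumption that batches are delivered as lists of events; the function iterate_in_batches is hypothetical and says nothing about how cait's iterators are implemented internally.

```python
from typing import Iterator, List, Optional


def iterate_in_batches(events: List[list], batch_size: Optional[int] = None) -> Iterator:
    # Without a batch size, yield one event at a time.
    if batch_size is None:
        yield from events
        return
    # Otherwise yield lists of up to batch_size events that are read together.
    for start in range(0, len(events), batch_size):
        yield events[start:start + batch_size]


events = [[i, i] for i in range(5)]
single = list(iterate_in_batches(events))
batched = list(iterate_in_batches(events, batch_size=2))
print(len(single), [len(b) for b in batched])
```

Fewer, larger batches mean fewer read operations, but each batch holds more events in memory at once.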
- get_event(inds: int, channel: Optional[slice] = None)[source]
Return a single event for a given index.
- Parameters
inds (int) – The index of the event that we want to read from the mock data.
channel (int) – The channel of the event that we want to read from the mock data. If None, then all channels are returned.
- Returns
Event
- Return type
np.ndarray
- property start_us
The microsecond timestamp of the start of the recording for this datasource object.
Related classes and base classes
- class cait.versatile.datasources.hardwaretriggered.rdt_file.RDTChannel(rdt_file: cait.versatile.datasources.hardwaretriggered.rdt_file.RDTFile, key: Union[int, tuple])[source]
Object representing a coherent part of an RDTFile (i.e. either a single channel or correlated channels). Usually this is not created on its own but obtained by slicing an RDTFile.
- Parameters
rdt_file (RDTFile) – An RDTFile instance.
key (Union[int, tuple]) – The key which selects the single channel or correlated channels. Must be one of rdt_file.keys.
- Returns
Specified channels of an RDTFile
- Return type
RDTChannel
- get_event_iterator(batch_size: Optional[int] = None)[source]
Get an iterator over the events present in this RDTChannel instance.
- Parameters
batch_size (int, optional) – The number of events to be returned at once (these are all read together). There is a trade-off: a larger batch_size gives faster reads but uses more memory.
- Returns
Iterable object
- Return type
RDTIterator
Example:
    import cait.versatile as vai

    f = vai.RDTFile('path/to/file.rdt')

    # Choose channel(s) to iterate over by slicing RDTFile
    channels = f[(0,1)]
    it = channels.get_event_iterator()

    # You can now further slice this iterator (like any other iterator in cait.versatile):
    it_testpulses = it[:, channels.tpas > 0]
    it_events = it[:, channels.tpas == 0]
    it_noise = it[:, channels.tpas == -1]

    # Remove baselines:
    it_testpulses.add_processing(vai.RemoveBaseline())

    # Have a look:
    vai.Preview(it_testpulses)
- property key
The RDTFile key that this RDTChannel corresponds to.
- property n_channels
The number of channels this RDTChannel corresponds to.
- property start_us
The microsecond timestamp of the start of the recording for this datasource object.
- property timestamps
The microsecond timestamps of the events in this RDTChannel.
- property tpas
The testpulse amplitudes of the events in this RDTChannel.
- property unique_tpas
The unique testpulse amplitudes of the events in this RDTChannel.
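The examples on this page select testpulses with tpas > 0, events with tpas == 0, and noise traces with tpas == -1. The sketch below applies the same convention to plain Python lists with made-up amplitude values. It does not use cait, and the sorted ordering of the unique values is an assumption rather than documented behaviour of unique_tpas.

```python
# Made-up testpulse amplitudes, one entry per recorded trace.
tpas = [0.0, 1.0, -1.0, 0.0, 3.0, 1.0]

# Boolean flags following the convention used in the examples on this page.
testpulse_flags = [t > 0 for t in tpas]
event_flags = [t == 0 for t in tpas]
noise_flags = [t == -1 for t in tpas]

# Indices of the traces that fall into each category.
testpulse_inds = [i for i, flag in enumerate(testpulse_flags) if flag]
event_inds = [i for i, flag in enumerate(event_flags) if flag]
noise_inds = [i for i, flag in enumerate(noise_flags) if flag]

# Distinct amplitudes (sorted here only for readability).
unique = sorted(set(tpas))
print(testpulse_inds, event_inds, noise_inds, unique)
```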
- class cait.versatile.datasources.stream.streambase.StreamBaseClass[source]
- abstract property dt_us
The length of a sample in the stream in microseconds.
- Returns
Microsecond time-delta
- Return type
int
- get_event_iterator(keys: Union[str, List[str]], record_length: int, inds: Optional[Union[int, List[int]]] = None, timestamps: Optional[Union[int, List[int]]] = None, alignment: float = 0.25, batch_size: Optional[int] = None)[source]
Returns an iterator object over voltage traces for given trigger indices or timestamps of a stream file.
- Parameters
keys (Union[str, List[str]]) – The keys (channel names) of the stream object to be iterated over.
record_length (int) – The number of samples to be returned for each index. Usually, this is a power of 2, e.g. 16384.
inds (Union[int, List[int]], optional) – The stream indices for which we want to read the voltage traces. The position of each index within the record window is determined by alignment (1/4 of the window by default). Either inds or timestamps has to be set.
timestamps (Union[int, List[int]], optional) – The stream timestamps for which we want to read the voltage traces. The position of each timestamp within the record window is determined by alignment (1/4 of the window by default). Either inds or timestamps has to be set.
alignment (float) – A number in the interval [0,1] which determines the alignment of the record window (of length record_length) relative to the specified index. E.g. if alignment=1/2, the record window is centered around the index. Defaults to 1/4.
batch_size (int, optional) – The number of events to be returned at once (these are all read together). There is a trade-off: a larger batch_size gives faster reads but uses more memory.
- Returns
Iterable object
- Return type
StreamIterator
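The effect of the alignment parameter can be sketched as follows. The helper record_window is hypothetical, and truncating the pre-trigger length with int() is an assumption; the actual rounding used by cait may differ.

```python
from typing import Tuple


def record_window(ind: int, record_length: int, alignment: float = 0.25) -> Tuple[int, int]:
    # Number of samples placed before the given stream index.
    pre = int(alignment * record_length)
    start = ind - pre
    # Half-open range [start, stop) containing record_length samples.
    return start, start + record_length


w_default = record_window(10000, 16384)
w_centered = record_window(10000, 16384, alignment=0.5)
print(w_default, w_centered)
```

With the default alignment of 1/4, a quarter of the window lies before the index; with alignment=1/2 the window is centered on it.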
- abstract get_voltage_trace(key: str, where: slice)[source]
Get the voltage trace for a given channel ‘key’ and slice ‘where’.
- Returns
Voltage trace.
- Return type
np.ndarray
- abstract property keys
Available keys (channel names) in the stream.
- Returns
List of keys.
- Return type
list
- abstract property start_us
The microsecond timestamp at which the stream starts.
- Returns
Microsecond timestamp
- Return type
int
- property time
Instance of StreamTime, which can be sliced to convert stream indices into microsecond timestamps and implements utility functions for the conversion to datetime for example.
- Returns
StreamTime instance
- Return type
StreamTime
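Conceptually, converting a stream index into a microsecond timestamp only needs start_us and dt_us. The sketch below shows this arithmetic with hypothetical helper functions that are not the StreamTime API; it additionally assumes that timestamps count microseconds since the Unix epoch, which is not stated on this page.

```python
from datetime import datetime, timedelta, timezone


def index_to_timestamp_us(ind: int, start_us: int, dt_us: int) -> int:
    # Each sample is dt_us microseconds after the previous one.
    return start_us + ind * dt_us


def timestamp_us_to_datetime(ts_us: int) -> datetime:
    # Assumption: ts_us counts microseconds since 1970-01-01 UTC.
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(microseconds=ts_us)


start_us = 1_600_000_000_000_000  # made-up stream start
ts = index_to_timestamp_us(250, start_us, dt_us=10)
when = timestamp_us_to_datetime(ts)
print(ts, when.isoformat())
```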
- abstract property tp_timestamps
Dictionary of testpulse timestamps (microseconds) in the stream. For hardware ‘cresst’ this is read from a ‘.test_stamps’ file. For hardware ‘vdaq2’ this is obtained from triggering the DAC channels first.
- Returns
Testpulse microsecond timestamps.
- Return type
dict of np.ndarray
- abstract property tpas
Dictionary of testpulse amplitudes in the stream. For hardware ‘cresst’ this is read from a ‘.test_stamps’ file. For hardware ‘vdaq2’ this is obtained from triggering the DAC channels first.
- Returns
Testpulse amplitudes
- Return type
dict of np.ndarray