Data Sources
Data sources are objects that make it as easy as possible to interface with the various sources of data. Examples are RDTFile for hardware-triggered data files and Stream for data from a continuous stream file (where it is irrelevant which hardware was used to record the data). The object most familiar to cait users, the DataHandler, is also a data source. Data sources do not have to be files: they can also provide simulated data on the fly, like MockData, which returns mock voltage traces for quickly testing functions.
The crucial point is that the technical details of how to access the data behind a data source are hidden from the user. The objects of most interest are voltage traces (“events”), and data sources make it easy to access those traces, no matter how they are stored. A data source provides event iterators.
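The idea of hiding storage details behind event iterators can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins and not part of cait; only the method name get_event_iterator is borrowed from the documentation on this page. The point is that the consuming function works for stored and generated events alike.

```python
from typing import Iterator, List, Protocol


class EventSource(Protocol):
    """Hypothetical interface: anything that can hand out events."""

    def get_event_iterator(self) -> Iterator[List[float]]: ...


class InMemorySource:
    """Events that already exist, e.g. read from a file beforehand."""

    def __init__(self, events: List[List[float]]):
        self._events = events

    def get_event_iterator(self) -> Iterator[List[float]]:
        return iter(self._events)


class GeneratedSource:
    """Events that are computed on the fly (here: flat traces)."""

    def __init__(self, n_events: int, record_length: int):
        self._n_events = n_events
        self._record_length = record_length

    def get_event_iterator(self) -> Iterator[List[float]]:
        return ([float(i)] * self._record_length for i in range(self._n_events))


def max_per_event(source: EventSource) -> List[float]:
    """Works for any source because it only relies on the iterator."""
    return [max(event) for event in source.get_event_iterator()]


stored = InMemorySource([[0.0, 1.0, 0.5], [0.2, 0.1, 0.3]])
generated = GeneratedSource(n_events=3, record_length=4)

stored_maxima = max_per_event(stored)
generated_maxima = max_per_event(generated)
```

Neither call site needs to know where the traces come from, which is the property the data sources described here are meant to provide.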
Top level classes
- class cait.versatile.Stream(hardware: str, src: str | List[str], *args, **kwargs)[source]
Factory class for providing a common access point to stream data. Currently, only vdaq3, vdaq2 and csmpl stream files are supported, but an extension can be straightforwardly implemented by sub-classing cait.versatile.datasources.stream.streambase.StreamBaseClass and adding it for selection in the constructor of Stream.
The data is accessed by means of slicing (see below). The time property is an object of StreamTime and offers a convenient time interface as well (see below).
Important Note: If you plan to access the stream data repeatedly, you can ensure that the stream file stays open (increases speed) by using it as a context manager:
    import cait.versatile as vai
    stream = vai.Stream(hardware='vdaq2', src='file.bin')
    with stream:
        trigger_inds, amplitudes = vai.trigger_zscore(stream["ADC1"], 2**14)
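Why a context manager can speed up repeated access may be easier to see in a small sketch. The reader class below is hypothetical and only mimics the described behaviour (it is not how Stream is implemented): outside a with block every read opens and closes the file, inside it a single handle is reused.

```python
import os
import tempfile


class ReopeningReader:
    """Hypothetical reader: opens the file per read unless used in a `with` block."""

    def __init__(self, path: str):
        self._path = path
        self._handle = None   # set only while inside a `with` block
        self.n_opens = 0      # counts how often the file was opened

    def __enter__(self):
        self._handle = open(self._path, "rb")
        self.n_opens += 1
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._handle.close()
        self._handle = None
        return False  # do not swallow exceptions

    def read(self, start: int, stop: int) -> bytes:
        if self._handle is not None:
            # Reuse the handle that the `with` block keeps open.
            self._handle.seek(start)
            return self._handle.read(stop - start)
        # Otherwise open and close the file for this single read.
        with open(self._path, "rb") as handle:
            self.n_opens += 1
            handle.seek(start)
            return handle.read(stop - start)


# Write a small temporary file to read from.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(100)))
    path = tmp.name

without_context = ReopeningReader(path)
chunks_a = [without_context.read(i, i + 10) for i in range(0, 100, 10)]

with_context = ReopeningReader(path)
with with_context:
    chunks_b = [with_context.read(i, i + 10) for i in range(0, 100, 10)]

os.remove(path)
```

Both variants return the same data, but the first opens the file ten times and the second only once.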
- Parameters:
hardware (str) – The hardware which was used to record the stream file. Valid options are [‘csmpl’, ‘vdaq2’, ‘vdaq3’]
src (Union[str, List[str]]) – The source for the stream. Depending on how the data is taken, this can either be the path to one file or a list of paths to multiple files. This input is handled by the specific implementation of the Stream Object. See below for examples.
args, kwargs – Additional arguments for the chosen hardware (see respective documentation).
Usage for different hardware:
CSMPL: Files are .csmpl files which contain one channel each. Additionally, we need a .par file to read the start timestamp of the stream data from.
    s = Stream(hardware='csmpl', src=['par_file.par', 'stream_Ch0.csmpl', 'stream_Ch1.csmpl'])
See also: cait.versatile.datasources.stream.impl_csmpl.Stream_CSMPL
VDAQ2: Files are .bin files which contain all information necessary to construct the Stream object. It can be input as a single argument. Testpulse channels in this file format need to be (automatically) triggered to obtain testpulse amplitudes and timestamps.
    s = Stream(hardware='vdaq2', src='file.bin')
See also: cait.versatile.datasources.stream.impl_vdaq2.Stream_VDAQ2
VDAQ3: Files are .bin files which contain one channel each. There are two versions of the file format: one for which the testpulse timestamps are already saved inside the .bin file (preferred format), and one for which you have to load the testpulse channel as an additional stream channel and (automatically) trigger it to get the timestamps/tpas (like for the VDAQ2 format).
    s = Stream(hardware='vdaq3', src=['file_ch0.bin', 'file_ch1.bin'])
See also: cait.versatile.datasources.stream.impl_vdaq3.Stream_VDAQ3
Usage slicing:
Valid options for slicing streams are the following:
    # Get voltage data for one channel (this does NOT load it
    # into memory but you can use the resulting object, more or
    # less, like a numpy-array).
    ch1 = s['ADC1']
    ch2 = s['ADC2']

    # This also works for multiple channels. Note, however, that
    # you still slice it as if it was 1d, i.e. if you slice the
    # first 10 elements of the object, you will get the first 10
    # for BOTH channels.
    chs = s[['ADC1', 'ADC2']]
    chs[:10] # equivalent to np.array([ch1[:10], ch2[:10]])

    # Get ADC data for one channel and slice it (two equivalent ways)
    s['ADC1', 10:20]
    s['ADC1'][10:20]

    # Get voltage data for one channel, slice it, and return the
    # voltage values instead of the ADC values. The cleaner way
    # to do this would be to use the first syntax above.
    s['ADC1', 10:20, 'as_voltage']
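The slicing rules above (lazy single-channel access, one slice applied to all selected channels, and the two equivalent ways of slicing) can be mimicked with a toy class. This is a sketch under assumptions: ToyStream, ChannelView and MultiChannelView are hypothetical names, a dictionary of lists stands in for the stream file, and the 'as_voltage' form is not covered.

```python
from typing import Dict, List


class ChannelView:
    """Hypothetical lazy view: data is fetched only when sliced."""

    def __init__(self, storage: Dict[str, List[int]], key: str):
        self._storage = storage
        self._key = key

    def __len__(self) -> int:
        return len(self._storage[self._key])

    def __getitem__(self, where: slice) -> List[int]:
        return self._storage[self._key][where]


class MultiChannelView:
    """Hypothetical view over several channels, sliced as if it were 1d."""

    def __init__(self, views: List[ChannelView]):
        self._views = views

    def __getitem__(self, where: slice) -> List[List[int]]:
        # The same slice is applied to every selected channel.
        return [view[where] for view in self._views]


class ToyStream:
    """Hypothetical stand-in that dispatches on the type of the key."""

    def __init__(self, storage: Dict[str, List[int]]):
        self._storage = storage

    def __getitem__(self, key):
        if isinstance(key, tuple):   # s['ADC1', 10:20]
            name, where = key
            return ChannelView(self._storage, name)[where]
        if isinstance(key, list):    # s[['ADC1', 'ADC2']]
            return MultiChannelView([ChannelView(self._storage, k) for k in key])
        return ChannelView(self._storage, key)   # s['ADC1']


s = ToyStream({"ADC1": list(range(100)), "ADC2": list(range(100, 200))})

ch1 = s["ADC1"]
ch2 = s["ADC2"]
chs = s[["ADC1", "ADC2"]]
first_ten = chs[:10]          # first 10 samples of BOTH channels

sliced_a = s["ADC1", 10:20]   # slice in the same brackets
sliced_b = s["ADC1"][10:20]   # select the channel, then slice
```

Here first_ten equals [ch1[:10], ch2[:10]], and both slicing forms return the same ten samples.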
- get_trace(key: str, where: slice, voltage: bool = True)[source]
Get the ADC trace for a given channel ‘key’ and slice ‘where’. If voltage==True, the ADC value is converted to a voltage (V) first.
- Returns:
ADC or voltage trace.
- Return type:
np.ndarray
- property keys
Available keys (channel names) in the stream.
- Returns:
List of keys.
- Return type:
list
- property start_us
The microsecond timestamp at which the stream starts.
- Returns:
Microsecond timestamp
- Return type:
int
- property dt_us
The length of a sample in the stream in microseconds.
- Returns:
Microsecond time-delta
- Return type:
int
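Together, start_us and dt_us locate every sample in time. A minimal sketch, assuming that sample i is taken at start_us + i * dt_us (a natural reading, but not stated explicitly above); the helper names and the start timestamp are hypothetical:

```python
def index_to_timestamp_us(index: int, start_us: int, dt_us: int) -> int:
    """Microsecond timestamp of a sample, assuming evenly spaced samples."""
    return start_us + index * dt_us


def timestamp_to_index(timestamp_us: int, start_us: int, dt_us: int) -> int:
    """Index of the sample that contains the given timestamp."""
    return (timestamp_us - start_us) // dt_us


start_us = 1_700_000_000_000_000   # made-up stream start
dt_us = 10                         # made-up sample length

timestamp = index_to_timestamp_us(2**14, start_us, dt_us)
index = timestamp_to_index(timestamp, start_us, dt_us)
```

With these numbers, sample 16384 lies 163840 microseconds after the start of the stream, and converting back recovers the index.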
- property tp_keys
Available testpulse keys in self.tpas and self.tp_timestamps.
- Returns:
List of keys.
- Return type:
list
- property tpas
Dictionary of testpulse amplitudes in the stream. For hardware ‘csmpl’ this is read from a ‘.test_stamps’ file. For hardware ‘vdaq2’ this is obtained from triggering the DAC channels first.
- Returns:
Testpulse amplitudes
- Return type:
dict of np.ndarray
- property tp_timestamps
Dictionary of testpulse timestamps (microseconds) in the stream. For hardware ‘csmpl’ this is read from a ‘.test_stamps’ file. For hardware ‘vdaq2’ this is obtained from triggering the DAC channels first.
- Returns:
Testpulse microsecond timestamps.
- Return type:
dict of np.ndarray
- property calp_keys
Available calpulse keys in self.calpas and self.calp_timestamps.
- Returns:
List of keys.
- Return type:
list
- property calpas
Dictionary of calpulse amplitudes in the stream. For hardware ‘vdaq2’ and ‘vdaq3’ this is obtained from triggering the ADC channels first.
- Returns:
Calpulse amplitudes
- Return type:
dict of np.ndarray
- property calp_timestamps
Dictionary of calpulse timestamps (microseconds) in the stream. For hardware ‘vdaq2’ and ‘vdaq3’ this is obtained from triggering the ADC channels first.
- Returns:
Calpulse microsecond timestamps.
- Return type:
dict of np.ndarray
- class cait.versatile.RDTFile(path: str, path_par: str = None)[source]
Class for interfacing hardware triggered files (file extension .rdt). This class automatically infers the available channels and the available correlated channels. Those can be retrieved by indexing the RDTFile object with channel indices/names or tuples thereof. The result of the indexing is an RDTChannel object which provides testpulse amplitudes, timestamps, and event iterators for the selected channel(s) (see documentation for RDTChannel).
- Parameters:
path (str) – The full path (including the file extension .rdt) to the file of interest.
path_par (str, optional) – The full path (including the file extension .par) to the file which contains the necessary parameters to read the .rdt file. If None is given, it is assumed that a .par file with identical name/path as path is available. Defaults to None.
- Returns:
Object interfacing an .rdt file.
- Return type:
RDTFile
Example:
    import cait.versatile as vai
    f = vai.RDTFile('path/to/file.rdt')
    # Check available channels
    print(f.keys)
    # Choose channel(s) to iterate over, get testpulse amplitudes, ..., by slicing RDTFile
    channels = f[(0,1)]
    # if interested in only one channel: channel0 = f[0]
    it = channels.get_event_iterator()
    # You can now further slice this iterator (like any other iterator in cait.versatile):
    it_testpulses = it[:, channels.tpas > 0]
    it_events = it[:, channels.tpas == 0]
    it_noise = it[:, channels.tpas == -1]
    # Have a look (after removing the baseline):
    vai.Preview(it_testpulses.with_processing(vai.RemoveBaseline()))
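The example relies on a convention for the testpulse amplitudes: values greater than 0 mark testpulses, 0 marks events, and -1 marks noise traces. The following sketch spells out that selection with plain lists. The function name and the amplitude values are made up for illustration and are not part of cait.

```python
from typing import Dict, List


def split_by_tpa(tpas: List[float]) -> Dict[str, List[int]]:
    """Group event indices by the testpulse-amplitude convention."""
    groups: Dict[str, List[int]] = {"testpulses": [], "events": [], "noise": []}
    for index, tpa in enumerate(tpas):
        if tpa > 0:
            groups["testpulses"].append(index)
        elif tpa == 0:
            groups["events"].append(index)
        elif tpa == -1:
            groups["noise"].append(index)
    return groups


tpas = [0.0, 1.5, -1.0, 0.0, 3.0, -1.0]   # made-up amplitudes
groups = split_by_tpa(tpas)
```

The boolean masks in the example (channels.tpas > 0 and so on) perform the same selection in one step.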
- property record_length
The record length (number of samples per event) of the events in the corresponding *.rdt file.
- property dt_us
The time base in microseconds (time between two samples) of the events in the corresponding *.rdt file.
- property sample_frequency
The sample frequency in Hz of the events in the corresponding *.rdt file.
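The record length, time base and sample frequency are related by simple arithmetic. A short sketch with hypothetical helper names, using a time base of 10 microseconds and a record length of 16384 samples purely as illustrative values:

```python
def sample_frequency_hz(dt_us: float) -> float:
    """Sample frequency in Hz for a time base given in microseconds."""
    return 1e6 / dt_us


def record_window_ms(record_length: int, dt_us: float) -> float:
    """Duration of one record window in milliseconds."""
    return record_length * dt_us / 1e3


frequency = sample_frequency_hz(10)        # 100 kHz
window = record_window_ms(16384, 10)       # 163.84 ms
```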
- property measuring_time_h
The total measuring time in hours of the corresponding *.rdt file.
- property keys
The channel keys that can be used to index this RDTFile instance. If available, the channel names (corresponding to the indices) are shown as well.
- get_trace(inds: int | list, voltage: bool = True)[source]
Return the ADC traces of events in this RDTFile for given indices. If voltage==True, the ADC value is converted to a voltage (V) first.
- Parameters:
inds (Union[int, list]) – The indices for which to return the voltage traces.
voltage (bool, optional) – If True, voltage values are returned instead of ADC values.
- Returns:
Array of as many ADC/voltage traces as given inds.
- Return type:
numpy.array
- class cait.versatile.MockData(n_events: int = 100, record_length: int = 16384, dt_us: int = 10)[source]
Class to generate quick mock pulse traces (2 channels).
- Parameters:
n_events (int, optional) – Number of events to simulate. Defaults to 100.
record_length (int, optional) – Record length of the pulse traces to simulate. Defaults to 16384.
dt_us (int, optional) – Microsecond time base of the pulse traces to simulate. Defaults to 10.
- Returns:
Object providing mock data.
- Return type:
MockData
- get_event_iterator(batch_size: int = None)[source]
Return an event iterator over the events in this mock data instance.
- Parameters:
batch_size (int, optional) – The number of events to be returned at once (these are all read together). There is a trade-off: a larger batch_size gives faster reads but increases memory usage.
- Returns:
Event iterator
- Return type:
MockIterator
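The batch_size trade-off can be sketched with a generic batching generator. This is a hypothetical helper operating on a plain list, not the MockIterator implementation: each batch is materialised at once, so larger batches mean fewer read steps but more data held in memory at a time.

```python
from typing import Iterator, List, Sequence


def iterate_in_batches(events: Sequence, batch_size: int) -> Iterator[List]:
    """Yield consecutive chunks of at most `batch_size` events."""
    for start in range(0, len(events), batch_size):
        # Everything in this slice is held in memory together.
        yield list(events[start:start + batch_size])


events = list(range(10))                       # stand-in for 10 events
batches = list(iterate_in_batches(events, 4))
```

With ten events and a batch size of 4, the last batch is shorter than the others.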
- get_event(inds: int, channel: slice = None)[source]
Return a single event for a given index.
- Parameters:
inds (int) – The index of the event that we want to read from the mock data.
channel (int) – The channel of the event that we want to read from the mock data. If None, then all channels are returned.
- Returns:
Event
- Return type:
np.ndarray
- property dt_us
The length of a sample in the data in microseconds.
- Returns:
Microsecond time-delta
- Return type:
int
- property start_us
The microsecond timestamp of the start of the recording for this datasource object.