Data Sources

Data sources are objects that make interfacing with the various sources of data as easy as possible. Examples are RDTFile for hardware-triggered data files and Stream for data from a continuous stream file (where it is irrelevant which hardware was used to record the data). The object that should be most familiar to cait users, the DataHandler, is also a data source. Data sources do not have to be files: they can also provide simulated data on the fly, like MockData, which returns mock voltage traces for quickly testing functions.

The crucial thing is that the technical details on how to access the data behind a data source are hidden from the user. The objects of most interest are voltage traces (“events”) and data sources make it easy to access those voltage traces, no matter how they are stored. A data source provides event iterators.
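
The idea of hiding the storage details behind a common interface can be illustrated with a minimal sketch. All class and function names below (EventSource, InMemorySource, SimulatedSource, pulse_heights) are hypothetical and are not part of cait; the sketch only assumes that a data source exposes a get_event_iterator() method.

```python
from typing import Iterator, List, Protocol


class EventSource(Protocol):
    """Hypothetical interface: anything that can hand out event iterators."""

    def get_event_iterator(self) -> Iterator[List[float]]: ...


class InMemorySource:
    """Hypothetical source that keeps its traces in a Python list."""

    def __init__(self, traces: List[List[float]]):
        self._traces = traces

    def get_event_iterator(self) -> Iterator[List[float]]:
        return iter(self._traces)


class SimulatedSource:
    """Hypothetical source that generates triangular pulses on the fly."""

    def __init__(self, n_events: int, record_length: int):
        self._n_events = n_events
        self._record_length = record_length

    def get_event_iterator(self) -> Iterator[List[float]]:
        half = self._record_length // 2
        for k in range(1, self._n_events + 1):
            # Pulse of height k that peaks at the middle sample.
            yield [k * (1 - abs(i - half) / half) for i in range(self._record_length)]


def pulse_heights(source: EventSource) -> List[float]:
    """Works for any source, regardless of how its events are stored."""
    return [max(trace) for trace in source.get_event_iterator()]


stored = InMemorySource([[0.0, 1.0, 0.5], [0.0, 3.0, 1.0]])
simulated = SimulatedSource(n_events=2, record_length=8)
heights_stored = pulse_heights(stored)
heights_simulated = pulse_heights(simulated)
```

The analysis function never needs to know whether the traces come from memory or are generated on demand, which is the property the data sources described here are meant to provide.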

Top level classes

class cait.versatile.Stream(hardware: str, src: str | List[str], *args, **kwargs)

Factory class for providing a common access point to stream data. Currently, only vdaq3, vdaq2 and csmpl stream files are supported, but support for further formats can be added straightforwardly by subclassing cait.versatile.datasources.stream.streambase.StreamBaseClass and adding the subclass to the selection in the constructor of Stream.
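
The factory-plus-subclass pattern described above could look roughly like the following sketch. It is not cait's implementation: StreamBase, the two subclasses, the IMPLEMENTATIONS table and make_stream are hypothetical names chosen for illustration, and the hardware names "single" and "per_channel" are made up.

```python
from typing import Dict, List, Type, Union


class StreamBase:
    """Hypothetical stand-in for a stream base class."""

    def __init__(self, src: Union[str, List[str]]):
        self.src = src

    @property
    def keys(self) -> List[str]:
        raise NotImplementedError


class SingleFileStream(StreamBase):
    """Hypothetical format: one file holding all channels."""

    @property
    def keys(self) -> List[str]:
        return ["ADC1", "ADC2"]


class OneFilePerChannelStream(StreamBase):
    """Hypothetical format: one file per channel."""

    @property
    def keys(self) -> List[str]:
        return [f"ch{i}" for i, _ in enumerate(self.src)]


# Supporting a new format means writing a subclass and adding one entry here.
IMPLEMENTATIONS: Dict[str, Type[StreamBase]] = {
    "single": SingleFileStream,
    "per_channel": OneFilePerChannelStream,
}


def make_stream(hardware: str, src: Union[str, List[str]]) -> StreamBase:
    """Select the implementation by name, rejecting unknown names."""
    if hardware not in IMPLEMENTATIONS:
        raise ValueError(
            f"Unknown hardware {hardware!r}; valid options: {sorted(IMPLEMENTATIONS)}"
        )
    return IMPLEMENTATIONS[hardware](src)


s = make_stream("per_channel", ["a.bin", "b.bin"])
```

User code only ever calls the factory, so adding a format does not change how streams are created or accessed.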

The data is accessed by means of slicing (see below). The time property is a StreamTime object and offers a convenient time interface as well (see below).

Important Note: If you plan to access the stream data repeatedly, you can ensure that the stream file stays open (increases speed) by using it as a context manager:

import cait.versatile as vai

stream = vai.Stream(hardware='vdaq2', src='file.bin')
with stream:
    trigger_inds, amplitudes = vai.trigger_zscore(stream["ADC1"], 2**14)
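
Why a context manager can speed up repeated access is sketched below with a hypothetical reader class (LazyFile is not part of cait, and cait's actual file handling may differ). Outside a with-block the sketch opens the file for every access; inside a with-block it opens the file once and reuses the handle.

```python
import os
import tempfile


class LazyFile:
    """Hypothetical reader: opens the file per access unless used in a with-block."""

    def __init__(self, path: str):
        self.path = path
        self._handle = None
        self.n_opens = 0  # counts how often the file was opened

    def __enter__(self):
        self._handle = open(self.path, "rb")
        self.n_opens += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._handle.close()
        self._handle = None
        return False  # do not swallow exceptions

    def read_byte(self, index: int) -> int:
        if self._handle is not None:
            # Inside a with-block: reuse the open handle.
            self._handle.seek(index)
            return self._handle.read(1)[0]
        # Outside a with-block: open and close for this single access.
        with open(self.path, "rb") as f:
            self.n_opens += 1
            f.seek(index)
            return f.read(1)[0]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")
    with open(path, "wb") as f:
        f.write(bytes(range(10)))

    reader = LazyFile(path)
    values_outside = [reader.read_byte(i) for i in range(5)]
    opens_outside = reader.n_opens  # one open per access

    reader.n_opens = 0
    with reader:
        values_inside = [reader.read_byte(i) for i in range(5)]
    opens_inside = reader.n_opens  # a single open for all accesses
```

Both access patterns return the same values; only the number of file opens differs.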

Parameters:
  • hardware (str) – The hardware which was used to record the stream file. Valid options are [‘csmpl’, ‘vdaq2’, ‘vdaq3’]

  • src (Union[str, List[str]]) – The source for the stream. Depending on how the data is taken, this can either be the path to one file or a list of paths to multiple files. This input is handled by the specific implementation of the Stream Object. See below for examples.

  • args, kwargs – Additional arguments for the chosen hardware (see respective documentation).

Usage for different hardware:

CSMPL: Files are .csmpl files which contain one channel each. Additionally, a .par file is needed, from which the start timestamp of the stream data is read.

s = Stream(hardware='csmpl', src=['par_file.par', 'stream_Ch0.csmpl', 'stream_Ch1.csmpl'])

See also: cait.versatile.datasources.stream.impl_csmpl.Stream_CSMPL

VDAQ2: Files are .bin files which contain all information necessary to construct the Stream object. The file path can be passed as a single string. Testpulse channels in this file format need to be (automatically) triggered to obtain testpulse amplitudes and timestamps.

s = Stream(hardware='vdaq2', src='file.bin')

See also: cait.versatile.datasources.stream.impl_vdaq2.Stream_VDAQ2

VDAQ3: Files are .bin files which contain one channel each. There are two versions of the file format: one in which the testpulse timestamps are already saved inside the .bin file (preferred format), and one for which you have to load the testpulse channel as an additional stream channel and (automatically) trigger it to get the timestamps/tpas (as for the VDAQ2 format).

s = Stream(hardware='vdaq3', src=['file_ch0.bin', 'file_ch1.bin'])

See also: cait.versatile.datasources.stream.impl_vdaq3.Stream_VDAQ3

Usage slicing:

Valid options for slicing streams are the following:

# Get voltage data for one channel (this does NOT load it
# into memory but you can use the resulting object, more or
# less, like a numpy-array).
ch1 = s['ADC1']
ch2 = s['ADC2']

# This also works for multiple channels. Note, however, that
# you still slice it as if it was 1d, i.e. if you slice the
# first 10 elements of the object, you will get the first 10
# for BOTH channels.
chs = s[['ADC1', 'ADC2']]
chs[:10] # equivalent to np.array([ch1[:10], ch2[:10]])

# Get ADC data for one channel and slice it (two equivalent ways)
s['ADC1', 10:20]
s['ADC1'][10:20]

# Get voltage data for one channel, slice it, and return the
# voltage values instead of the ADC values. The cleaner way
# to do this would be to use the first syntax above.
s['ADC1', 10:20, 'as_voltage']
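
How such indexing expressions reach an object can be sketched with a small in-memory stand-in. TinyStream and its volts_per_adc factor are hypothetical and only mimic the syntax above; unlike the real stream object, this sketch returns plain lists and loads everything eagerly.

```python
from typing import Dict, List


class TinyStream:
    """Hypothetical in-memory stand-in that mimics the indexing shown above."""

    def __init__(self, data: Dict[str, List[int]], volts_per_adc: float):
        self._data = data
        self._volts_per_adc = volts_per_adc  # made-up conversion factor

    def __getitem__(self, key):
        # s['ADC1', 10:20] arrives as the tuple ('ADC1', slice(10, 20)).
        if not isinstance(key, tuple):
            key = (key,)
        channels = key[0]
        where = key[1] if len(key) > 1 else slice(None)
        as_voltage = len(key) > 2 and key[2] == "as_voltage"

        def one(name: str) -> list:
            trace = self._data[name][where]
            return [v * self._volts_per_adc for v in trace] if as_voltage else trace

        if isinstance(channels, list):
            # Several channels: the same sample slice is applied to each.
            return [one(name) for name in channels]
        return one(channels)


s = TinyStream(
    {"ADC1": list(range(100)), "ADC2": list(range(100, 200))},
    volts_per_adc=0.5,
)
single = s["ADC1", 10:13]
both = s[["ADC1", "ADC2"], 10:13]
volts = s["ADC1", 10:13, "as_voltage"]
```

The multi-channel case shows the behaviour described in the comments above: one sample slice selects the same samples from every requested channel.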

get_trace(key: str, where: slice, voltage: bool = True)

Get the ADC trace for a given channel ‘key’ and slice ‘where’. If voltage==True, the ADC value is converted to a voltage (V) first.

Returns:

ADC or voltage trace.

Return type:

np.ndarray
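
The conversion from ADC values to volts depends on the hardware and is not specified here. As a rough illustration only, the sketch below assumes a linear mapping of an unsigned ADC with a given bit depth onto a symmetric voltage range; the function name adc_to_volt and the defaults (16 bits, -10 V to +10 V) are assumptions, not values taken from cait.

```python
def adc_to_volt(
    adc_value: int,
    bits: int = 16,        # assumed ADC resolution
    v_min: float = -10.0,  # assumed lower end of the input range (V)
    v_max: float = 10.0,   # assumed upper end of the input range (V)
) -> float:
    """Map an unsigned ADC count linearly onto the range [v_min, v_max)."""
    return v_min + adc_value * (v_max - v_min) / 2**bits


lowest = adc_to_volt(0)
midpoint = adc_to_volt(2**15)
```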

property keys

Available keys (channel names) in the stream.

Returns:

List of keys.

Return type:

list

property start_us

The microsecond timestamp at which the stream starts.

Returns:

Microsecond timestamp

Return type:

int

property dt_us

The length of a sample in the stream in microseconds.

Returns:

Microsecond time-delta

Return type:

int
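
Together, start_us and dt_us are enough to convert between sample indices and timestamps. The sketch below assumes that sample i lies at start_us + i * dt_us and that start_us counts microseconds since the Unix epoch; both are assumptions for illustration, and the numbers and helper names are made up.

```python
from datetime import datetime, timezone

start_us = 1_700_000_000_000_000  # made-up stream start (microseconds since the Unix epoch)
dt_us = 10                        # made-up sample length, i.e. 100 kHz sampling


def index_to_timestamp_us(index: int) -> int:
    """Timestamp of a sample, assuming evenly spaced samples."""
    return start_us + index * dt_us


def timestamp_us_to_index(timestamp_us: int) -> int:
    """Index of the sample that contains the given timestamp."""
    return (timestamp_us - start_us) // dt_us


ts = index_to_timestamp_us(250_000)  # 2.5 s into the stream
as_datetime = datetime.fromtimestamp(ts / 1e6, tz=timezone.utc)
```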

property tp_keys

Available testpulse keys in self.tpas and self.tp_timestamps.

Returns:

List of keys.

Return type:

list

property tpas

Dictionary of testpulse amplitudes in the stream. For hardware ‘csmpl’ this is read from a ‘.test_stamps’ file. For hardware ‘vdaq2’ this is obtained from triggering the DAC channels first.

Returns:

Testpulse amplitudes

Return type:

dict of np.ndarray

property tp_timestamps

Dictionary of testpulse timestamps (microseconds) in the stream. For hardware ‘csmpl’ this is read from a ‘.test_stamps’ file. For hardware ‘vdaq2’ this is obtained from triggering the DAC channels first.

Returns:

Testpulse microsecond timestamps.

Return type:

dict of np.ndarray

property calp_keys

Available calpulse keys in self.calpas and self.calp_timestamps.

Returns:

List of keys.

Return type:

list

property calpas

Dictionary of calpulse amplitudes in the stream. For hardware ‘vdaq2’ and ‘vdaq3’ this is obtained from triggering the ADC channels first.

Returns:

Calpulse amplitudes

Return type:

dict of np.ndarray

property calp_timestamps

Dictionary of calpulse timestamps (microseconds) in the stream. For hardware ‘vdaq2’ and ‘vdaq3’ this is obtained from triggering the ADC channels first.

Returns:

Calpulse microsecond timestamps.

Return type:

dict of np.ndarray

class cait.versatile.RDTFile(path: str, path_par: str = None)

Class for interfacing hardware triggered files (file extension .rdt). This class automatically infers the available channels and the available correlated channels. Those can be retrieved by indexing the RDTFile object with channel indices/names or tuples thereof. The result of the indexing is an RDTChannel object, which provides testpulse amplitudes, timestamps, and event iterators for the selected channel(s) (see documentation for RDTChannel).

Parameters:
  • path (str) – The full path (including the file extension .rdt) to the file of interest.

  • path_par (str, optional) – The full path (including the file extension .par) to the file which contains the necessary parameters to read the .rdt file. If None is given, it is assumed that a .par file with identical name/path as path is available. Defaults to None.

Returns:

Object interfacing an .rdt file.

Return type:

RDTFile

Example:

import cait.versatile as vai

f = vai.RDTFile('path/to/file.rdt')

# Check available channels
print(f.keys)
# Choose channel(s) to iterate over, get testpulse amplitudes, ...,  by slicing RDTFile
channels = f[(0,1)] # if interested in only one channel: channel0 = f[0]
it = channels.get_event_iterator()

# You can now further slice this iterator (like any other iterator in cait.versatile):
it_testpulses = it[:, channels.tpas > 0]
it_events = it[:, channels.tpas == 0]
it_noise = it[:, channels.tpas == -1]

# Have a look (after removing the baseline):
vai.Preview(it_testpulses.with_processing(vai.RemoveBaseline()))
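
The three selections in the example above rely on the sign convention of the testpulse amplitudes (positive for testpulses, 0 for events, -1 for noise). The following sketch reproduces that classification on made-up values with plain Python lists; the helper select is hypothetical and no cait objects are involved.

```python
# Made-up testpulse amplitudes, one entry per recorded event.
tpas = [0.0, 1.5, -1.0, 0.0, 3.0, -1.0]

# Boolean flags following the convention used in the example above.
is_testpulse = [a > 0 for a in tpas]
is_event = [a == 0 for a in tpas]
is_noise = [a == -1 for a in tpas]


def select(flags):
    """Return the indices at which the flag is True."""
    return [i for i, flag in enumerate(flags) if flag]


testpulse_inds = select(is_testpulse)
event_inds = select(is_event)
noise_inds = select(is_noise)
```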

property record_length

The record length (number of samples per event) of the events in the corresponding *.rdt file.

property dt_us

The time base in microseconds (time between two samples) of the events in the corresponding *.rdt file.

property sample_frequency

The sample frequency in Hz of the events in the corresponding *.rdt file.
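
The relation between these three properties is simple arithmetic, shown here with made-up values rather than values read from a file.

```python
record_length = 16384  # made-up number of samples per event
dt_us = 10             # made-up time base in microseconds

# A time base of dt_us microseconds corresponds to 1e6 / dt_us samples per second.
sample_frequency = 1_000_000 / dt_us  # Hz

# One event spans record_length samples of dt_us microseconds each.
record_window_ms = record_length * dt_us / 1000  # ms
```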

property measuring_time_h

The total measuring time in hours of the corresponding *.rdt file.

property keys

The channel keys that can be used to index this RDTFile instance. If available, the channel names (corresponding to the indices) are shown as well.

get_trace(inds: int | list, voltage: bool = True)

Return the ADC traces of events in this RDTFile for given indices. If voltage==True, the ADC value is converted to a voltage (V) first.

Parameters:
  • inds (Union[int, list]) – The indices for which to return the voltage traces.

  • voltage (bool, optional) – If True, voltage values are returned instead of ADC values.

Returns:

Array of as many ADC/voltage traces as given inds.

Return type:

numpy.array

class cait.versatile.MockData(n_events: int = 100, record_length: int = 16384, dt_us: int = 10)

Class to generate quick mock pulse traces (2 channels).

Parameters:
  • n_events (int, optional) – Number of events to simulate. Defaults to 100.

  • record_length (int, optional) – Record length of the pulse traces to simulate. Defaults to 16384.

  • dt_us (int, optional) – Microsecond time base of the pulse traces to simulate. Defaults to 10.

Returns:

Object providing mock data.

Return type:

MockData

get_event_iterator(batch_size: int = None)

Return an event iterator over the events in this mock data instance.

Parameters:

batch_size (int) – The number of events to be returned at once (these are all read together). There is a trade-off: larger batch sizes give faster read speed but increase memory usage.

Returns:

Event iterator

Return type:

MockIterator
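
The effect of batch_size can be pictured with a generic batching generator. This is a sketch of the general idea, not of MockIterator itself; iterate_in_batches is a hypothetical helper that yields event indices rather than traces.

```python
from typing import Iterator, List


def iterate_in_batches(n_events: int, batch_size: int) -> Iterator[List[int]]:
    """Yield event indices in consecutive batches of at most batch_size."""
    for start in range(0, n_events, batch_size):
        yield list(range(start, min(start + batch_size, n_events)))


batches = list(iterate_in_batches(n_events=10, batch_size=4))
```

Each batch is held in memory as a whole, so a larger batch_size means fewer read operations but more data in memory at once. The last batch may be smaller than the others.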

get_event(inds: int, channel: slice = None)

Return a single event for a given index.

Parameters:
  • inds (int) – The index of the event that we want to read from the mock data.

  • channel (int) – The channel of the event that we want to read from the mock data. If None, then all channels are returned.

Returns:

Event

Return type:

np.ndarray

property dt_us

The length of a sample in the data in microseconds.

Returns:

Microsecond time-delta

Return type:

int

property start_us

The microsecond timestamp of the start of the recording for this datasource object.