Data Sources

Data sources are objects that make interfacing with the various sources of data as easy as possible. Examples are RDTFile for interfacing hardware-triggered data files and Stream for accessing data from a continuous stream file (where it is irrelevant which hardware was used to record the data). The object most familiar to cait users, the DataHandler, is a data source as well. Data sources do not have to be files: they can also provide simulated data on the fly, like MockData, which returns mock voltage traces for quickly testing functions.

The crucial thing is that the technical details on how to access the data behind a data source are hidden from the user. The objects of most interest are voltage traces (“events”) and data sources make it easy to access those voltage traces, no matter how they are stored. A data source provides event iterators.
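As a minimal sketch of this idea (not cait's actual implementation), the toy class below hides how traces are stored behind a single method that yields events one by one. The names `ListSource` and `iter_events` are hypothetical and chosen for illustration only.

```python
from typing import Iterator, List


class ListSource:
    """Toy data source: events live in memory, but callers never need to know."""

    def __init__(self, traces: List[List[float]]):
        self._traces = traces  # storage detail hidden from the user

    def __len__(self) -> int:
        return len(self._traces)

    def iter_events(self) -> Iterator[List[float]]:
        # Yield one voltage trace ("event") at a time
        for trace in self._traces:
            yield trace


source = ListSource([[0.0, 0.1, 0.3], [0.0, 0.2, 0.1]])
maxima = [max(event) for event in source.iter_events()]
```

Code that consumes `iter_events` would keep working if the traces were read from a file or simulated instead, which is the point of the abstraction.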

Top level classes

class cait.versatile.RDTFile(path: str, path_par: Optional[str] = None)[source]

Class for interfacing hardware-triggered files (file extension .rdt). This class automatically infers the available channels and the available correlated channels. Those can be retrieved by indexing the RDTFile object with channel indices/names or tuples thereof. The result of the indexing is an RDTChannel object which provides testpulse amplitudes, timestamps, and event iterators for the selected channel(s) (see documentation for RDTChannel).

Parameters
  • path (str) – The full path (including the file extension .rdt) to the file of interest.

  • path_par (str, optional) – The full path (including the file extension .par) to the file which contains the necessary parameters to read the .rdt file. If None is given, it is assumed that a .par file with identical name/path as path is available. Defaults to None.
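The fallback for path_par can be pictured as follows. This is a sketch under the assumption that "identical name/path" means the same path with the extension swapped; `resolve_par_path` is a hypothetical helper and not part of cait.

```python
from pathlib import Path
from typing import Optional


def resolve_par_path(path: str, path_par: Optional[str] = None) -> str:
    """Hypothetical helper: fall back to '<same name>.par' next to the .rdt file."""
    if path_par is not None:
        return path_par
    # Assumption: only the file extension differs between the two files
    return str(Path(path).with_suffix(".par"))


default_par = resolve_par_path("path/to/file.rdt")
explicit_par = resolve_par_path("path/to/file.rdt", "other/params.par")
```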

Returns

Object interfacing an .rdt file.

Return type

RDTFile

Example:

import cait.versatile as vai

f = vai.RDTFile('path/to/file.rdt')

# Check available channels
print(f.keys)
# Choose channel(s) to iterate over, get testpulse amplitudes, ...,  by slicing RDTFile
channels = f[(0,1)] # if interested in only one channel: channel0 = f[0]
it = channels.get_event_iterator()

# You can now further slice this iterator (like any other iterator in cait.versatile):
it_testpulses = it[:, channels.tpas > 0]
it_events = it[:, channels.tpas == 0]
it_noise = it[:, channels.tpas == -1]

# Have a look (after removing the baseline):
vai.Preview(it_testpulses.with_processing(vai.RemoveBaseline()))

property record_length

The record length (number of samples per event) of the events in the corresponding *.rdt file.

property dt_us

The time base in microseconds (time between two samples) of the events in the corresponding *.rdt file.

property sample_frequency

The sample frequency in Hz of the events in the corresponding *.rdt file.
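The time base and the sample frequency are two views of the same quantity. The short calculation below uses made-up example values (they happen to match the MockData defaults) to show the conversion and the resulting duration of one record window.

```python
dt_us = 10             # time base in microseconds (example value)
record_length = 16384  # samples per event (example value)

sample_frequency = 1e6 / dt_us           # samples per second (Hz)
window_ms = record_length * dt_us / 1e3  # duration of one record window in ms
```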

property measuring_time_h

The total measuring time in hours of the corresponding *.rdt file.

property keys

The channel keys that can be used to index this RDTFile instance. If available, the channel names (corresponding to the indices) are shown as well.

get_voltage_trace(inds: Union[int, list])[source]

Return the voltage traces of events in this RDTFile for given indices.

Parameters

inds (Union[int, list]) – The indices for which to return the voltage traces.

Returns

Array of as many voltage traces as given inds.

Return type

numpy.array

class cait.versatile.Stream(hardware: str, src: Union[str, List[str]])[source]

Factory class for providing a common access point to stream data. Currently, only vdaq2 and cresst stream files are supported, but an extension can be implemented straightforwardly by sub-classing StreamBaseClass and adding it for selection in the constructor of Stream.

The data is accessed by means of slicing (see below). The time property is an object of StreamTime and offers a convenient time interface as well (see below).
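The factory idea can be sketched in a few lines. Everything below is a toy stand-in: `ToyStreamBase`, `ToyHardwareA`, `make_stream`, and the registry are hypothetical names, and cait's own constructor may select implementations differently.

```python
from abc import ABC, abstractmethod
from typing import List, Union


class ToyStreamBase(ABC):
    """Stand-in for a stream base class; subclasses supply hardware details."""

    @property
    @abstractmethod
    def keys(self) -> List[str]:
        ...


class ToyHardwareA(ToyStreamBase):
    def __init__(self, src: Union[str, List[str]]):
        self._src = src

    @property
    def keys(self) -> List[str]:
        return ["ADC1"]  # placeholder channel name


# Registry consulted by the factory; new hardware means adding one entry
_IMPLEMENTATIONS = {"hardware_a": ToyHardwareA}


def make_stream(hardware: str, src: Union[str, List[str]]) -> ToyStreamBase:
    try:
        cls = _IMPLEMENTATIONS[hardware]
    except KeyError:
        raise ValueError(f"Unsupported hardware: {hardware!r}") from None
    return cls(src)


stream = make_stream("hardware_a", "file.bin")
```

Callers only ever see the base-class interface, so adding a new hardware type does not change code that uses the stream.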

Parameters
  • hardware (str) – The hardware which was used to record the stream file. Valid options are ‘cresst’ and ‘vdaq2’.

  • src (Union[str, List[str]]) – The source for the stream. Depending on how the data was taken, this can either be the path to one file or a list of paths to multiple files. This input is handled by the specific implementation of the Stream object. See below for examples.

Usage for different hardware:

CRESST: Files are .csmpl files which contain one channel each. Additionally, we need a .par file to read the start timestamp of the stream data from.

s = Stream(hardware='cresst', src=['par_file.par', 'stream_Ch0.csmpl', 'stream_Ch1.csmpl'])

VDAQ2: Files are .bin files which contain all information necessary to construct the Stream object. The file path can be passed as a single string.

s = Stream(hardware='vdaq2', src='file.bin')

Usage slicing:

Valid options for slicing streams are the following:

# Get data for one channel
s['ADC1']

# Get data for one channel and slice it (two equivalent ways)
s['ADC1', 10:20]
s['ADC1'][10:20]

# Get data for one channel, slice it, and return the voltage
# values instead of the ADC values
s['ADC1', 10:20, 'as_voltage']
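To make the indexing forms above concrete, here is a toy class that accepts the same three patterns. It is a sketch only: `ToyStream` and its linear `volts_per_adc` conversion factor are assumptions for illustration and say nothing about how cait converts ADC values.

```python
from typing import List, Union


class ToyStream:
    """Toy stream supporting s[key], s[key, slice] and s[key, slice, 'as_voltage']."""

    def __init__(self, data: dict, volts_per_adc: float):
        self._data = data
        self._volts_per_adc = volts_per_adc  # hypothetical conversion factor

    def __getitem__(self, item: Union[str, tuple]) -> List[float]:
        if isinstance(item, str):
            return self._data[item]  # whole channel
        key, where = item[0], item[1]
        values = self._data[key][where]
        if len(item) == 3 and item[2] == "as_voltage":
            return [v * self._volts_per_adc for v in values]
        return values


s = ToyStream({"ADC1": list(range(100))}, volts_per_adc=0.5)
adc = s["ADC1", 10:13]
volts = s["ADC1", 10:13, "as_voltage"]
```

Python passes `s['ADC1', 10:13]` to `__getitem__` as the tuple `('ADC1', slice(10, 13))`, which is why a single method can serve all three forms.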

get_voltage_trace(key: str, where: slice)[source]

Get the voltage trace for a given channel ‘key’ and slice ‘where’.

Returns

Voltage trace.

Return type

np.ndarray

property keys

Available keys (channel names) in the stream.

Returns

List of keys.

Return type

list

property start_us

The microsecond timestamp at which the stream starts.

Returns

Microsecond timestamp

Return type

int

property dt_us

The length of a sample in the stream in microseconds.

Returns

Microsecond time-delta

Return type

int

property tpas

Dictionary of testpulse amplitudes in the stream. For hardware ‘cresst’ this is read from a ‘.test_stamps’ file. For hardware ‘vdaq2’ this is obtained from triggering the DAC channels first.

Returns

Testpulse amplitudes

Return type

dict of np.ndarray

property tp_timestamps

Dictionary of testpulse timestamps (microseconds) in the stream. For hardware ‘cresst’ this is read from a ‘.test_stamps’ file. For hardware ‘vdaq2’ this is obtained from triggering the DAC channels first.

Returns

Testpulse microsecond timestamps.

Return type

dict of np.ndarray

class cait.versatile.MockData(n_events: int = 100, record_length: int = 16384, dt_us: int = 10)[source]

Class to generate quick mock pulse traces (2 channels).

Parameters
  • n_events (int, optional) – Number of events to simulate. Defaults to 100.

  • record_length (int, optional) – Record length of the pulse traces to simulate. Defaults to 16384.

  • dt_us (int, optional) – Microsecond time base of the pulse traces to simulate. Defaults to 10.

Returns

Object providing mock data.

Return type

MockData

get_event_iterator(batch_size: Optional[int] = None)[source]

Return an event iterator over the events in this mock data instance.

Parameters

batch_size (int) – The number of events to be returned at once (these are all read together). There will be a trade-off: large batch_sizes cause faster read speed but increase the memory usage.

Returns

Event iterator

Return type

MockIterator
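The batch_size trade-off can be illustrated with a plain generator. This is a sketch, not cait's iterator: `iterate_batches` is a hypothetical function, and treating `batch_size=None` as "one event at a time" is an assumption.

```python
from typing import Iterator, List, Optional


def iterate_batches(events: List[list], batch_size: Optional[int] = None) -> Iterator:
    """Yield single events if batch_size is None, else lists of up to batch_size events."""
    if batch_size is None:
        yield from events
        return
    for start in range(0, len(events), batch_size):
        # Each batch is held in memory as a whole before it is handed out
        yield events[start:start + batch_size]


events = [[i, i + 1] for i in range(5)]
batches = list(iterate_batches(events, batch_size=2))
```

Fewer, larger batches mean fewer read operations, while each batch occupies more memory at once.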

get_event(inds: int, channel: Optional[slice] = None)[source]

Return a single event for a given index.

Parameters
  • inds (int) – The index of the event that we want to read from the mock data.

  • channel (int) – The channel of the event that we want to read from the mock data. If None, then all channels are returned.

Returns

Event

Return type

np.ndarray

property start_us

The microsecond timestamp of the start of the recording for this datasource object.

Related classes and base classes

class cait.versatile.datasources.hardwaretriggered.rdt_file.RDTChannel(rdt_file: cait.versatile.datasources.hardwaretriggered.rdt_file.RDTFile, key: Union[int, tuple])[source]

Object representing a coherent part of an RDTFile (i.e. either a single channel or correlated channels). Usually this is not created standalone but is the result of slicing an RDTFile.

Parameters
  • rdt_file (RDTFile) – An RDTFile instance.

  • key (Union[int, tuple]) – The key which selects the single channel or correlated channels. Must be one of rdt_file.keys.

Returns

Specified channels of an RDTFile

Return type

RDTChannel

get_event_iterator(batch_size: Optional[int] = None)[source]

Get an iterator over the events present in this RDTChannel instance.

Parameters

batch_size (int) – The number of events to be returned at once (these are all read together). There will be a trade-off: large batch_sizes cause faster read speed but increase the memory usage.

Returns

Iterable object

Return type

RDTIterator

Example:

import cait.versatile as vai

f = vai.RDTFile('path/to/file.rdt')

# Choose channel(s) to iterate over by slicing RDTFile
channels = f[(0,1)]
it = channels.get_event_iterator()

# You can now further slice this iterator (like any other iterator in cait.versatile):
it_testpulses = it[:, channels.tpas > 0]
it_events = it[:, channels.tpas == 0]
it_noise = it[:, channels.tpas == -1]
# Remove baselines:
it_testpulses.add_processing(vai.RemoveBaseline())

# Have a look:
vai.Preview(it_testpulses)

property key

The RDTFile key that this RDTChannel corresponds to.

property n_channels

The number of channels this RDTChannel corresponds to.

property start_us

The microsecond timestamp of the start of the recording for this datasource object.

property timestamps

The microsecond timestamps of the events in this RDTChannel.

property tpas

The testpulse amplitudes of the events in this RDTChannel.

property unique_tpas

The unique testpulse amplitudes of the events in this RDTChannel.
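The selections used in the example above rely on a convention for the testpulse amplitudes: values greater than 0 mark testpulses, 0 marks events, and -1 marks noise traces. The sketch below applies that convention to made-up amplitudes using plain lists; what exactly unique_tpas contains in cait may differ from the `unique_values` computed here.

```python
tpas = [0.0, 1.0, -1.0, 0.5, 0.0, 1.0]  # made-up testpulse amplitudes

# Convention taken from the example: >0 testpulse, ==0 event, ==-1 noise
testpulse_inds = [i for i, a in enumerate(tpas) if a > 0]
event_inds = [i for i, a in enumerate(tpas) if a == 0]
noise_inds = [i for i, a in enumerate(tpas) if a == -1]

unique_values = sorted(set(tpas))
```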

class cait.versatile.datasources.stream.streambase.StreamBaseClass[source]
abstract property dt_us

The length of a sample in the stream in microseconds.

Returns

Microsecond time-delta

Return type

int

get_event_iterator(keys: Union[str, List[str]], record_length: int, inds: Optional[Union[int, List[int]]] = None, timestamps: Optional[Union[int, List[int]]] = None, alignment: float = 0.25, batch_size: Optional[int] = None)[source]

Returns an iterator object over voltage traces for given trigger indices or timestamps of a stream file.

Parameters
  • keys (Union[str, List[str]]) – The keys (channel names) of the stream object to be iterated over.

  • record_length (int) – The number of samples to be returned for each index. Usually, this is a power of 2, e.g. 16384.

  • inds (Union[int, List[int]]) – The stream indices for which we want to read the voltage traces. The position of each index within the record window is set by alignment (1/4 of the record window by default). Either inds or timestamps has to be set.

  • timestamps (Union[int, List[int]]) – The stream timestamps for which we want to read the voltage traces. The position of each timestamp within the record window is set by alignment (1/4 of the record window by default). Either inds or timestamps has to be set.

  • alignment (float) – A number in the interval [0,1] which determines the alignment of the record window (of length record_length) relative to the specified index. E.g. if alignment=1/2, the record window is centered around the index. Defaults to 1/4.

  • batch_size (int) – The number of events to be returned at once (these are all read together). There will be a trade-off: large batch_sizes cause faster read speed but increase the memory usage.

Returns

Iterable object

Return type

StreamIterator
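How alignment places the record window around an index can be worked through numerically. The function below is a sketch: `record_window` is a hypothetical helper, and truncating `alignment * record_length` to an integer is an assumption about rounding.

```python
def record_window(index: int, record_length: int, alignment: float = 0.25) -> slice:
    """Return the sample slice of a record window placed around `index`."""
    if not 0 <= alignment <= 1:
        raise ValueError("alignment must lie in [0, 1]")
    # The index sits `alignment * record_length` samples after the window start
    start = index - int(alignment * record_length)
    return slice(start, start + record_length)


quarter = record_window(index=10000, record_length=16384)
centered = record_window(index=10000, record_length=16384, alignment=0.5)
```

With the default of 1/4, a window of 16384 samples starts 4096 samples before the index; with 1/2 it starts 8192 samples before it.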

abstract get_voltage_trace(key: str, where: slice)[source]

Get the voltage trace for a given channel ‘key’ and slice ‘where’.

Returns

Voltage trace.

Return type

np.ndarray

abstract property keys

Available keys (channel names) in the stream.

Returns

List of keys.

Return type

list

abstract property start_us

The microsecond timestamp at which the stream starts.

Returns

Microsecond timestamp

Return type

int

property time

Instance of StreamTime, which can be sliced to convert stream indices into microsecond timestamps and implements utility functions for the conversion to datetime for example.

Returns

StreamTime instance

Return type

StreamTime
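The conversion between stream indices and timestamps can be sketched from start_us and dt_us. The two functions below are hypothetical helpers, not the StreamTime API; they assume a uniform sampling grid and that timestamps count microseconds since the Unix epoch.

```python
from datetime import datetime, timezone


def index_to_timestamp_us(index: int, start_us: int, dt_us: int) -> int:
    # Assumed linear relation between sample index and microsecond timestamp
    return start_us + index * dt_us


def timestamp_us_to_datetime(timestamp_us: int) -> datetime:
    # Interprets the timestamp as microseconds since the Unix epoch (assumption)
    return datetime.fromtimestamp(timestamp_us / 1e6, tz=timezone.utc)


start_us = 1_700_000_000_000_000  # made-up start of the stream
ts = index_to_timestamp_us(index=250, start_us=start_us, dt_us=10)
when = timestamp_us_to_datetime(ts)
```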

abstract property tp_timestamps

Dictionary of testpulse timestamps (microseconds) in the stream. For hardware ‘cresst’ this is read from a ‘.test_stamps’ file. For hardware ‘vdaq2’ this is obtained from triggering the DAC channels first.

Returns

Testpulse microsecond timestamps.

Return type

dict of np.ndarray

abstract property tpas

Dictionary of testpulse amplitudes in the stream. For hardware ‘cresst’ this is read from a ‘.test_stamps’ file. For hardware ‘vdaq2’ this is obtained from triggering the DAC channels first.

Returns

Testpulse amplitudes

Return type

dict of np.ndarray