LArPix+HDF5 Format¶
This module gives access to the LArPix+HDF5 file format.
File format description¶
All LArPix+HDF5 files use the HDF5 format so that they can be read and written using any language that has an HDF5 binding. The documentation for the Python h5py binding is at <http://docs.h5py.org>.
The to_file and from_file methods translate between a list of Packet-like objects and an HDF5 data file. from_file can load the full file all at once or just a subset of rows (for example, when the full file is too big to fit in memory). To access the data most efficiently, do not rely on from_file; instead, perform the analysis directly on the HDF5 data file.
File Header¶
The file header can be found in the /_header HDF5 group. At a minimum, the header will contain the following HDF5 attributes:
- version: a string containing the LArPix+HDF5 version
- created: a Unix timestamp of the file’s creation time
- modified: a Unix timestamp of the file’s last-modified time
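As a rough illustration of the header layout described above, the sketch below writes a tiny stand-in file with h5py and reads the three attributes back. The file name, the attribute values, and the helper name read_header are made up for this example, and storing the timestamps as floats is an assumption; real files are produced by to_file.

```python
import os
import tempfile
import time

import h5py

# Build a tiny stand-in file so the sketch is self-contained.
# The attribute values are invented; real files are written by to_file.
path = os.path.join(tempfile.mkdtemp(), 'example.h5')
now = time.time()
with h5py.File(path, 'w') as f:
    group = f.create_group('_header')
    group.attrs['version'] = '2.2'
    group.attrs['created'] = now   # assumption: timestamps stored as floats
    group.attrs['modified'] = now


def read_header(filename):
    """Return the attributes of the /_header group as a plain dict."""
    with h5py.File(filename, 'r') as f:
        return dict(f['_header'].attrs)


header = read_header(path)
print(header['version'])
```

Reading the header first is a cheap way to decide which field names to expect before touching the much larger datasets.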
Versions¶
The LArPix+HDF5 format is self-describing and versioned. This means as the format evolves, the files themselves will identify which version of the format should be used to interpret them. When writing a file with to_file, the format version can be specified, or by default, the latest version is used. When reading a file with from_file, by default, the format version of the actual file is used. If a specific format version is expected or required, that version can be specified, and a RuntimeError will be raised if a different format version is encountered.
The versions are always in the format major.minor and are stored as strings (e.g. '1.0', '1.5', '2.0').
The minor version number increases when a non-breaking change is made, so that a script compatible with a lower minor version will also work with files that have a higher minor version. E.g. a script designed to work with v1.0 will also work with v1.5. The reverse is not necessarily true: a script designed to work with v1.5 may not work with v1.0 files.
The major version number increases when a breaking change is made. This means that a script designed to work with v1.5 will likely not work with v2.0 files, and vice versa.
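The compatibility rule above can be sketched in a few lines. The function names parse_version and script_can_read are hypothetical and are not part of the larpix package; the sketch only encodes the major/minor rule as stated in the text.

```python
def parse_version(version):
    """Split a 'major.minor' string into a tuple of ints."""
    major, minor = version.split('.')
    return int(major), int(minor)


def script_can_read(script_version, file_version):
    """True if a script written for script_version should work on file_version.

    Same major version is required, and the file's minor version must be
    at least the minor version the script was written for.
    """
    script_major, script_minor = parse_version(script_version)
    file_major, file_minor = parse_version(file_version)
    return file_major == script_major and file_minor >= script_minor


print(script_can_read('1.0', '1.5'))  # True
print(script_can_read('1.5', '1.0'))  # False
print(script_can_read('1.5', '2.0'))  # False
```

Comparing the parsed integers rather than the raw strings avoids ordering mistakes such as treating '1.10' as older than '1.9'.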
File Data¶
The file data is saved in HDF5 datasets, and the specific data format depends on the LArPix+HDF5 version.
Version 2.2 description¶
For version 2.2, there are two datasets: packets and messages.
The packets dataset contains a list of all of the packets sent and received during a particular time interval.
Shape: (N,), N >= 0
Datatype: a compound datatype (called “structured type” in h5py/numpy). Not all fields are relevant for each packet. Unused fields are set to a default value of 0 or the empty string. Keys/fields:
- io_group (u1/unsigned byte): the id of the high-level io group associated with this packet
- io_channel (u1/unsigned byte): the id of the mid-level io channel associated with this packet
- packet_type (u1/unsigned byte): the packet type code, which can be interpreted according to the map stored in the ‘packets’ attribute ‘packet_types’
- chip_id (u1/unsigned byte): the LArPix chip id
- parity (u1/unsigned byte): the packet parity bit (0 or 1)
- valid_parity (u1/unsigned byte): 1 if the packet parity is valid (odd), 0 if it is invalid
- downstream_marker (u1/unsigned byte): a marker to indicate the hydra io network direction for this packet
- channel_id (u1/unsigned byte): the ASIC channel
- timestamp (u8/unsigned 8-byte long int): the timestamp associated with the packet. Caution: this field serves several purposes: it is the ASIC timestamp in data packets (type == 0), the global timestamp in timestamp packets (type == 4), the message timestamp in message packets (type == 5), the timestamp in sync packets (type == 6), and the timestamp in trigger packets (type == 7).
- first_packet (u1/unsigned byte): indicates if this is the first packet received in a trigger burst (v2.1 or newer only)
- dataword (u1/unsigned byte): the ADC data word. Caution: as of v2.2, this field does double duty as both the LArPix ADC value in data packets (type == 0) and the clk source value in sync packets (type == 6).
- trigger_type (u1/unsigned byte): the trigger type associated with this packet. Caution: as of v2.2, this field does triple duty as the LArPix packet trigger type in data packets (type == 0), the sync type in sync packets (type == 6), and the trigger type in trigger packets (type == 7).
- local_fifo (u1/unsigned byte): 1 if the channel FIFO is >50% full, 3 if the channel FIFO is 100% full
- shared_fifo (u1/unsigned byte): 1 if the chip FIFO is >50% full, 3 if the chip FIFO is 100% full
- register_address (u1/unsigned byte): the configuration register index
- register_data (u1/unsigned byte): the configuration register value
- direction (u1/unsigned byte): 0 if packet was sent to ASICs, 1 if packet was received from ASICs.
- local_fifo_events (u1/unsigned byte): number of packets in the channel FIFO (only valid if FIFO diagnostics are enabled)
- shared_fifo_events (u2/unsigned 2-byte int): number of packets in the chip FIFO (only valid if FIFO diagnostics are enabled)
- counter (u4/unsigned 4-byte int): the message index (only valid for message type packets)
- fifo_diagnostics_enabled (u1/unsigned byte): flag for when fifo diagnostics are enabled (1 if enabled, 0 if not)
Packet types lookup: the packets dataset has an attribute 'packet_types' which contains the following lookup table for packets: 0: 'data', 1: 'test', 2: 'config write', 3: 'config read', 4: 'timestamp', 5: 'message', 6: 'sync', 7: 'trigger'
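Because several fields change meaning with the packet type, it helps to select rows by packet_type before reading them. The sketch below uses a NumPy structured array with only a subset of the v2.2 fields and invented row values; it is an illustration of the idea, not the full dtype used in real files.

```python
import numpy as np

# Subset of the v2.2 packets fields, for illustration only.
packet_dtype = np.dtype([
    ('packet_type', 'u1'),
    ('chip_id', 'u1'),
    ('channel_id', 'u1'),
    ('timestamp', 'u8'),
    ('dataword', 'u1'),
])

# Copy of the 'packet_types' lookup table from the text.
packet_types = {0: 'data', 1: 'test', 2: 'config write', 3: 'config read',
                4: 'timestamp', 5: 'message', 6: 'sync', 7: 'trigger'}

# Invented rows: two data packets, one timestamp packet, one sync packet.
packets = np.array([
    (0, 12, 3, 1000, 57),
    (4, 0, 0, 1600000000, 0),
    (0, 12, 7, 1010, 61),
    (6, 0, 0, 1020, 1),
], dtype=packet_dtype)

# Select data packets first, then interpret the multi-purpose fields.
is_data = packets['packet_type'] == 0
adc_values = packets['dataword'][is_data]        # ADC values only
asic_timestamps = packets['timestamp'][is_data]  # ASIC timestamps only

# Translate the numeric type codes into readable names.
type_names = [packet_types[int(code)] for code in packets['packet_type']]
print(type_names)
```

Without the mask, the dataword column would mix ADC values with the clk source value of sync packets.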
The messages dataset has the full messages referred to by message packets in the packets dataset.
Shape: (N,), N >= 0
Datatype: a compound datatype with fields:
- message (S64/64-character string): the message
- timestamp (u8/unsigned 8-byte long int): the timestamp associated with the message
- index (u4/unsigned 4-byte int): the message index, which should be equal to the row index in the messages dataset
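The link between message packets and the messages dataset can be sketched as follows: the counter field of a message packet (type 5) holds the message index, which should equal the row index in messages. The arrays use a reduced set of fields and invented contents.

```python
import numpy as np

# Reduced dtypes: only the fields needed to follow the link.
packet_dtype = np.dtype([('packet_type', 'u1'), ('timestamp', 'u8'),
                         ('counter', 'u4')])
message_dtype = np.dtype([('message', 'S64'), ('timestamp', 'u8'),
                          ('index', 'u4')])

# Invented rows: data packets interleaved with two message packets.
packets = np.array([
    (0, 1000, 0),
    (5, 1600000000, 0),
    (0, 1010, 0),
    (5, 1600000005, 1),
], dtype=packet_dtype)

messages = np.array([
    (b'run start', 1600000000, 0),
    (b'threshold changed', 1600000005, 1),
], dtype=message_dtype)

# Message packets carry the row index of their message in 'counter'.
is_message = packets['packet_type'] == 5
message_rows = messages[packets['counter'][is_message]]

# S64 fields come back as bytes; decode them for display.
texts = [m.decode('utf-8') for m in message_rows['message']]
print(texts)
```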
Version 2.1 description¶
For version 2.1, there are two datasets: packets and messages.
The packets dataset contains a list of all of the packets sent and received during a particular time interval.
Shape: (N,), N >= 0
Datatype: a compound datatype (called “structured type” in h5py/numpy). Not all fields are relevant for each packet. Unused fields are set to a default value of 0 or the empty string. Keys/fields:
- io_group (u1/unsigned byte): the id of the high-level io group associated with this packet
- io_channel (u1/unsigned byte): the id of the mid-level io channel associated with this packet
- packet_type (u1/unsigned byte): the packet type code, which can be interpreted according to the map stored in the ‘packets’ attribute ‘packet_types’
- chip_id (u1/unsigned byte): the LArPix chip id
- parity (u1/unsigned byte): the packet parity bit (0 or 1)
- valid_parity (u1/unsigned byte): 1 if the packet parity is valid (odd), 0 if it is invalid
- downstream_marker (u1/unsigned byte): a marker to indicate the hydra io network direction for this packet
- channel_id (u1/unsigned byte): the ASIC channel
- timestamp (u8/unsigned 8-byte long int): the timestamp associated with the packet. Caution: this field does “triple duty” as the ASIC timestamp in data packets (type == 0), as the global timestamp in timestamp packets (type == 4), and as the message timestamp in message packets (type == 5).
- first_packet (u1/unsigned byte): indicates if this is the first packet received in a trigger burst (v2.1 or newer only)
- dataword (u1/unsigned byte): the ADC data word
- trigger_type (u1/unsigned byte): the trigger type associated with this packet
- local_fifo (u1/unsigned byte): 1 if the channel FIFO is >50% full, 3 if the channel FIFO is 100% full
- shared_fifo (u1/unsigned byte): 1 if the chip FIFO is >50% full, 3 if the chip FIFO is 100% full
- register_address (u1/unsigned byte): the configuration register index
- register_data (u1/unsigned byte): the configuration register value
- direction (u1/unsigned byte): 0 if packet was sent to ASICs, 1 if packet was received from ASICs.
- local_fifo_events (u1/unsigned byte): number of packets in the channel FIFO (only valid if FIFO diagnostics are enabled)
- shared_fifo_events (u2/unsigned 2-byte int): number of packets in the chip FIFO (only valid if FIFO diagnostics are enabled)
- counter (u4/unsigned 4-byte int): the message index (only valid for message type packets)
- fifo_diagnostics_enabled (u1/unsigned byte): flag for when fifo diagnostics are enabled (1 if enabled, 0 if not)
Packet types lookup: the packets dataset has an attribute 'packet_types' which contains the following lookup table for packets: 0: 'data', 1: 'test', 2: 'config write', 3: 'config read', 4: 'timestamp', 5: 'message'
The messages dataset has the full messages referred to by message packets in the packets dataset.
Shape: (N,), N >= 0
Datatype: a compound datatype with fields:
- message (S64/64-character string): the message
- timestamp (u8/unsigned 8-byte long int): the timestamp associated with the message
- index (u4/unsigned 4-byte int): the message index, which should be equal to the row index in the messages dataset
Version 1.0 description¶
For version 1.0, there are two datasets: packets and messages.
The packets dataset contains a list of all of the packets sent and received during a particular time interval.
Shape: (N,), N >= 0
Datatype: a compound datatype (called “structured type” in h5py/numpy). Not all fields are relevant for each packet. Unused fields are set to a default value of 0 or the empty string. Keys/fields:
- chip_key (S32/32-character string): the chip key identifying the ASIC associated with this packet
- type (u1/unsigned byte): the packet type code, which can be interpreted according to the map stored in the raw_packet attribute ‘packet_types’
- chipid (u1/unsigned byte): the LArPix chip id
- parity (u1/unsigned byte): the packet parity bit (0 or 1)
- valid_parity (u1/unsigned byte): 1 if the packet parity is valid (odd), 0 if it is invalid
- channel (u1/unsigned byte): the ASIC channel
- timestamp (u8/unsigned 8-byte long int): the timestamp associated with the packet. Caution: this field does “triple duty” as the ASIC timestamp in data packets (type == 0), as the global timestamp in timestamp packets (type == 4), and as the message timestamp in message packets (type == 5).
- adc_counts (u1/unsigned byte): the ADC data word
- fifo_half (u1/unsigned byte): 1 if the FIFO half full flag is present, 0 otherwise.
- fifo_full (u1/unsigned byte): 1 if the FIFO full flag is present, 0 otherwise.
- register (u1/unsigned byte): the configuration register index
- value (u1/unsigned byte): the configuration register value
- counter (u4/unsigned 4-byte int): the test counter value, or the message index. Caution: this field does “double duty” as the counter for test packets (type == 1) and as the message index for message packets (type == 5).
- direction (u1/unsigned byte): 0 if packet was sent to ASICs, 1 if packet was received from ASICs.
Packet types lookup: the packets dataset has an attribute 'packet_types' which contains the following lookup table for packets: 0: 'data', 1: 'test', 2: 'config write', 3: 'config read', 4: 'timestamp', 5: 'message'
The messages dataset has the full messages referred to by message packets in the packets dataset.
Shape: (N,), N >= 0
Datatype: a compound datatype with fields:
- message (S64/64-character string): the message
- timestamp (u8/unsigned 8-byte long int): the timestamp associated with the message
- index (u4/unsigned 4-byte int): the message index, which should be equal to the row index in the messages dataset
Examples¶
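Plot a histogram of ADC counts, selecting data packets only. The field names adc_counts and type used here are those of the version 1.0 format; in versions 2.1 and 2.2 the corresponding fields are dataword and packet_type.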
>>> import matplotlib.pyplot as plt
>>> import h5py
>>> f = h5py.File('output.h5', 'r')
>>> packets = f['packets']
>>> plt.hist(packets['adc_counts'][packets['type'] == 0])
>>> plt.show()
Load the first 10 packets in a file into Packet objects and print any MessagePacket packets to the console
>>> from larpix.format.hdf5format import from_file
>>> from larpix.larpix import MessagePacket
>>> result = from_file('output.h5', end=10)
>>> for packet in result['packets']:
...     if isinstance(packet, MessagePacket):
...         print(packet)
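The histogram example above uses the version 1.0 field names. A version-aware variant might pick the field names from the format version first, as in the sketch below. The helper names adc_field_names and data_adc_values are hypothetical, and pairing adc_counts with dataword is an assumption based on both being described as the ADC data word. Small NumPy arrays stand in for the packets dataset.

```python
import numpy as np


def adc_field_names(version):
    """Return (packet type field, ADC field) for a 'major.minor' version."""
    major = int(version.split('.')[0])
    if major == 1:
        return 'type', 'adc_counts'
    # Assumption: all 2.x versions use the names documented for 2.1 and 2.2.
    return 'packet_type', 'dataword'


def data_adc_values(packets, version):
    """Return the ADC values of data packets (type code 0) only."""
    type_field, adc_field = adc_field_names(version)
    return packets[adc_field][packets[type_field] == 0]


# Invented stand-ins for a v1.0 and a v2.2 packets dataset.
v1 = np.array([(0, 40), (4, 0), (0, 42)],
              dtype=[('type', 'u1'), ('adc_counts', 'u1')])
v2 = np.array([(0, 50), (6, 1), (0, 52)],
              dtype=[('packet_type', 'u1'), ('dataword', 'u1')])

v1_adc = data_adc_values(v1, '1.0')
v2_adc = data_adc_values(v2, '2.2')
print(v1_adc, v2_adc)
```

The returned array can be passed to plt.hist in the same way as in the example above.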
larpix.format.hdf5format.latest_version = '2.3'¶
The most recent / up-to-date LArPix+HDF5 format version
larpix.format.hdf5format.dtypes¶
The dtype specification used in the HDF5 files.
Structure: {version: {dset_name: [structured dtype fields]}}
larpix.format.hdf5format.dtype_property_index_lookup¶
A map between attribute name and “column index” in the structured dtypes.
Structure: {version: {dset_name: {field_name: index}}}
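To make the relationship between the two structures concrete, the sketch below derives an index lookup from a dtypes-like mapping. The excerpt contains only the v2.2 messages fields, and build_index_lookup is a hypothetical helper; how the package actually builds its lookup is not stated here.

```python
# Hypothetical excerpt of a dtypes-like mapping; the real one holds
# every version, dataset, and field.
dtypes_excerpt = {
    '2.2': {
        'messages': [('message', 'S64'), ('timestamp', 'u8'), ('index', 'u4')],
    },
}


def build_index_lookup(dtypes):
    """Map each field name to its position within its structured dtype."""
    return {
        version: {
            dset_name: {field[0]: i for i, field in enumerate(fields)}
            for dset_name, fields in dsets.items()
        }
        for version, dsets in dtypes.items()
    }


lookup = build_index_lookup(dtypes_excerpt)
print(lookup['2.2']['messages']['timestamp'])  # 1
```

A positional index of this kind is useful when a row is handled as a plain tuple rather than accessed by field name.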
larpix.format.hdf5format.to_file(filename, packet_list, mode='a', version=None)¶
Save the given packets to the given file.
This method can be used to update an existing file.
Parameters:
- filename – the name of the file to save to
- packet_list – any iterable of objects of type Packet or TimestampPacket.
- mode – optional, the “file mode” to open the data file (default: 'a')
- version – optional, the LArPix+HDF5 format version to use. If writing a new file and version is unspecified or None, the latest version will be used. If writing to an existing file and version is unspecified or None, the existing file’s version will be used. If writing to an existing file and version is specified and does not exactly match the existing file’s version, a RuntimeError will be raised. (default: None)
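The version rules for writing can be summarized in a small decision function. This is only a sketch of the behavior described above, not the package's implementation; resolve_write_version and its arguments are hypothetical names, and the latest version is taken from the latest_version value documented earlier.

```python
LATEST_VERSION = '2.3'  # value of latest_version documented above


def resolve_write_version(requested, existing=None, latest=LATEST_VERSION):
    """Pick the format version for a write, following the rules in the text.

    requested: the version argument passed by the caller (or None)
    existing:  the version stored in the file, or None for a new file
    """
    if existing is None:
        # New file: use the requested version, else the latest one.
        return latest if requested is None else requested
    if requested is None or requested == existing:
        # Existing file: keep its version.
        return existing
    raise RuntimeError(
        'requested version %s does not match existing file version %s'
        % (requested, existing))


print(resolve_write_version(None))                  # 2.3
print(resolve_write_version(None, existing='1.0'))  # 1.0
```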
larpix.format.hdf5format.from_file(filename, version=None, start=None, end=None)¶
Read the data from the given file into LArPix Packet objects.
Parameters:
- filename – the name of the file to read
- version – the format version. Specify this parameter to enforce a version check. When a specific version such as '1.5' is specified, a RuntimeError will be raised if the stored format version number is not an exact match. If a version is prefixed with '~' such as '~1.5', a RuntimeError will be raised if the stored format version is incompatible with the specified version. Compatible versions are those with the same major version and at least the same minor version. E.g. for '~1.5', versions from v1.5 up to, but not including, v2.0 are compatible. If unspecified or None, will use the stored format version.
- start – the index of the first row to read
- end – the index after the last row to read (same semantics as Python range)
Returns: packet_dict, a dict with keys 'packets', containing a list of packet objects, and 'created', 'modified', and 'version', containing the file metadata.
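Since start and end follow Python range semantics, a large file can be read in consecutive chunks. The sketch below only computes the (start, end) pairs; chunk_bounds is a hypothetical helper, and the commented-out loop shows how the pairs might be passed to from_file, assuming the total row count is known (for instance from the length of the packets dataset).

```python
def chunk_bounds(n_rows, chunk_size):
    """Yield (start, end) pairs covering n_rows, with end exclusive."""
    for start in range(0, n_rows, chunk_size):
        yield start, min(start + chunk_size, n_rows)


bounds = list(chunk_bounds(25, 10))
print(bounds)  # [(0, 10), (10, 20), (20, 25)]

# Hypothetical use with from_file (needs the larpix package and a real file):
# for start, end in chunk_bounds(n_rows, 10000):
#     result = from_file('output.h5', start=start, end=end)
#     for packet in result['packets']:
#         ...
```

Because end is exclusive, consecutive chunks neither overlap nor skip rows.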