Data
pamiq_core.data.DataBuffer
Bases: ABC, PersistentStateMixin
Interface for managing experience data collected during system execution.
DataBuffer provides an interface for collecting and managing experience data generated during system execution. It maintains a buffer of fixed maximum size that stores data for specified data names.
Initializes the DataBuffer.
PARAMETER | DESCRIPTION |
---|---|
collecting_data_names | Names of data fields to collect and store. |
max_size | Maximum number of samples to store in the buffer. |
RAISES | DESCRIPTION |
---|---|
ValueError | If max_size is negative. |
Source code in src/pamiq_core/data/buffer.py
collecting_data_names property
Returns the set of data field names being collected.
add abstractmethod
Adds a new data sample to the buffer.
PARAMETER | DESCRIPTION |
---|---|
step_data | Dictionary containing data for one step. Must contain all fields specified in collecting_data_names. |
Source code in src/pamiq_core/data/buffer.py
get_data abstractmethod
Retrieves all stored data from the buffer.
RETURNS | DESCRIPTION |
---|---|
BufferData[T] | Dictionary mapping data field names to sequences of their values. Each sequence has the same length. |
__len__ abstractmethod
Returns the current number of samples in the buffer.
RETURNS | DESCRIPTION |
---|---|
int | The number of samples currently stored in the buffer. |
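The interface above can be illustrated with a self-contained stand-in. This sketch does not import pamiq_core: `BufferSketch` and `DropWhenFullBuffer` are hypothetical names, the persistence mixin is omitted, and the "ignore new samples when full" policy is an assumption made only to keep the example short.

```python
from abc import ABC, abstractmethod
from collections.abc import Iterable, Mapping
from typing import Any


class BufferSketch(ABC):
    """Stand-in mirroring the documented interface; not pamiq_core's class."""

    def __init__(self, collecting_data_names: Iterable[str], max_size: int) -> None:
        if max_size < 0:
            raise ValueError("max_size must be non-negative")
        self._names = set(collecting_data_names)
        self.max_size = max_size

    @property
    def collecting_data_names(self) -> set[str]:
        # Return a copy so callers cannot mutate the internal set.
        return set(self._names)

    @abstractmethod
    def add(self, step_data: Mapping[str, Any]) -> None: ...

    @abstractmethod
    def get_data(self) -> Mapping[str, list[Any]]: ...

    @abstractmethod
    def __len__(self) -> int: ...


class DropWhenFullBuffer(BufferSketch):
    """Hypothetical subclass: ignores new samples once max_size is reached."""

    def __init__(self, collecting_data_names: Iterable[str], max_size: int) -> None:
        super().__init__(collecting_data_names, max_size)
        self._data: dict[str, list[Any]] = {name: [] for name in self._names}
        self._count = 0

    def add(self, step_data: Mapping[str, Any]) -> None:
        missing = self._names - set(step_data)
        if missing:
            raise KeyError(f"missing fields: {sorted(missing)}")
        if self._count >= self.max_size:
            return  # drop policy chosen only for this sketch
        for name in self._names:
            self._data[name].append(step_data[name])
        self._count += 1

    def get_data(self) -> Mapping[str, list[Any]]:
        # Copy each list so the caller cannot modify internal storage.
        return {name: list(values) for name, values in self._data.items()}

    def __len__(self) -> int:
        return self._count


buffer = DropWhenFullBuffer(["obs", "action"], max_size=2)
for step in range(3):
    buffer.add({"obs": step, "action": -step})
```

A real subclass would choose its own policy for a full buffer; the two implementations documented below each take a different approach.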
pamiq_core.data.DataCollector
A thread-safe collector for buffered data.
This class provides concurrent data collection capabilities with thread safety, working in conjunction with DataUser to manage data collection and transfer.
Initialize DataCollector with a specified DataUser.
PARAMETER | DESCRIPTION |
---|---|
user | DataUser instance this collector is associated with. |
Source code in src/pamiq_core/data/interface.py
collect
Collect step data in a thread-safe manner.
PARAMETER | DESCRIPTION |
---|---|
step_data | Data to be collected. |
pamiq_core.data.DataUser
Bases: PersistentStateMixin
A class that manages data buffering and timestamps for collected data.
This class acts as a user of data buffers, handling the collection, storage, and retrieval of data along with their timestamps. It works in conjunction with a DataCollector to manage concurrent data collection.
Initialize DataUser with a specified buffer.
PARAMETER | DESCRIPTION |
---|---|
buffer | Data buffer instance to store collected data. |
Source code in src/pamiq_core/data/interface.py
create_empty_queues
Create empty timestamping queues for data collection.
RETURNS | DESCRIPTION |
---|---|
TimestampingQueuesDict[T] | New instance of TimestampingQueuesDict with appropriate configuration. |
Source code in src/pamiq_core/data/interface.py
update
Update buffer with collected data from the collector.
Moves all collected data from the collector to the buffer and records their timestamps.
Source code in src/pamiq_core/data/interface.py
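The collect-then-update handoff can be sketched with a lock-protected pending queue. This is a simplified stand-in, not the library's implementation: `UserSketch`, `CollectorSketch`, and `_push` are hypothetical names, and `time.time()` is used for timestamps only as an assumption.

```python
import threading
import time
from typing import Any


class UserSketch:
    """Hypothetical stand-in: owns the pending queue and the stored data."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: list[tuple[float, dict[str, Any]]] = []
        self.stored: list[dict[str, Any]] = []
        self.timestamps: list[float] = []

    def _push(self, step_data: dict[str, Any]) -> None:
        # Called from collector threads; the lock serializes appends.
        with self._lock:
            self._pending.append((time.time(), step_data))

    def update(self) -> None:
        # Swap the pending list out under the lock, then process it outside.
        with self._lock:
            pending, self._pending = self._pending, []
        for stamp, step_data in pending:
            self.stored.append(step_data)
            self.timestamps.append(stamp)


class CollectorSketch:
    """Hypothetical stand-in: forwards step data to its associated user."""

    def __init__(self, user: UserSketch) -> None:
        self._user = user

    def collect(self, step_data: dict[str, Any]) -> None:
        self._user._push(step_data)


user = UserSketch()
collector = CollectorSketch(user)


def worker(worker_id: int) -> None:
    for step in range(50):
        collector.collect({"worker": worker_id, "step": step})


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

user.update()  # moves everything collected so far into storage
```

Swapping the pending list under the lock keeps the critical section short, so collecting threads are blocked only briefly while an update runs.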
get_data
Retrieve data from the buffer.
RETURNS | DESCRIPTION |
---|---|
BufferData[T] | Current data stored in the buffer. |
count_data_added_since
Count the number of data points added after the specified timestamp.
NOTE: Use pamiq_core.time to retrieve the timestamp.
PARAMETER | DESCRIPTION |
---|---|
timestamp | Reference timestamp to count from. |
RETURNS | DESCRIPTION |
---|---|
int | Number of data points added after the specified timestamp. |
Source code in src/pamiq_core/data/interface.py
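Counting entries newer than a reference timestamp can be sketched as follows. The helper `count_added_since` is hypothetical, and it assumes that timestamps are kept in ascending order and that "after" means strictly greater than the reference; the library may store or compare timestamps differently.

```python
from bisect import bisect_right


def count_added_since(timestamps: list[float], timestamp: float) -> int:
    """Count entries strictly newer than `timestamp` in an ascending list."""
    # bisect_right returns the index just past the last entry <= timestamp.
    return len(timestamps) - bisect_right(timestamps, timestamp)


recorded = [1.0, 2.0, 2.0, 3.5]
newer_than_two = count_added_since(recorded, 2.0)
```

A count like this is typically useful for deciding whether enough new data has arrived since the last time it was consumed.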
save_state
Save the state of this DataUser to the specified path.
This method first updates the buffer with any pending collected data, then delegates the state saving to the underlying buffer.
PARAMETER | DESCRIPTION |
---|---|
path | Directory path where the state should be saved. |
Source code in src/pamiq_core/data/interface.py
load_state
Load the state of this DataUser from the specified path.
This method delegates the state loading to the underlying buffer.
PARAMETER | DESCRIPTION |
---|---|
path | Directory path from which the state should be loaded. |
Source code in src/pamiq_core/data/interface.py
pamiq_core.data.impls.SequentialBuffer
Bases: DataBuffer[T]
Implementation of DataBuffer that maintains data in sequential order.
This buffer stores collected data points in ordered queues, preserving the insertion order. Each data field is stored in a separate queue with a maximum size limit.
Initialize a new SequentialBuffer.
PARAMETER | DESCRIPTION |
---|---|
collecting_data_names | Names of data fields to collect. |
max_size | Maximum number of data points to store. |
Source code in src/pamiq_core/data/impls/sequential_buffer.py
add
Add a new data sample to the buffer.
PARAMETER | DESCRIPTION |
---|---|
step_data | Dictionary containing data for one step. Must contain all fields specified in collecting_data_names. |
RAISES | DESCRIPTION |
---|---|
KeyError | If a required data field is missing from step_data. |
Source code in src/pamiq_core/data/impls/sequential_buffer.py
get_data
Retrieve all stored data from the buffer.
RETURNS | DESCRIPTION |
---|---|
dict[str, list[T]] | Dictionary mapping data field names to lists of their values. Each list preserves the original insertion order. |
Source code in src/pamiq_core/data/impls/sequential_buffer.py
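One way to picture per-field ordered queues with a size limit is a `collections.deque` per field. This is a sketch under assumptions, not the library's code: the helper names are hypothetical, and discarding the oldest entry when a queue is full is the behavior of `deque(maxlen=...)`, which the documentation above does not explicitly confirm for SequentialBuffer.

```python
from collections import deque
from typing import Any

FIELD_NAMES = ["obs", "reward"]  # hypothetical field names
MAX_SIZE = 3

# One bounded queue per data field.
queues: dict[str, deque] = {name: deque(maxlen=MAX_SIZE) for name in FIELD_NAMES}


def add(step_data: dict[str, Any]) -> None:
    # Validate first so a missing field never leaves queues out of step.
    for name in FIELD_NAMES:
        if name not in step_data:
            raise KeyError(name)
    for name in FIELD_NAMES:
        queues[name].append(step_data[name])  # oldest entry drops when full


def get_data() -> dict[str, list[Any]]:
    # Convert to lists; insertion order is preserved.
    return {name: list(queue) for name, queue in queues.items()}


for step in range(1, 5):
    add({"obs": step, "reward": step * 0.5})

snapshot = get_data()
```

Because every field is appended in the same call, all queues stay the same length and index `i` refers to the same step in each list.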
__len__
Returns the current number of samples in the buffer.
RETURNS | DESCRIPTION |
---|---|
int | The number of samples currently stored in the buffer. |
save_state
Save the buffer state to the specified path.
Creates a directory at the given path and saves each data queue as a separate pickle file.
PARAMETER | DESCRIPTION |
---|---|
path | Directory path where the buffer state is saved. |
Source code in src/pamiq_core/data/impls/sequential_buffer.py
load_state
Load the buffer state from the specified path.
Loads data queues from pickle files in the given directory.
PARAMETER | DESCRIPTION |
---|---|
path | Directory path from which the buffer state is loaded. |
Source code in src/pamiq_core/data/impls/sequential_buffer.py
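The "one pickle file per data field" layout can be sketched with the standard library. The function names and the `<field>.pkl` file naming are assumptions made for illustration; the actual file names used by the library are not stated here.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any


def save_queues(path: Path, data: dict[str, list[Any]]) -> None:
    """Create `path` and write each field to its own pickle file."""
    path.mkdir(parents=True, exist_ok=True)
    for name, values in data.items():
        with open(path / f"{name}.pkl", "wb") as f:  # naming is an assumption
            pickle.dump(list(values), f)


def load_queues(path: Path, names: list[str]) -> dict[str, list[Any]]:
    """Read back one pickle file per requested field."""
    loaded: dict[str, list[Any]] = {}
    for name in names:
        with open(path / f"{name}.pkl", "rb") as f:
            loaded[name] = pickle.load(f)
    return loaded


with tempfile.TemporaryDirectory() as tmp:
    state_dir = Path(tmp) / "buffer"
    save_queues(state_dir, {"obs": [1, 2], "reward": [0.5, 1.0]})
    saved_files = sorted(p.name for p in state_dir.iterdir())
    restored = load_queues(state_dir, ["obs", "reward"])
```

As with any pickle-based format, files should only be loaded from trusted locations, since unpickling can execute arbitrary code.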
pamiq_core.data.impls.RandomReplacementBuffer
RandomReplacementBuffer(
collecting_data_names: Iterable[str],
max_size: int,
replace_probability: float | None = None,
expected_survival_length: int | None = None,
)
Bases: DataBuffer[T]
Buffer implementation that randomly replaces elements when full.
This buffer keeps track of collected data and, when full, randomly replaces existing elements based on a configurable probability.
Initialize a RandomReplacementBuffer.
PARAMETER | DESCRIPTION |
---|---|
collecting_data_names | Names of data fields to collect. |
max_size | Maximum number of data points to store. |
replace_probability | Probability of replacing an existing element when the buffer is full. Must be between 0.0 and 1.0 inclusive. If None and expected_survival_length is provided, it is computed automatically. Defaults to 1.0 if both are None. |
expected_survival_length | Expected number of steps that data should survive in the buffer. Used to compute replace_probability automatically when replace_probability is None. Cannot be specified together with replace_probability. |
RAISES | DESCRIPTION |
---|---|
ValueError | If replace_probability is not between 0.0 and 1.0 inclusive, or if both replace_probability and expected_survival_length are specified. |
Source code in src/pamiq_core/data/impls/random_replacement_buffer.py
is_full property
Check if the buffer has reached its maximum capacity.
RETURNS | DESCRIPTION |
---|---|
bool | True if the buffer is full, False otherwise. |
compute_replace_probability_from_expected_survival_length staticmethod
compute_replace_probability_from_expected_survival_length(
max_size: int, survival_length: int
) -> float
Compute the replace probability from expected survival length.
This method calculates the replacement probability needed to achieve a desired expected survival length for data in the buffer.
The computation is based on the mathematical analysis described at https://zenn.dev/gesonanko/scraps/b581e75bfd9f3e
PARAMETER | DESCRIPTION |
---|---|
max_size | Maximum size of the buffer. |
survival_length | Expected number of steps that data should survive. |
RETURNS | DESCRIPTION |
---|---|
float | The computed replacement probability between 0.0 and 1.0. |
Source code in src/pamiq_core/data/impls/random_replacement_buffer.py
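To build intuition for how a survival length maps to a probability, here is a deliberately simplified model. It is not necessarily the formula the library uses, since the linked analysis defines the exact computation. The assumption is that, once the buffer is full, each incoming sample overwrites one uniformly chosen slot with probability p. A given entry is then overwritten with probability p / max_size per step, so its expected lifetime is max_size / p steps, which gives p = max_size / survival_length, clipped to 1.0. The function name is hypothetical.

```python
def replace_probability_sketch(max_size: int, survival_length: int) -> float:
    """Simplified per-entry model; the library's formula may differ."""
    if max_size <= 0 or survival_length <= 0:
        raise ValueError("max_size and survival_length must be positive")
    # Expected lifetime of one entry is max_size / p, so solve for p
    # and clip to the valid probability range.
    return min(1.0, max_size / survival_length)


slow_turnover = replace_probability_sketch(max_size=100, survival_length=400)
fast_turnover = replace_probability_sketch(max_size=100, survival_length=50)
```

Under this model, asking for a survival length shorter than the buffer size cannot be satisfied, which is why the result saturates at 1.0.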
add
Add a new data sample to the buffer.
If the buffer is full, the new data may replace an existing entry based on the configured replacement probability.
PARAMETER | DESCRIPTION |
---|---|
step_data | Dictionary containing data for one step. Must contain all fields specified in collecting_data_names. |
RAISES | DESCRIPTION |
---|---|
KeyError | If a required data field is missing from step_data. |
Source code in src/pamiq_core/data/impls/random_replacement_buffer.py
get_data
Retrieve all stored data from the buffer.
RETURNS | DESCRIPTION |
---|---|
Mapping[str, list[T]] | Dictionary mapping data field names to lists of their values. Returns a copy of the internal data to prevent modification. |
Source code in src/pamiq_core/data/impls/random_replacement_buffer.py
__len__
Returns the current number of samples in the buffer.
RETURNS | DESCRIPTION |
---|---|
int | The number of samples currently stored in the buffer. |
save_state
Save the buffer state to the specified path.
Creates a directory at the given path and saves each data list as a separate pickle file.
PARAMETER | DESCRIPTION |
---|---|
path | Directory path where the buffer state is saved. |
Source code in src/pamiq_core/data/impls/random_replacement_buffer.py
load_state
Load the buffer state from the specified path.
Loads data lists from pickle files in the given directory.
PARAMETER | DESCRIPTION |
---|---|
path | Directory path from which the buffer state is loaded. |
RAISES | DESCRIPTION |
---|---|
ValueError | If loaded data lists have inconsistent lengths. |
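The consistency check mentioned for loading can be sketched as a small helper. `validate_lengths` is a hypothetical name, and the error message format is an assumption; the sketch only shows the idea that every field must hold the same number of samples before the loaded state is accepted.

```python
from typing import Any


def validate_lengths(loaded: dict[str, list[Any]]) -> int:
    """Return the common length of all lists, or raise ValueError."""
    lengths = {name: len(values) for name, values in loaded.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"inconsistent lengths: {lengths}")
    # An empty mapping is treated as zero samples.
    return next(iter(lengths.values()), 0)


sample_count = validate_lengths({"obs": [1, 2], "reward": [0.5, 1.0]})
```

Rejecting mismatched lists early prevents a partially written or corrupted save from producing samples whose fields belong to different steps.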