Utils¶
This page documents the utility components of ICOS-FL: data fetching and processing, bridge configuration, and terminal output helpers (colors and banner).
Fetcher Module¶
- class icos_fl.utils.fetcher.TimeSeriesData(max_rows=300)¶
Class for managing time series data with a sliding window approach.
- Parameters:
max_rows (int, optional) – Maximum number of rows to keep in the DataFrame. Default is 300; at a 3-second collection interval this covers 15 minutes of data, which provides enough history for LSTM training and 5-minute prediction.
- add_dataframe(df)¶
Add new data to the unified dataframe, maintaining the sliding window.
When new data is added, the oldest data points are removed if the total size exceeds max_rows.
- Parameters:
df (pd.DataFrame) – New DataFrame to append
- get_dataframe()¶
Get the current unified DataFrame.
- Returns:
The current DataFrame, or None if no data has been added yet.
- Return type:
Optional[pd.DataFrame]
- wait_for_dataframe()¶
Wait for new data to be added to the DataFrame.
This method blocks until new data is added through add_dataframe().
- Returns:
The updated DataFrame after new data has been added.
- Return type:
pd.DataFrame
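The sliding window and blocking wait described above can be sketched with the standard library alone. This is a minimal illustration, not the ICOS-FL implementation: `SlidingWindowBuffer`, `add_rows`, `get_rows`, and `wait_for_rows` are hypothetical names, plain lists stand in for DataFrames, and the `timeout` argument is an addition that the documented `wait_for_dataframe()` does not have.

```python
import threading
from collections import deque
from typing import Optional


class SlidingWindowBuffer:
    """Hypothetical stand-in for TimeSeriesData that stores rows in a deque."""

    def __init__(self, max_rows: int = 300) -> None:
        # deque(maxlen=...) discards the oldest rows once the buffer is full.
        self._rows = deque(maxlen=max_rows)
        # Incremented on every add so that waiters can detect new data.
        self._version = 0
        self._cond = threading.Condition()

    def add_rows(self, rows: list) -> None:
        with self._cond:
            self._rows.extend(rows)
            self._version += 1
            self._cond.notify_all()  # wake every thread blocked in wait_for_rows()

    def get_rows(self) -> Optional[list]:
        with self._cond:
            # None until the first add, mirroring get_dataframe().
            return list(self._rows) if self._version else None

    def wait_for_rows(self, timeout: Optional[float] = None) -> list:
        with self._cond:
            seen = self._version
            # Block until another thread adds rows, or until the timeout expires.
            if not self._cond.wait_for(lambda: self._version != seen, timeout):
                raise TimeoutError("no new rows arrived in time")
            return list(self._rows)


buffer = SlidingWindowBuffer(max_rows=3)
buffer.add_rows([1, 2])
buffer.add_rows([3, 4, 5])
print(buffer.get_rows())  # [3, 4, 5]
```

With a DataFrame, the same trimming is typically done by concatenating the new rows and keeping only the last `max_rows` of the result.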
- class icos_fl.utils.fetcher.Fetcher(proxy_host='127.0.0.1', dataset='admin')¶
Fetcher for retrieving time series data from DataClay.
This class handles connecting to DataClay, retrieving data through the TimeSeriesData object, and processing it into a format suitable for LSTM model training.
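- Parameters:
proxy_host (str, optional) – Host of the DataClay proxy to connect to (default: '127.0.0.1')
dataset (str, optional) – Name of the DataClay dataset to use (default: 'admin')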
- fetch_data(timeout=200)¶
Fetch data from DataClay and process it for LSTM training.
Retrieves time series data, converting it to a format suitable for the LSTM model with standardized column names and units.
- Parameters:
timeout (int, optional) – Maximum time, in seconds, to wait for data (default: 200)
- Returns:
Processed DataFrame ready for model training
- Return type:
pd.DataFrame
- Raises:
TimeoutError – If no data is available within the timeout period
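Because `fetch_data()` raises `TimeoutError` when no data arrives, callers may want to retry before giving up. The helper below is one possible pattern, not part of ICOS-FL: `fetch_with_retries` and `flaky_fetch` are hypothetical names, and the stub stands in for a call such as `lambda: fetcher.fetch_data(timeout=60)`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retries(fetch: Callable[[], T], attempts: int = 3, pause: float = 0.0) -> T:
    """Call fetch() until it succeeds or the attempts are used up."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except TimeoutError:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the timeout
            time.sleep(pause)  # optional back-off before the next attempt
    raise ValueError("attempts must be at least 1")


calls = []


def flaky_fetch() -> dict:
    # Stub that times out twice before returning data (assumption for the demo).
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("no data yet")
    return {"cpu_usage": [0.4, 0.5]}


result = fetch_with_retries(flaky_fetch, attempts=3)
print(result, len(calls))
```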
Processor Module¶
- class icos_fl.utils.processor.TimeSeriesDataset(df, start_index, population, time_step, metric, device)¶
Dataset for time series prediction with sliding window approach.
Creates sequences of consecutive time steps as inputs and uses the next value as the prediction target.
- Parameters:
df (pd.DataFrame) – Input DataFrame containing the time series data
start_index (int) – Starting index in the DataFrame to create sequences
population (int) – Total number of samples to include from start_index
time_step (int) – Number of time steps (sequence length) for LSTM input
metric (str) – Column name in the DataFrame to use as target
device (torch.device) – PyTorch device to place tensors on
- __getitem__(index)¶
Get a sequence and its target.
- Parameters:
index (int) – Index of the sequence to retrieve
- Returns:
Tuple of (input_sequence, target_value)
- Return type:
Tuple[torch.Tensor, torch.Tensor]
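The indexing behind such a dataset can be illustrated without PyTorch. The sketch below uses plain lists instead of tensors, and `WindowedSeries` is a hypothetical name. It assumes that `population` counts raw rows taken from `start_index`, so the number of sequences is `population - time_step`; the actual class may count differently.

```python
from typing import List, Sequence, Tuple


class WindowedSeries:
    """Hypothetical list-based analogue of a sliding-window dataset."""

    def __init__(self, values: Sequence[float], start_index: int,
                 population: int, time_step: int) -> None:
        # Keep only the slice of the series this dataset is responsible for.
        self.values = list(values[start_index:start_index + population])
        self.time_step = time_step

    def __len__(self) -> int:
        # Each sequence needs time_step inputs plus one following target value.
        return max(len(self.values) - self.time_step, 0)

    def __getitem__(self, index: int) -> Tuple[List[float], float]:
        if not 0 <= index < len(self):
            raise IndexError(index)
        window = self.values[index:index + self.time_step]
        target = self.values[index + self.time_step]
        return window, target


series = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
dataset = WindowedSeries(series, start_index=2, population=6, time_step=3)
print(len(dataset))  # 3
print(dataset[0])    # ([12, 13, 14], 15)
print(dataset[2])    # ([14, 15, 16], 17)
```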
- class icos_fl.utils.processor.Processor(time_step, metric, batch_size=64, train_ratio=0.8, device=None)¶
Processor for time series data preparation in ICOS-FL.
This class handles data preprocessing for time series forecasting, providing methods for data normalization, sequence creation, and DataLoader generation.
- Parameters:
time_step (int) – Number of time steps (sequence length) for LSTM input
metric (str) – Default column name in the DataFrame to use as target
batch_size (int, optional) – Default batch size for DataLoaders (default: 64)
train_ratio (float, optional) – Default fraction of the data assigned to the training set (default: 0.8)
device (Optional[torch.device], optional) – PyTorch device to place tensors on
- create_data_loaders(df, time_step=None, metric=None, batch_size=None, train_ratio=None, device=None)¶
Create DataLoaders for training and validation.
This method handles the complete data preparation pipeline:
1. Normalizes the data
2. Splits into training and validation sets
3. Creates appropriate datasets with sliding window sequences
4. Wraps datasets in DataLoaders
- Parameters:
df (pd.DataFrame) – DataFrame containing the time series data
time_step (Optional[int], optional) – Sequence length (uses instance default if None)
metric (Optional[str], optional) – Column name to use as target (uses instance default if None)
batch_size (Optional[int], optional) – Batch size for DataLoaders (uses instance default if None)
train_ratio (Optional[float], optional) – Ratio of data for training (uses instance default if None)
device (Optional[torch.device], optional) – PyTorch device (uses instance default if None)
- Returns:
Tuple of (train_dataloader, val_dataloader, train_dataset, val_dataset)
- Return type:
Tuple[DataLoader, DataLoader, TimeSeriesDataset, TimeSeriesDataset]
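The "uses instance default if None" behaviour of the optional arguments can be sketched as follows. `LoaderSettings` and `resolve` are hypothetical names used only for illustration; how `Processor` resolves its defaults internally is not shown in this documentation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LoaderSettings:
    """Hypothetical container for the instance-level defaults."""

    time_step: int
    metric: str
    batch_size: int = 64
    train_ratio: float = 0.8

    def resolve(self, **overrides) -> "LoaderSettings":
        # Keep only the overrides that were actually supplied (not None).
        supplied = {key: value for key, value in overrides.items() if value is not None}
        # replace() copies the instance and swaps in the supplied fields.
        return replace(self, **supplied)


defaults = LoaderSettings(time_step=10, metric="cpu_usage")
effective = defaults.resolve(batch_size=32, train_ratio=None)
print(effective.batch_size, effective.train_ratio, effective.time_step)
```

Testing against `None` explicitly, rather than relying on truthiness, keeps legitimate falsy overrides such as `0` from being silently replaced by the default.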
- _normalize_data(df)¶
Normalize the dataset using standardization (zero mean, unit variance).
- Parameters:
df (pd.DataFrame) – Input DataFrame containing the time series data
- Returns:
Normalized DataFrame with the same structure
- Return type:
pd.DataFrame
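Standardization subtracts the mean and divides by the standard deviation of each column. The sketch below applies it to a single list of values. `standardize` is a hypothetical helper, and the use of the population standard deviation and the handling of constant columns are assumptions, not confirmed behaviour of `_normalize_data`.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def standardize(values: Sequence[float]) -> List[float]:
    """Return values rescaled to zero mean and unit variance."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation (assumption)
    if std == 0:
        # A constant column has no spread; map it to zeros instead of dividing by 0.
        return [0.0 for _ in values]
    return [(value - mean) / std for value in values]


scaled = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(scaled)
```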
- _train_test_split(df, train_ratio)¶
Split the dataset into training and testing sets.
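For time series, a split by ratio usually keeps the rows in chronological order so that the later portion is never seen during training. The sketch below assumes that behaviour; `chronological_split` is a hypothetical helper, and whether `_train_test_split` shuffles is not stated in this documentation.

```python
from typing import List, Sequence, Tuple


def chronological_split(rows: Sequence, train_ratio: float) -> Tuple[List, List]:
    """Split rows into an earlier training part and a later testing part."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be between 0 and 1 (exclusive)")
    cut = int(len(rows) * train_ratio)  # index of the first testing row
    return list(rows[:cut]), list(rows[cut:])


train_rows, test_rows = chronological_split(list(range(10)), train_ratio=0.8)
print(train_rows)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(test_rows)   # [8, 9]
```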
Bridge Configuration¶
- class icos_fl.utils.fetcher.ResourceConfiguration(name, rules=None, metric_names=None)¶
Hold the configuration for a resource, including the rules to match it.
The rules are given as a list of tuples, where each tuple contains the key to match, a function that compares values, and the value to match against.
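- Parameters:
name (str) – Name of the resource
rules (list of tuples, optional) – Rules used to match the resource (default: None)
metric_names (list of str, optional) – Names of the metrics to collect for the resource (default: None)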
- add_metric(metric_name)¶
Add a metric to collect for this resource.
- Parameters:
metric_name (str) – Name of the metric to collect
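One way to read the rule format is sketched below: each rule is a `(key, match_function, value)` tuple, and a resource matches when every rule holds. The `matches` and `regex_match` helpers, the `(actual, expected)` argument order, and the example label names are assumptions for illustration, not the library's API.

```python
import operator
import re
from typing import Any, Callable, Dict, List, Tuple

# (key to look up, function comparing actual vs expected, expected value)
Rule = Tuple[str, Callable[[Any, Any], bool], Any]


def regex_match(actual: Any, pattern: str) -> bool:
    """Match function that treats the expected value as a regular expression."""
    return re.fullmatch(pattern, str(actual)) is not None


def matches(labels: Dict[str, Any], rules: List[Rule]) -> bool:
    """Return True when every rule's key is present and its match function passes."""
    return all(
        key in labels and match(labels[key], expected)
        for key, match, expected in rules
    )


rules = [
    ("job", operator.eq, "node-exporter"),
    ("instance", regex_match, r"worker-\d+"),
]
print(matches({"job": "node-exporter", "instance": "worker-7"}, rules))  # True
print(matches({"job": "node-exporter", "instance": "master"}, rules))    # False
```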
- class icos_fl.utils.fetcher.BridgeConfiguration¶
Aggregate the configuration for the bridge.
This class holds the configuration for the bridge, including the resource configuration objects. It also holds the time-to-live for the dataframes.
- set_res_config(rc)¶
Set a resource configuration.
- Parameters:
rc (ResourceConfiguration) – Resource configuration to set
Utility Classes¶
- class icos_fl.utils.singleton.Singleton¶
A metaclass to make a class a singleton.
Usage:
class MySingletonClass(metaclass=Singleton): ...
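A singleton metaclass is commonly written by overriding `__call__` so that the first instance of each class is cached and returned on every later call. The sketch below shows that common pattern under the hypothetical names `SingletonMeta` and `Settings`; it is not necessarily how `icos_fl.utils.singleton.Singleton` is implemented.

```python
class SingletonMeta(type):
    """Metaclass that caches one instance per class (common pattern, sketch only)."""

    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Build the instance on the first call; later calls reuse it
        # and their arguments are ignored.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class Settings(metaclass=SingletonMeta):
    def __init__(self, name: str = "default") -> None:
        self.name = name


first = Settings("first")
second = Settings("second")
print(first is second)  # True
print(second.name)      # first
```

Note that this simple version is not thread-safe: two threads calling the class at the same moment could each create an instance unless the check is guarded by a lock.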
Colors Module¶
The icos_fl.utils.colors module provides color constants and utility functions for terminal output:
- icos_fl.utils.colors.paint(color, text, reset=RST)¶
Apply ANSI color codes to text.
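Coloring terminal text with ANSI codes amounts to wrapping the text between a color escape sequence and a reset sequence. The sketch below uses standard ANSI escape values, but the constant names `RESET` and `BRIGHT_GREEN` and the helper `paint_text` are hypothetical; the actual values behind constants such as `RST` or `BGRN` are not given in this documentation.

```python
RESET = "\033[0m"          # standard ANSI "reset all attributes" sequence
BRIGHT_GREEN = "\033[92m"  # standard ANSI bright green foreground


def paint_text(color: str, text: str, reset: str = RESET) -> str:
    """Wrap text in a color code followed by a reset code."""
    return f"{color}{text}{reset}"


colored = paint_text(BRIGHT_GREEN, "ok")
print(colored)        # shows "ok" in green on ANSI-capable terminals
print(repr(colored))  # '\x1b[92mok\x1b[0m'
```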
Logo Module¶
The icos_fl.utils.logo module provides ASCII art and banner functions:
- icos_fl.utils.logo.print_banner(logo, title='', message='', border_color=BCYA, logo_color=BBLU, title_color=BWHT, message_color=BGRN, ver=None, show_version=True)¶
Displays a customizable banner with logo and optional text.
- Parameters:
logo (str) – The ASCII art logo to display
title (str, optional) – Optional title to display above the logo (default: '')
message (str, optional) – Optional message to display below the logo (default: '')
border_color (str, optional) – ANSI color code for the border (default: BCYA)
logo_color (str, optional) – ANSI color code for the logo (default: BBLU)
title_color (str, optional) – ANSI color code for the title (default: BWHT)
message_color (str, optional) – ANSI color code for the message (default: BGRN)
ver (Optional[str], optional) – Optional version string (uses icos_fl.version if None)
show_version (bool, optional) – Whether to display version information (default: True)
Example Usage¶
from icos_fl.utils.fetcher import Fetcher
from icos_fl.utils.processor import Processor
import torch

# Connect to DataClay and fetch data
fetcher = Fetcher(proxy_host="127.0.0.1", dataset="admin")
df = fetcher.fetch_data(timeout=60)

# Process data for LSTM training
processor = Processor(
    time_step=10,
    metric="cpu_usage",
    batch_size=64,
    train_ratio=0.8,
    device=torch.device("cpu"),
)

# Create DataLoaders
train_loader, val_loader, _, _ = processor.create_data_loaders(df)