Utils

This page documents the utility components of ICOS-FL, including the data processing, fetching, and visualization utilities.

Fetcher Module

class icos_fl.utils.fetcher.TimeSeriesData(max_rows=300)

Class for managing time series data with a sliding window approach.

Parameters:

max_rows (int, optional) – Maximum number of rows to keep in the DataFrame. Default is 300, which at a 3-second collection interval covers 15 minutes of history, enough for LSTM training and a 5-minute prediction.

add_dataframe(df)

Add new data to the unified dataframe, maintaining the sliding window.

When new data is added, the oldest data points are removed if the total size exceeds max_rows.

Parameters:

df (pd.DataFrame) – New DataFrame to append
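
The sliding-window behaviour can be illustrated without pandas. The sketch below is not the ICOS-FL implementation: SlidingWindow, add_rows, and get_rows are hypothetical names, and rows are plain dicts held in a collections.deque, whose maxlen argument discards the oldest entries automatically. With 300 rows and one sample every 3 seconds, the window spans 300 × 3 = 900 seconds.

```python
from collections import deque


class SlidingWindow:
    """Hypothetical stand-in for TimeSeriesData; rows are plain dicts."""

    def __init__(self, max_rows=300):
        # A deque with maxlen drops items from the left once it is full
        self._rows = deque(maxlen=max_rows)

    def add_rows(self, rows):
        # Append new rows; the oldest ones fall out of the window
        self._rows.extend(rows)

    def get_rows(self):
        # Mirror get_dataframe(): None until something has been added
        return list(self._rows) if self._rows else None


window = SlidingWindow(max_rows=300)
window.add_rows({"t": 3 * i, "cpu_usage": 0.5} for i in range(350))
rows = window.get_rows()
# 350 rows were added, so the 50 oldest ones are gone
```

With pandas, a similar effect is commonly obtained with pd.concat([old, new]).tail(max_rows); whether ICOS-FL does exactly this is not stated on this page.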

get_dataframe()

Get the current unified DataFrame.

Returns:

The current DataFrame, or None if no data has been added yet.

Return type:

Optional[pd.DataFrame]

wait_for_dataframe()

Wait for new data to be added to the DataFrame.

This method blocks until new data is added through add_dataframe().

Returns:

The updated DataFrame after new data has been added.

Return type:

pd.DataFrame
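
A blocking wait of this kind is often built on threading.Condition. The following is a minimal sketch under that assumption, not the actual ICOS-FL code: WaitableBuffer and its methods are hypothetical, and the timeout argument is an addition of the sketch (the documented wait_for_dataframe() takes no arguments).

```python
import threading


class WaitableBuffer:
    """Hypothetical buffer whose readers can block until new data arrives."""

    def __init__(self):
        self._cond = threading.Condition()
        self._rows = []
        self._version = 0  # incremented on every add

    def add_rows(self, rows):
        with self._cond:
            self._rows.extend(rows)
            self._version += 1
            self._cond.notify_all()  # wake every waiting reader

    def wait_for_rows(self, timeout=None):
        with self._cond:
            seen = self._version
            # wait_for re-checks the predicate, so spurious wakeups are harmless
            if not self._cond.wait_for(lambda: self._version != seen, timeout):
                raise TimeoutError("no new data arrived in time")
            return list(self._rows)


buffer = WaitableBuffer()
# Simulate a producer that delivers one row shortly after the reader starts waiting
producer = threading.Timer(0.1, buffer.add_rows, args=([{"cpu_usage": 0.4}],))
producer.start()
latest = buffer.wait_for_rows(timeout=5)
producer.join()
```

The version counter lets a reader distinguish "data added since I started waiting" from data that was already present.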

class icos_fl.utils.fetcher.Fetcher(proxy_host='127.0.0.1', dataset='admin')

Fetcher for retrieving time series data from DataClay.

This class handles connecting to DataClay, retrieving data through the TimeSeriesData object, and processing it into a format suitable for LSTM model training.

Parameters:
  • proxy_host (str, optional) – Host address for the DataClay proxy

  • dataset (str, optional) – Dataset name to connect to

fetch_data(timeout=200)

Fetch data from DataClay and process it for LSTM training.

Retrieves time series data and converts it to a format suitable for the LSTM model, with standardized column names and units.

Parameters:

timeout (int, optional) – Timeout in seconds for waiting for data

Returns:

Processed DataFrame ready for model training

Return type:

pd.DataFrame

Raises:

TimeoutError – If no data is available within the timeout period
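
Because fetch_data() raises TimeoutError when no data arrives, callers may want to retry. The helper below is a sketch of one way to do that; fetch_with_retry and its parameters are hypothetical and not part of ICOS-FL. It accepts any callable that takes a timeout keyword, so it can be tried without a DataClay connection.

```python
import time


def fetch_with_retry(fetch, attempts=3, timeout=60, pause=0.0):
    """Call fetch(timeout=...) until it succeeds or attempts run out."""
    last_error = None
    for _ in range(attempts):
        try:
            return fetch(timeout=timeout)
        except TimeoutError as exc:
            last_error = exc
            time.sleep(pause)  # give the data source time to catch up
    raise TimeoutError(f"no data after {attempts} attempts") from last_error
```

Assuming a Fetcher instance named fetcher, a call would look like fetch_with_retry(fetcher.fetch_data, attempts=3, timeout=60).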

Processor Module

class icos_fl.utils.processor.TimeSeriesDataset(df, start_index, population, time_step, metric, device)

Dataset for time series prediction with sliding window approach.

Creates sequences of consecutive time steps as inputs and uses the next value as the prediction target.

Parameters:
  • df (pd.DataFrame) – Input DataFrame containing the time series data

  • start_index (int) – Starting index in the DataFrame to create sequences

  • population (int) – Total number of samples to include from start_index

  • time_step (int) – Number of time steps (sequence length) for LSTM input

  • metric (str) – Column name in the DataFrame to use as target

  • device (torch.device) – PyTorch device to place tensors on

__getitem__(index)

Get a sequence and its target.

Parameters:

index (int) – Index of the sequence to retrieve

Returns:

Tuple of (input_sequence, target_value)

Return type:

Tuple[torch.Tensor, torch.Tensor]

__len__()

Return the number of sequences in the dataset.

Returns:

Number of sequences

Return type:

int
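
To make the sliding-window idea concrete, the sketch below builds (input, target) pairs from a plain list. It is an illustration only: make_sequences is a hypothetical function, it uses lists rather than tensors, and the assumption that a segment of population values yields population - time_step sequences is not confirmed by this page.

```python
def make_sequences(values, start_index, population, time_step):
    """Return (input_window, target) pairs from a list of numbers."""
    segment = values[start_index:start_index + population]
    pairs = []
    # Each pair needs time_step inputs plus one following value as the target
    for i in range(len(segment) - time_step):
        window = segment[i:i + time_step]
        target = segment[i + time_step]
        pairs.append((window, target))
    return pairs


series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
pairs = make_sequences(series, start_index=2, population=6, time_step=3)
# The segment is [2.0 .. 7.0]; the first pair is ([2.0, 3.0, 4.0], 5.0)
```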

class icos_fl.utils.processor.Processor(time_step, metric, batch_size=64, train_ratio=0.8, device=None)

Processor for time series data preparation in ICOS-FL.

This class handles data preprocessing for time series forecasting, providing methods for data normalization, sequence creation, and DataLoader generation.

Parameters:
  • time_step (int) – Number of time steps (sequence length) for LSTM input

  • metric (str) – Default column name in the DataFrame to use as target

  • batch_size (int, optional) – Default batch size for DataLoaders

  • train_ratio (float, optional) – Default ratio for train/test split

  • device (Optional[torch.device], optional) – PyTorch device to place tensors on

create_data_loaders(df, time_step=None, metric=None, batch_size=None, train_ratio=None, device=None)

Create DataLoaders for training and validation.

This method handles the complete data preparation pipeline:

  1. Normalizes the data

  2. Splits into training and validation sets

  3. Creates appropriate datasets with sliding window sequences

  4. Wraps datasets in DataLoaders

Parameters:
  • df (pd.DataFrame) – DataFrame containing the time series data

  • time_step (Optional[int], optional) – Sequence length (uses instance default if None)

  • metric (Optional[str], optional) – Column name to use as target (uses instance default if None)

  • batch_size (Optional[int], optional) – Batch size for DataLoaders (uses instance default if None)

  • train_ratio (Optional[float], optional) – Ratio of data for training (uses instance default if None)

  • device (Optional[torch.device], optional) – PyTorch device (uses instance default if None)

Returns:

Tuple of (train_dataloader, val_dataloader, train_dataset, val_dataset)

Return type:

Tuple[DataLoader, DataLoader, TimeSeriesDataset, TimeSeriesDataset]
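
The "uses instance default if None" convention can be sketched as follows. LoaderDefaults and resolve are hypothetical names used only for illustration; the point is that arguments are compared against None explicitly, so a caller can still pass a falsy value such as 0 on purpose.

```python
class LoaderDefaults:
    """Hypothetical holder for instance-level defaults."""

    def __init__(self, time_step, batch_size=64, train_ratio=0.8):
        self.time_step = time_step
        self.batch_size = batch_size
        self.train_ratio = train_ratio

    def resolve(self, time_step=None, batch_size=None, train_ratio=None):
        # Fall back to the instance value only when the argument is None
        return {
            "time_step": self.time_step if time_step is None else time_step,
            "batch_size": self.batch_size if batch_size is None else batch_size,
            "train_ratio": self.train_ratio if train_ratio is None else train_ratio,
        }


defaults = LoaderDefaults(time_step=10)
settings = defaults.resolve(batch_size=32)  # only batch_size is overridden
```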

_normalize_data(df)

Normalize the dataset using standardization (zero mean, unit variance).

Parameters:

df (pd.DataFrame) – Input DataFrame containing the time series data

Returns:

Normalized DataFrame with the same structure

Return type:

pd.DataFrame
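
Standardization replaces each value x with (x - mean) / std. The sketch below applies it to a single column held in a list; standardize is a hypothetical helper, and the use of the population standard deviation and the handling of constant columns are assumptions rather than documented ICOS-FL behaviour.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Scale a list of numbers to zero mean and unit variance."""
    mean = fmean(values)
    std = pstdev(values, mu=mean)  # population standard deviation
    if std == 0:
        # A constant column has no spread; map it to zeros instead of dividing by 0
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Mean is 5.0 and population standard deviation is 2.0 for this column
scaled = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```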

_train_test_split(df, train_ratio)

Split the dataset into training and testing sets.

Parameters:
  • df (pd.DataFrame) – Input DataFrame containing the time series data

  • train_ratio (float) – Ratio for splitting data into train and test sets (0-1)

Returns:

Tuple containing the number of training and testing samples

Return type:

Tuple[int, int]
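
A split that returns sample counts rather than data can be sketched as below. split_counts is a hypothetical helper; truncating the training count toward zero and giving the remainder to the test set is an assumption about the rounding, not confirmed behaviour.

```python
def split_counts(n_rows, train_ratio):
    """Return (n_train, n_test) for a split by row count."""
    if not 0.0 <= train_ratio <= 1.0:
        raise ValueError("train_ratio must be between 0 and 1")
    n_train = int(n_rows * train_ratio)  # truncate toward zero
    return n_train, n_rows - n_train


n_train, n_test = split_counts(300, 0.8)
```

For time series, the first n_train rows are typically used for training and the remaining rows for testing, so that the model is never evaluated on data older than what it was trained on.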

Bridge Configuration

class icos_fl.utils.fetcher.ResourceConfiguration(name, rules=None, metric_names=None)

Hold the configuration for a resource, including the rules to match it.

Rules are given as a list of tuples. Each tuple contains the key to look up in the resource, a function that compares values, and the value to compare against.

Parameters:
  • name (str) – Name of the resource configuration

  • rules (Optional[list[MatchRule]], optional) – List of match rules for the resource, defaults to None

  • metric_names (Optional[set[str]], optional) – Set of metric names to collect, defaults to None

add_metric(metric_name)

Add a metric to collect for this resource.

Parameters:

metric_name (str) – Name of the metric to collect

remove_metric(metric_name)

Remove a metric from collection.

Parameters:

metric_name (str) – Name of the metric to remove

match(resource_kvs)

Check if this configuration matches the given resource key-value pairs.

Parameters:

resource_kvs (dict[str, str]) – Dictionary of resource key-value pairs

Returns:

True if matches, False otherwise

Return type:

bool
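
The rule format can be illustrated with a small sketch. Everything here is an assumption made for illustration: the argument order of the comparison function (actual value first, expected value second), the requirement that all rules hold, and the key names in the example.

```python
import operator
from typing import Callable

# Assumed shape of a rule: (key, comparison function, expected value)
MatchRule = tuple[str, Callable[[str, str], bool], str]


def matches(rules: list[MatchRule], resource_kvs: dict[str, str]) -> bool:
    """Return True when every rule holds for the resource."""
    for key, compare, expected in rules:
        if key not in resource_kvs:
            return False  # a missing key cannot satisfy the rule
        if not compare(resource_kvs[key], expected):
            return False
    return True


rules = [
    ("service.name", operator.eq, "node-exporter"),  # exact match
    ("host.name", str.startswith, "worker-"),        # prefix match
]
resource = {"service.name": "node-exporter", "host.name": "worker-3"}
is_match = matches(rules, resource)
```

Passing the comparison as a function keeps exact, prefix, and custom matches in one uniform structure. Note that in this sketch an empty rule list matches every resource.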

class icos_fl.utils.fetcher.BridgeConfiguration

Aggregate the configuration for the bridge.

This class holds the configuration for the bridge, including the resource configuration objects. It also holds the time-to-live for the dataframes.

set_res_config(rc)

Set a resource configuration.

Parameters:

rc (ResourceConfiguration) – Resource configuration to set

remove_res_config(name)

Remove a resource configuration by name.

Parameters:

name (str) – Name of the resource configuration to remove

get_matching_res_configs(resource_kvs)

Get all resource configurations that match the given resource.

Parameters:

resource_kvs (dict[str, str]) – Dictionary of resource key-value pairs

Returns:

List of matching resource configurations

Return type:

list[ResourceConfiguration]
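
The three methods above suggest a registry keyed by configuration name. The sketch below shows that idea with hypothetical stand-in classes (ResourceConfig and Registry); replacing an entry with the same name and silently ignoring unknown names on removal are assumptions.

```python
class ResourceConfig:
    """Hypothetical stand-in: matches when all required pairs are present."""

    def __init__(self, name, required):
        self.name = name
        self.required = required

    def match(self, resource_kvs):
        return all(resource_kvs.get(k) == v for k, v in self.required.items())


class Registry:
    """Hypothetical stand-in for BridgeConfiguration."""

    def __init__(self):
        self._configs = {}  # name -> configuration

    def set_res_config(self, rc):
        self._configs[rc.name] = rc  # same name replaces the earlier entry

    def remove_res_config(self, name):
        self._configs.pop(name, None)  # unknown names are ignored

    def get_matching_res_configs(self, resource_kvs):
        return [rc for rc in self._configs.values() if rc.match(resource_kvs)]


registry = Registry()
registry.set_res_config(ResourceConfig("workers", {"role": "worker"}))
registry.set_res_config(ResourceConfig("eu", {"region": "eu"}))
found = registry.get_matching_res_configs({"role": "worker", "region": "eu"})
```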

Utility Classes

class icos_fl.utils.singleton.Singleton

A metaclass to make a class a singleton.

Usage:

class MySingletonClass(metaclass=Singleton):
    ...
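
For readers unfamiliar with the pattern, a singleton metaclass is commonly written as shown below. This is a generic sketch, not necessarily how icos_fl.utils.singleton implements it; Settings is a hypothetical example class, and the sketch is not thread-safe.

```python
class Singleton(type):
    """Metaclass that returns one shared instance per class."""

    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Build the instance on first use, then keep returning it
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class Settings(metaclass=Singleton):
    def __init__(self, value=0):
        self.value = value


first = Settings(value=1)
second = Settings(value=2)  # arguments are ignored; the first instance is reused
```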

Colors Module

The icos_fl.utils.colors module provides color constants and utility functions for terminal output:

icos_fl.utils.colors.paint(color, text, reset=RST)

Apply ANSI color codes to text.

Parameters:
  • color (str) – ANSI color code to apply

  • text (str) – Text string to color

  • reset (str, optional) – ANSI code to apply after text (default: RST)

Returns:

Colored text string with reset code appended

Return type:

str
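
The described behaviour amounts to wrapping the text between a color code and a reset code. The sketch below reproduces it locally; the escape sequences assigned to RED and RST are standard ANSI codes assumed for illustration, and the actual constants in icos_fl.utils.colors may be named or defined differently.

```python
RED = "\033[31m"  # assumed value: standard ANSI red foreground
RST = "\033[0m"   # assumed value: standard ANSI reset


def paint(color, text, reset=RST):
    """Wrap text in a color code and a trailing reset code."""
    return f"{color}{text}{reset}"


warning = paint(RED, "connection lost")
print(warning)  # shown in red on terminals that support ANSI colors
```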

Logo Module

The icos_fl.utils.logo module provides ASCII art and banner functions:

icos_fl.utils.logo.print_banner(logo, title='', message='', border_color=BCYA, logo_color=BBLU, title_color=BWHT, message_color=BGRN, ver=None, show_version=True)

Display a customizable banner with a logo and optional text.

Parameters:
  • logo (str) – The ASCII art logo to display

  • title (str, optional) – Optional title to display above the logo (default: '')

  • message (str, optional) – Optional message to display below the logo (default: '')

  • border_color (str, optional) – ANSI color code for the border (default: BCYA)

  • logo_color (str, optional) – ANSI color code for the logo (default: BBLU)

  • title_color (str, optional) – ANSI color code for the title (default: BWHT)

  • message_color (str, optional) – ANSI color code for the message (default: BGRN)

  • ver (Optional[str], optional) – Optional version string (uses icos_fl.version if None)

  • show_version (bool, optional) – Whether to display version information (default: True)

Example Usage

from icos_fl.utils.fetcher import Fetcher
from icos_fl.utils.processor import Processor
import torch

# Connect to DataClay and fetch data
fetcher = Fetcher(proxy_host="127.0.0.1", dataset="admin")
df = fetcher.fetch_data(timeout=60)

# Process data for LSTM training
processor = Processor(
    time_step=10,
    metric="cpu_usage",
    batch_size=64,
    train_ratio=0.8,
    device=torch.device("cpu")
)

# Create DataLoaders
train_loader, val_loader, _, _ = processor.create_data_loaders(df)