transform

This module provides base classes for transforms.

Pre-processing of data, feature engineering, and augmentation are fundamental processes in machine learning. AGenC generalizes these processes under the term transforms. This page introduces the concept of transforms and demonstrates how to use them within AGenC.

Nomenclature

Transforms are a set of operations that modify data. They can include operations such as data normalization, dimensionality reduction, data augmentation, and much more. These transformations are essential for preparing data for machine learning tasks and improving model performance.

In AGenC, we use the generalized term transform for all types of pre-processing of data, feature engineering, and data augmentation, as they all involve the same fundamental concept of transforming data to obtain a modified dataset.

AGenC provides a flexible and unified interface for applying transforms to data. The framework lets you combine these transformation steps as needed.

Using Transforms

Here's a basic example:

from flowcean.transforms import Select, Standardize

# Load the dataset
dataset = ...

# Define transforms by chaining a selection and a standardization
transforms = Select(features=["reference", "temperature"]) | Standardize(
    mean={
        "reference": 0.0,
        "temperature": 0.0,
    },
    std={
        "reference": 1.0,
        "temperature": 1.0,
    },
)

# Apply the transforms to data
transformed_data = transforms(dataset)

`Transform`

Bases: ABC

Base class for all transforms.

`apply(data)` `abstractmethod`

Apply the transform to data.

Parameters:

Name	Type	Description	Default
`data`	`Data`	The data to transform.	required

Returns:

Type	Description
`Data`	The transformed data.

Source code in src/flowcean/core/transform.py

@abstractmethod
def apply(self, data: Data) -> Data:
    """Apply the transform to data.

    Args:
        data: The data to transform.

    Returns:
        The transformed data.
    """
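
To illustrate how a concrete transform plugs into this interface, here is a minimal sketch. The `Transform` class below is a local stand-in that mirrors the `apply` and `__call__` methods documented on this page, so the snippet runs without the library installed. `Clip` and the use of a plain list of floats as `Data` are assumptions made for illustration and are not part of the module.

```python
from abc import ABC, abstractmethod

# Assumption: a plain list of floats stands in for the library's Data type.
Data = list[float]


class Transform(ABC):
    """Local stand-in mirroring the documented base class."""

    @abstractmethod
    def apply(self, data: Data) -> Data:
        """Apply the transform to data."""

    def __call__(self, data: Data) -> Data:
        # Calling the transform delegates to apply, as in the documented source.
        return self.apply(data)


class Clip(Transform):
    """Hypothetical transform that limits every value to [lower, upper]."""

    def __init__(self, lower: float, upper: float) -> None:
        self.lower = lower
        self.upper = upper

    def apply(self, data: Data) -> Data:
        return [min(max(x, self.lower), self.upper) for x in data]


clip = Clip(lower=0.0, upper=1.0)
clipped = clip([-0.5, 0.25, 1.5])  # same as clip.apply([...])
```

Because `apply` is abstract, a subclass that does not implement it cannot be instantiated, which catches incomplete transforms early.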

`__call__(data)`

Apply the transform to data.

Parameters:

Name	Type	Description	Default
`data`	`Data`	The data to transform.	required

Returns:

Type	Description
`Data`	The transformed data.

Source code in src/flowcean/core/transform.py

def __call__(self, data: Data) -> Data:
    """Apply the transform to data.

    Args:
        data: The data to transform.

    Returns:
        The transformed data.
    """
    return self.apply(data)

`chain(other)`

Chain this transform with other transforms.

This can be used to chain multiple transforms together. Chained transforms are applied left to right.

Example

chained_transform = TransformA().chain(TransformB())

Parameters:

Name	Type	Description	Default
`other`	`Transform`	The transforms to chain.	required

Returns:

Type	Description
`Transform`	A new Chain transform.

Source code in src/flowcean/core/transform.py

def chain(
    self,
    other: Transform,
) -> Transform:
    """Chain this transform with other transforms.

    This can be used to chain multiple transforms together.
    Chained transforms are applied left to right.

    Example:
        ```python
        chained_transform = TransformA().chain(TransformB())
        ```

    Args:
        other: The transforms to chain.

    Returns:
        A new Chain transform.
    """
    return ChainedTransforms(self, other)
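
The left-to-right ordering can be checked with two transforms that do not commute. The sketch below uses local stand-ins for `Transform` and `ChainedTransforms`. The `apply` method of `ChainedTransforms` is not shown on this page, so the loop here is an assumption about its behavior; `AddOne` and `Double` are hypothetical transforms.

```python
from abc import ABC, abstractmethod

# Assumption: a plain list of floats stands in for the library's Data type.
Data = list[float]


class Transform(ABC):
    """Local stand-in mirroring the documented base class."""

    @abstractmethod
    def apply(self, data: Data) -> Data:
        """Apply the transform to data."""

    def __call__(self, data: Data) -> Data:
        return self.apply(data)

    def chain(self, other: "Transform") -> "Transform":
        return ChainedTransforms(self, other)

    def __or__(self, other: "Transform") -> "Transform":
        return self.chain(other)


class ChainedTransforms(Transform):
    def __init__(self, *transforms: Transform) -> None:
        self.transforms = transforms

    def apply(self, data: Data) -> Data:
        # Assumed behavior: feed the output of each transform into the next.
        for transform in self.transforms:
            data = transform.apply(data)
        return data


class AddOne(Transform):
    def apply(self, data: Data) -> Data:
        return [x + 1.0 for x in data]


class Double(Transform):
    def apply(self, data: Data) -> Data:
        return [x * 2.0 for x in data]


add_then_double = AddOne().chain(Double())
double_then_add = Double() | AddOne()

result_a = add_then_double([1.0, 2.0])  # (x + 1) * 2
result_b = double_then_add([1.0, 2.0])  # (x * 2) + 1
```

Under these assumptions, swapping the operands changes the result, so the order in which transforms are chained matters.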

`__or__(other)`

Shorthand for chaining transforms.

Example

chained_transform = TransformA() | TransformB()

Parameters:

Name	Type	Description	Default
`other`	`Transform`	The transform to chain.	required

Returns:

Type	Description
`Transform`	A new Chain transform.

Source code in src/flowcean/core/transform.py

def __or__(
    self,
    other: Transform,
) -> Transform:
    """Shorthand for chaining transforms.

    Example:
        ```python
        chained_transform = TransformA() | TransformB()
        ```

    Args:
        other: The transform to chain.

    Returns:
        A new Chain transform.
    """
    return self.chain(other)

`inverse()`

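Get the inverse of the transform. The base implementation raises `NotImplementedError`, so only transforms that override this method provide an inverse.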

Returns:

Type	Description
`Transform`	The inverse of the transform.

Source code in src/flowcean/core/transform.py

def inverse(self) -> Transform:
    """Get the inverse of the transform.

    Returns:
        The inverse of the transform.
    """
    raise NotImplementedError
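
As a sketch of how an invertible transform might override this method, the example below defines a hypothetical `Scale` transform whose inverse divides by the same factor. `Transform` is again a local stand-in that mirrors the documented base class, and `Data` as a list of floats is an assumption.

```python
from abc import ABC, abstractmethod

# Assumption: a plain list of floats stands in for the library's Data type.
Data = list[float]


class Transform(ABC):
    """Local stand-in mirroring the documented base class."""

    @abstractmethod
    def apply(self, data: Data) -> Data:
        """Apply the transform to data."""

    def __call__(self, data: Data) -> Data:
        return self.apply(data)

    def inverse(self) -> "Transform":
        # Same default as the documented source: no inverse unless overridden.
        raise NotImplementedError


class Scale(Transform):
    """Hypothetical transform that multiplies every value by a factor."""

    def __init__(self, factor: float) -> None:
        if factor == 0.0:
            raise ValueError("a zero factor cannot be inverted")
        self.factor = factor

    def apply(self, data: Data) -> Data:
        return [x * self.factor for x in data]

    def inverse(self) -> "Scale":
        # Scaling by 1 / factor undoes scaling by factor.
        return Scale(1.0 / self.factor)


scale = Scale(4.0)
scaled = scale([1.0, 2.5])
restored = scale.inverse()(scaled)
```

Returning a new transform from `inverse` keeps the original unchanged, so both directions can be used side by side.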

`FitOnce`

Bases: ABC

A mixin for transforms that need to be fitted to data once.

`fit(data)` `abstractmethod`

Fit to the data.

Parameters:

Name	Type	Description	Default
`data`	`Data`	The data to fit to.	required

Source code in src/flowcean/core/transform.py

@abstractmethod
def fit(self, data: Data) -> None:
    """Fit to the data.

    Args:
        data: The data to fit to.
    """
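
A transform that learns its parameters from data can combine `Transform` with this mixin. The sketch below uses local stand-ins for both classes; `Center`, which subtracts a mean learned in `fit`, is a hypothetical example, and `Data` as a list of floats is an assumption.

```python
import statistics
from abc import ABC, abstractmethod

# Assumption: a plain list of floats stands in for the library's Data type.
Data = list[float]


class Transform(ABC):
    """Local stand-in mirroring the documented base class."""

    @abstractmethod
    def apply(self, data: Data) -> Data:
        """Apply the transform to data."""

    def __call__(self, data: Data) -> Data:
        return self.apply(data)


class FitOnce(ABC):
    """Local stand-in mirroring the documented mixin."""

    @abstractmethod
    def fit(self, data: Data) -> None:
        """Fit to the data."""


class Center(Transform, FitOnce):
    """Hypothetical transform that subtracts a mean learned from data."""

    def __init__(self) -> None:
        self.mean = 0.0

    def fit(self, data: Data) -> None:
        # Learn the parameter from the full dataset in one pass.
        self.mean = statistics.fmean(data)

    def apply(self, data: Data) -> Data:
        return [x - self.mean for x in data]


center = Center()
center.fit([1.0, 2.0, 3.0])
centered = center([1.0, 2.0, 3.0])
```

Keeping `fit` separate from `apply` lets the same fitted parameters be reused on data that was not seen during fitting.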

`FitIncremetally`

Bases: ABC

A mixin for transforms that need to be fitted to data incrementally.

`fit_incremental(data)` `abstractmethod`

Fit to the data incrementally.

Parameters:

Name	Type	Description	Default
`data`	`Data`	The data to fit to.	required

Source code in src/flowcean/core/transform.py

@abstractmethod
def fit_incremental(self, data: Data) -> None:
    """Fit to the data incrementally.

    Args:
        data: The data to fit to.
    """
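
Incremental fitting suits data that arrives in batches. The sketch below keeps a running mean that is updated on every call to `fit_incremental`. `Transform` and `FitIncremetally` are local stand-ins for the documented classes, while `RunningCenter` and the list-of-floats `Data` type are assumptions made for illustration.

```python
from abc import ABC, abstractmethod

# Assumption: a plain list of floats stands in for the library's Data type.
Data = list[float]


class Transform(ABC):
    """Local stand-in mirroring the documented base class."""

    @abstractmethod
    def apply(self, data: Data) -> Data:
        """Apply the transform to data."""

    def __call__(self, data: Data) -> Data:
        return self.apply(data)


class FitIncremetally(ABC):
    """Local stand-in mirroring the documented mixin (name spelled as in the source)."""

    @abstractmethod
    def fit_incremental(self, data: Data) -> None:
        """Fit to the data incrementally."""


class RunningCenter(Transform, FitIncremetally):
    """Hypothetical transform that subtracts a running mean."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def fit_incremental(self, data: Data) -> None:
        # Update the mean one sample at a time without storing past batches.
        for x in data:
            self.count += 1
            self.mean += (x - self.mean) / self.count

    def apply(self, data: Data) -> Data:
        return [x - self.mean for x in data]


running = RunningCenter()
for batch in ([1.0, 2.0], [3.0, 6.0]):
    running.fit_incremental(batch)
centered = running([3.0, 4.0])
```

The running update yields the same mean as a single pass over all samples, but needs only the current batch in memory.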

`ChainedTransforms(*transforms)`

Bases: Transform, FitOnce, FitIncremetally

A transform that is a chain of other transforms.

Initialize the chained transforms.

Parameters:

Name	Type	Description	Default
`transforms`	`Transform`	The transforms to chain.	`()`

Source code in src/flowcean/core/transform.py

def __init__(
    self,
    *transforms: Transform,
) -> None:
    """Initialize the chained transforms.

    Args:
        transforms: The transforms to chain.
    """
    self.transforms = transforms

`Identity()`

Bases: Transform

A transform that does nothing.

Initialize the identity transform.

Source code in src/flowcean/core/transform.py

def __init__(self) -> None:
    """Initialize the identity transform."""
    super().__init__()
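
One possible use of a do-nothing transform is as the starting point when a chain is assembled from a list of steps that may be empty. The sketch below relies on local stand-ins; the `apply` methods of `Identity` and `ChainedTransforms` are not shown on this page and are assumptions here, and `Offset` and `build_pipeline` are hypothetical names.

```python
from abc import ABC, abstractmethod
from functools import reduce

# Assumption: a plain list of floats stands in for the library's Data type.
Data = list[float]


class Transform(ABC):
    """Local stand-in mirroring the documented base class."""

    @abstractmethod
    def apply(self, data: Data) -> Data:
        """Apply the transform to data."""

    def __call__(self, data: Data) -> Data:
        return self.apply(data)

    def __or__(self, other: "Transform") -> "Transform":
        return ChainedTransforms(self, other)


class ChainedTransforms(Transform):
    def __init__(self, *transforms: Transform) -> None:
        self.transforms = transforms

    def apply(self, data: Data) -> Data:
        # Assumed behavior: apply the transforms left to right.
        for transform in self.transforms:
            data = transform.apply(data)
        return data


class Identity(Transform):
    def apply(self, data: Data) -> Data:
        # Assumed behavior: return the data unchanged.
        return data


class Offset(Transform):
    """Hypothetical transform that adds a constant to every value."""

    def __init__(self, amount: float) -> None:
        self.amount = amount

    def apply(self, data: Data) -> Data:
        return [x + self.amount for x in data]


def build_pipeline(steps: list[Transform]) -> Transform:
    """Fold the steps into one transform, starting from Identity."""
    return reduce(lambda chained, step: chained | step, steps, Identity())


unchanged = build_pipeline([])([1.0, 2.0])
moved = build_pipeline([Offset(1.0), Offset(2.0)])([1.0, 2.0])
```

With `Identity` as the initial value, an empty list of steps still yields a valid transform, so callers do not need a special case.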
