cluster

Cluster(clusterer, *, cluster_feature_name='cluster_label', features=None)

Bases: Transform

Cluster data using a clustering algorithm.

This transform clusters data using a specified clustering algorithm. The resulting cluster label is added to the DataFrame as a new feature.

Initializes the Cluster transform.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| clusterer | Clusterer | The clustering algorithm to use. | required |
| cluster_feature_name | str | The name of the feature to store the cluster labels. | 'cluster_label' |
| features | Iterable[str] \| None | The features to use for clustering. If None, all features are used. | None |

Source code in src/flowcean/polars/transforms/cluster.py
def __init__(
    self,
    clusterer: Clusterer,
    *,
    cluster_feature_name: str = "cluster_label",
    features: Iterable[str] | None = None,
) -> None:
    """Initializes the Cluster transform.

    Args:
        clusterer: The clustering algorithm to use.
        cluster_feature_name: The name of the feature to store the cluster
            labels.
        features: The features to use for clustering. If None, all features
            are used.
    """
    self.clusterer = clusterer
    self.cluster_feature_name = cluster_feature_name
    self.features = features
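
The constructor above only stores its arguments, so the following standard-library sketch illustrates how the three parameters could interact when the transform is applied. It is not flowcean's implementation: `add_cluster_labels`, `ThresholdClusterer`, and the `fit_predict` method are hypothetical names chosen for illustration, and plain dictionaries stand in for DataFrame rows. The real `Clusterer` interface is not shown in this section and may differ.

```python
from collections.abc import Iterable, Sequence
from typing import Optional


class ThresholdClusterer:
    """Toy stand-in for a clusterer (hypothetical, not part of flowcean).

    Assigns label 1 when the first coordinate exceeds the threshold,
    otherwise label 0.
    """

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def fit_predict(self, points: Sequence[Sequence[float]]) -> list:
        return [1 if point[0] > self.threshold else 0 for point in points]


def add_cluster_labels(
    rows: Sequence[dict],
    clusterer: ThresholdClusterer,
    *,
    cluster_feature_name: str = "cluster_label",
    features: Optional[Iterable[str]] = None,
) -> list:
    """Return copies of `rows` with one extra key holding the cluster label."""
    if not rows:
        return []
    # features=None means "use every feature", mirroring the documented default.
    selected = list(features) if features is not None else list(rows[0].keys())
    # Build one point per row from the selected features only.
    points = [[row[name] for name in selected] for row in rows]
    labels = clusterer.fit_predict(points)
    # Keep the original features and add the label under the chosen name.
    return [
        {**row, cluster_feature_name: label}
        for row, label in zip(rows, labels)
    ]


rows = [{"x": 0.1, "y": 5.0}, {"x": 0.9, "y": 4.0}, {"x": 0.2, "y": 6.0}]
labeled = add_cluster_labels(rows, ThresholdClusterer(0.5), features=["x"])
```

In this sketch the input rows are left unmodified and every original feature is kept; only the selected features are passed to the clusterer, and the label is stored under `cluster_feature_name`.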