application

ModelHandler(model)

Wrap a Flowcean model and expose its underlying ML model.

Attributes:

model: flowcean.core.model.Model The loaded Flowcean model.

Methods:

get_ml_model() Returns the underlying machine learning model from the Flowcean model.

get_model_prediction() Returns predictions from the Flowcean model as a LazyFrame.

get_model_prediction_as_lst() Returns predictions from the Flowcean model as a Python list.

Initializes the ModelHandler.

Parameters:

model (Model, required): Flowcean model instance.

Source code in src/flowcean/testing/generator/ddtig/application/model_handler.py
def __init__(
    self,
    model: Model,
) -> None:
    """Initializes the ModelHandler.

    Args:
        model: Flowcean model instance.
    """
    # Store the Flowcean model instance passed in by the caller
    self.model = model

get_ml_model()

Extract the underlying machine learning model.

Returns:

SupportsPredict | Module: The machine learning model.

Source code in src/flowcean/testing/generator/ddtig/application/model_handler.py
def get_ml_model(self) -> SupportsPredict | Module:
    """Extract the underlying machine learning model.

    Returns:
        The machine learning model.
    """
    if type(self.model) is SciKitModel:
        ml_model = self.model.estimator
    elif type(self.model) is PyTorchModel:
        ml_model = self.model.module
    else:
        msg = f"Unsupported model type: {type(self.model)}"
        raise ValueError(msg)
    logger.info("Extracted underlying ML model successfully.")
    return ml_model
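
The listing above dispatches on the exact type (`type(x) is C`), so a subclass of a supported wrapper falls through to the `ValueError` branch. The following is a minimal sketch of that behaviour, not Flowcean code: `Model`, `SciKitModel`, `PyTorchModel` and `TunedSciKitModel` are hypothetical stand-in classes, and `extract_isinstance` is an illustrative alternative rather than something the library provides.

```python
class Model:
    """Stand-in base class for a wrapped model (hypothetical)."""


class SciKitModel(Model):
    def __init__(self, estimator):
        self.estimator = estimator


class PyTorchModel(Model):
    def __init__(self, module):
        self.module = module


class TunedSciKitModel(SciKitModel):
    """Hypothetical subclass, used to show how exact type checks treat it."""


def extract_exact(model):
    """Mirror the exact-type dispatch: subclasses are rejected."""
    if type(model) is SciKitModel:
        return model.estimator
    if type(model) is PyTorchModel:
        return model.module
    msg = f"Unsupported model type: {type(model)}"
    raise ValueError(msg)


def extract_isinstance(model):
    """Alternative dispatch that also accepts subclasses."""
    if isinstance(model, SciKitModel):
        return model.estimator
    if isinstance(model, PyTorchModel):
        return model.module
    msg = f"Unsupported model type: {type(model)}"
    raise ValueError(msg)


plain = SciKitModel(estimator="tree")
tuned = TunedSciKitModel(estimator="tuned-tree")

print(extract_exact(plain))
try:
    extract_exact(tuned)
except ValueError as err:
    print(f"rejected: {err}")
print(extract_isinstance(tuned))
```

Whether rejecting subclasses is intended is not stated in the source; the sketch only makes the difference between the two checks visible.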

get_model_prediction(input_features)

Generates predictions using the Flowcean model.

Parameters:

input_features (DataFrame, required): A Polars DataFrame containing input features.

Returns:

LazyFrame: A LazyFrame with predicted outputs.

Source code in src/flowcean/testing/generator/ddtig/application/model_handler.py
def get_model_prediction(
    self,
    input_features: pl.DataFrame,
) -> pl.LazyFrame:
    """Generates predictions using the Flowcean model.

    Args:
        input_features: A Polars DataFrame containing input features.

    Returns:
        A LazyFrame with predicted outputs.
    """
    return self.model.predict(input_features.lazy())

get_model_prediction_as_lst(input_features)

Generate predictions and return them as a Python list.

Parameters:

input_features (DataFrame, required): A Polars DataFrame containing input features.

Returns:

list: A list of predicted output values.

Source code in src/flowcean/testing/generator/ddtig/application/model_handler.py
def get_model_prediction_as_lst(
    self,
    input_features: pl.DataFrame,
) -> list:
    """Generate predictions and return them as a Python list.

    Args:
        input_features: A Polars DataFrame containing input features.

    Returns:
        A list of predicted output values.
    """
    pred_df = self.model.predict(input_features.lazy()).collect()
    target_name = pred_df.columns[-1]
    return pred_df[target_name].to_list()
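
Note that the listing above returns only the last column of the collected prediction frame. The sketch below illustrates that selection rule with a plain dictionary as an ordered-column stand-in for a Polars DataFrame; `predictions_as_list` and the column names are hypothetical and exist only for this example.

```python
def predictions_as_list(columns: dict[str, list]) -> list:
    """Return the values of the last column.

    Mirrors `pred_df[pred_df.columns[-1]].to_list()`, with a dict
    (insertion-ordered) standing in for the prediction frame.
    """
    target_name = list(columns)[-1]
    return list(columns[target_name])


# One output column: the list holds every prediction.
single_output = {"y": [0.1, 0.2, 0.3]}
print(predictions_as_list(single_output))

# Two output columns: only the last one ("y2") is returned.
multi_output = {"y1": [1, 2], "y2": [3, 4]}
print(predictions_as_list(multi_output))
```

For a model with several output columns, this rule would drop all but the last one, so `get_model_prediction()` may be the safer choice when every output is needed.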

TestPipeline(model, n_testinputs, test_coverage_criterium, dataset=None, specs_file=None, *, classification=False, inverse_alloc=False, epsilon=0.5, seed=42, performance_threshold=0.3, sample_limit=50000, n_predictions=50, max_depth=5, hoeffding_tree_extra_params=None)

Workflow for test input generation in Flowcean.

Attributes:

model_handler: ModelHandler Handles the Flowcean model and its predictions.

model: Decision Tree | Black-box Model Underlying machine learning model extracted from the Flowcean model.

dataset: pl.DataFrame The original training dataset.

specs_handler: SystemSpecsHandler Extracts system specifications and feature information.

dict Test requirements provided by the user.

classification: bool Indicates whether the task is classification.

list List of all equivalence classes.

list List of all test plans (intervals used to sample test inputs).

list List of all generated test inputs.

list Number of test inputs to generate per equivalence class.

pl.DataFrame Executable test inputs formatted for Flowcean.

feature_names: list Names of all input features.

hoeffding_tree: HoeffdingTreeRegressor Hoeffding tree used to approximate complex black-box models.

Methods:

execute() Executes the full test input generation workflow.

save_hoeffding_tree() Saves the generated Hoeffding tree to a file.

save_test_overview() Saves intermediate results and generated outputs.

Initializes the TestPipeline.

Parameters:

model (Model, required): The trained Flowcean model.

n_testinputs (int, required): Total number of test inputs to generate.

test_coverage_criterium (str, required): Coverage strategy, either bva or dtc. Any other value raises a ValueError.

dataset (DataFrame | None, default None): Original training dataset. Required if specs_file is not provided, and also whenever the underlying model is not a DecisionTreeRegressor or DecisionTreeClassifier.

specs_file (Path | None, default None): File containing system specifications. Required if dataset is not provided.

classification (bool, default False): Whether the task is classification.

inverse_alloc (bool, default False): Whether to use inverse test allocation.

epsilon (float, default 0.5): Boundary offset used for bva.

seed (int, default 42): Random seed for reproducibility.

performance_threshold (float, default 0.3): Minimum surrogate performance.

sample_limit (int, default 50000): Maximum number of surrogate samples.

n_predictions (int, default 50): Consecutive correct predictions needed.

max_depth (int, default 5): Maximum Hoeffding tree depth.

hoeffding_tree_extra_params (dict[str, Any] | None, default None): Extra surrogate hyperparameters.

Source code in src/flowcean/testing/generator/ddtig/application/test_pipeline.py
def __init__(
    self,
    model: Model,
    n_testinputs: int,
    test_coverage_criterium: str,
    dataset: pl.DataFrame | None = None,
    specs_file: Path | None = None,
    *,
    classification: bool = False,
    inverse_alloc: bool = False,
    epsilon: float = 0.5,
    seed: int = 42,
    performance_threshold: float = 0.3,
    sample_limit: int = 50000,
    n_predictions: int = 50,
    max_depth: int = 5,
    hoeffding_tree_extra_params: dict[str, Any] | None = None,
) -> None:
    """Initializes the TestPipeline.

    Args:
        model: The trained Flowcean model.
        n_testinputs: Total number of test inputs to generate.
        test_coverage_criterium: Coverage strategy, either bva or dtc.
        dataset: Original training dataset.
            Required if specs_file is not provided, or if the underlying
            model is not a DecisionTreeRegressor or DecisionTreeClassifier.
        specs_file: File containing system specifications.
            Required if dataset is not provided.
        classification: Whether the task is classification.
        inverse_alloc: Whether to use inverse test allocation.
        epsilon: Boundary offset used for bva.
        seed: Random seed for reproducibility.
        performance_threshold: Minimum surrogate performance.
        sample_limit: Maximum number of surrogate samples.
        n_predictions: Consecutive correct predictions needed.
        max_depth: Maximum Hoeffding tree depth.
        hoeffding_tree_extra_params: Extra surrogate hyperparameters.
    """
    self.model_handler = ModelHandler(model)
    self.model = self.model_handler.get_ml_model()
    if test_coverage_criterium not in ["bva", "dtc"]:
        msg = "Invalid test coverage criterium. Expected 'bva' or 'dtc'."
        raise ValueError(msg)

    if (
        type(self.model) is not DecisionTreeRegressor
        and type(self.model) is not DecisionTreeClassifier
        and dataset is None
    ):
        msg = "Missing required parameter: 'dataset'"
        raise ValueError(msg)
    if dataset is None and specs_file is None:
        msg = "Missing required parameter: 'dataset' or 'specs_file'"
        raise ValueError(msg)
    self.n_testinputs = n_testinputs
    self.test_coverage_criterium = test_coverage_criterium
    self.dataset = dataset
    self.specs_handler = SystemSpecsHandler(
        data=dataset,
        specs_file=specs_file,
    )
    self.feature_names = self.specs_handler.extract_feature_names()
    self.hoeffding_tree = None
    self.classification = classification
    self.inverse_alloc = inverse_alloc
    self.seed = seed
    self.epsilon = epsilon
    self.performance_threshold = performance_threshold
    self.sample_limit = sample_limit
    self.n_predictions = n_predictions
    self.max_depth = max_depth
    self.hoeffding_tree_extra_params = (
        hoeffding_tree_extra_params
        if hoeffding_tree_extra_params is not None
        else {}
    )
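
The three argument checks at the top of the constructor can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the library's code: `check_arguments` is a hypothetical helper, the `is_decision_tree` flag stands in for the two `type(self.model) is not ...` comparisons, and `specs.json` is a made-up file name.

```python
from __future__ import annotations

from pathlib import Path

VALID_CRITERIA = ("bva", "dtc")


def check_arguments(
    criterion: str,
    *,
    is_decision_tree: bool,
    dataset: object | None,
    specs_file: Path | None,
) -> None:
    """Apply the same checks, in the same order, as the constructor."""
    if criterion not in VALID_CRITERIA:
        msg = "Invalid test coverage criterium. Expected 'bva' or 'dtc'."
        raise ValueError(msg)
    # Models other than decision trees always need the dataset.
    if not is_decision_tree and dataset is None:
        msg = "Missing required parameter: 'dataset'"
        raise ValueError(msg)
    # Decision trees need at least one source of feature information.
    if dataset is None and specs_file is None:
        msg = "Missing required parameter: 'dataset' or 'specs_file'"
        raise ValueError(msg)


def error_message(criterion: str, **kwargs) -> str | None:
    """Return the validation error text, or None if the arguments pass."""
    try:
        check_arguments(criterion, **kwargs)
    except ValueError as err:
        return str(err)
    return None


specs = Path("specs.json")  # hypothetical file name
print(error_message("bva", is_decision_tree=True, dataset=None, specs_file=specs))
print(error_message("bva", is_decision_tree=False, dataset=None, specs_file=specs))
print(error_message("dtc", is_decision_tree=True, dataset=None, specs_file=None))
print(error_message("mcdc", is_decision_tree=True, dataset=None, specs_file=specs))
```

The second call shows the case that the parameter descriptions do not spell out: a specs file alone is not enough when the underlying model is not a decision tree.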

execute()

Run test input generation with the initialized parameters.

Returns:

DataFrame: Executable test inputs formatted for Flowcean.

Source code in src/flowcean/testing/generator/ddtig/application/test_pipeline.py
def execute(self) -> pl.DataFrame:
    """Run test input generation with the initialized parameters.

    Returns:
        Executable test inputs formatted for Flowcean.
    """
    return self._execute(
        test_coverage_criterium=self.test_coverage_criterium,
        n_testinputs=self.n_testinputs,
        inverse_alloc=self.inverse_alloc,
        epsilon=self.epsilon,
        performance_threshold=self.performance_threshold,
        sample_limit=self.sample_limit,
        n_predictions=self.n_predictions,
        max_depth=self.max_depth,
        **self.hoeffding_tree_extra_params,
    )
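
Because `hoeffding_tree_extra_params` is unpacked with `**` next to explicit keyword arguments, a key that repeats one of those keywords raises a `TypeError` at call time. The sketch below demonstrates this Python behaviour with stand-in functions. The signature of the real `_execute` is not shown on this page, so the `**tree_params` catch-all is an assumption, and `grace_period` is only an illustrative key.

```python
from __future__ import annotations

from typing import Any


def _execute(*, max_depth: int, sample_limit: int, **tree_params: Any) -> dict[str, Any]:
    """Stand-in for the private worker: report what it received."""
    return {
        "max_depth": max_depth,
        "sample_limit": sample_limit,
        "tree_params": tree_params,
    }


def execute(max_depth: int, sample_limit: int, extra: dict[str, Any]) -> dict[str, Any]:
    """Forward stored settings plus the unpacked extras."""
    return _execute(max_depth=max_depth, sample_limit=sample_limit, **extra)


# Extra keys that do not clash are passed through untouched.
forwarded = execute(5, 50000, {"grace_period": 100})
print(forwarded["tree_params"])

# A key that repeats an explicit keyword fails before _execute runs.
try:
    execute(5, 50000, {"max_depth": 3})
    collided = False
except TypeError as err:
    collided = True
    print(f"collision: {err}")
```

In practice this suggests keeping keys such as `max_depth` out of the extra-parameter dictionary and setting them through the dedicated constructor arguments instead.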