
domain

TestTree(model_tree, specs_handler) dataclass

Represents a tree structure used for generating test inputs.

Attributes:

test_tree (dict): Dictionary representing the structure of a River or scikit-learn tree.

Methods:

get_n_samples() Returns the total number of samples used to train the tree.

Initializes the TestTree from a model tree.

Parameters:

- model_tree (HoeffdingTreeRegressor | HoeffdingTreeClassifier | Tree): A River or scikit-learn decision tree. Required.
- specs_handler (SystemSpecsHandler): Object for accessing feature specifications. Required.
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/base/tree.py
def __init__(
    self,
    model_tree: HoeffdingTreeRegressor
    | HoeffdingTreeClassifier
    | sklearnTree,
    specs_handler: SystemSpecsHandler,
) -> None:
    """Initializes the TestTree from a model tree.

    Args:
        model_tree: A River or scikit-learn decision tree.
        specs_handler: Object for accessing feature specifications.
    """
    if isinstance(model_tree, sklearnTree):
        feature_dict = (
            specs_handler.extract_feature_names_with_idx_reversed()
        )
        self.test_tree = convert_sklearn_tree(model_tree, feature_dict)
        logger.info(
            "Converted a scikit-learn tree to TestTree successfully.",
        )
    else:
        feature_dict = specs_handler.extract_feature_names_with_idx()
        self.test_tree = convert_river_tree(model_tree, feature_dict)
        logger.info("Converted a River tree to TestTree successfully.")

get_n_samples()

Returns the total number of samples used to train the tree.

Returns:

int: Total number of samples.

Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/base/tree.py
def get_n_samples(self) -> int:
    """Returns the total number of samples used to train the tree.

    Returns:
        Total number of samples.
    """
    return sum(node.samples for node in self.test_tree.values())
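The aggregation in get_n_samples can be sketched with plain stand-in nodes (the `Node` class here is a hypothetical placeholder for TestTree's node objects):

```python
from dataclasses import dataclass


@dataclass
class Node:
    samples: int  # stand-in for a TestTree node's sample count


# A toy tree: node index -> node object.
tree = {0: Node(10), 1: Node(4), 2: Node(6)}

# Same aggregation as get_n_samples: sum sample counts over all nodes.
total = sum(node.samples for node in tree.values())
```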

DataModel(data, seed, model_handler, specs_handler)

Generate synthetic samples from the training data distribution.

Attributes:

data (pl.DataFrame): Original training data used in the Flowcean model.

col_names (list): Names of the columns in the training data.

n_features (int): Number of features in the dataset.

model_handler (ModelHandler): ModelHandler object used to produce predictions.

int_features (list): List of indices for features of type int.

Methods:

generate_dataset() Generates random samples based on the data distribution, or uses the original data.

Initializes the DataModel.

Parameters:

- data (DataFrame): Original training data used in the Flowcean model. Required.
- seed (int): Random seed for reproducibility. Required.
- model_handler (ModelHandler): ModelHandler object used to produce predictions. Required.
- specs_handler (SystemSpecsHandler): SystemSpecsHandler object storing system specifications. Required.
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/mut/data_model.py
def __init__(
    self,
    data: pl.DataFrame,
    seed: int,
    model_handler: ModelHandler,
    specs_handler: SystemSpecsHandler,
) -> None:
    """Initializes the DataModel.

    Args:
        data: Original training data used in the Flowcean model.
        seed: Random seed for reproducibility.
        model_handler: ModelHandler object used to produce predictions.
        specs_handler: SystemSpecsHandler object storing
            system specifications.
    """
    self.data = data
    self.seed = seed
    self.col_names = data.columns
    self.model_handler = model_handler

    self.n_features = specs_handler.get_n_features()
    self.int_features = specs_handler.get_int_features()

generate_dataset(*, original_data=False, n_samples=0)

Generates a dataset of inputs and corresponding model predictions.

If original_data is True, uses the original training data. Otherwise, generates synthetic samples using KDE.

Parameters:

- original_data (bool): Whether to use original training data or generate synthetic samples. Default: False.
- n_samples (int): Number of synthetic samples to generate. Default: 0.

Returns:

list: List of tuples containing input dictionaries and model outputs. Example (n_samples = 1): [({'Length': 0.5093, 'Diameter': 0.3886, 'Height': 0.1106, 'M': 0}, 8.6006)]

Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/mut/data_model.py
def generate_dataset(
    self,
    *,
    original_data: bool = False,
    n_samples: int = 0,
) -> list:
    """Generates a dataset of inputs and corresponding model predictions.

    If original_data is True, uses the original training data.
    Otherwise, generates synthetic samples using KDE.

    Args:
        original_data: Whether to use original training
            data or generate synthetic samples.
        n_samples: Number of synthetic samples to generate.

    Returns:
        List of tuples containing input dictionaries and model outputs.
        Example (n_samples = 1):
        [({'Length': 0.5093, 'Diameter': 0.3886,
        'Height': 0.1106, 'M': 0}, 8.6006)]
    """
    training_inputs = (
        self.data
        if original_data
        else self._generate_samples(n_samples, self.int_features)
    )
    training_outputs = self.model_handler.get_model_prediction(
        training_inputs,
    ).collect()
    samples_input_lst = training_inputs.to_dicts()
    samples_output_lst = pl.Series(
        training_outputs.select(training_outputs.columns[0]),
    ).to_list()
    return [
        (inputs, output)
        for inputs, output in zip(
            samples_input_lst,
            samples_output_lst,
            strict=False,
        )
    ]

HoeffdingTree(inputs, seed, model_handler, specs_handler)

Train a Hoeffding Tree on synthetic samples.

Samples are generated from another model.

Attributes:

datamodel (DataModel): Object used to generate synthetic training inputs based on the original dataset.

samples (list): Original training inputs transformed to River-compatible format with predictions.

nominal_attributes (list): List of indices for nominal features.

Methods:

train_tree() Trains a Hoeffding Tree and returns the trained model.

Initializes the HoeffdingTree trainer.

Parameters:

- inputs (DataFrame): Original training dataset including target column. Required.
- seed (int): Random seed for reproducible synthetic sample generation. Required.
- model_handler (ModelHandler): Object used to generate predictions from the Flowcean model. Required.
- specs_handler (SystemSpecsHandler): Object containing feature specifications and metadata. Required.
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/mut/hoeffding_tree.py
def __init__(
    self,
    inputs: pl.DataFrame,
    seed: int,
    model_handler: ModelHandler,
    specs_handler: SystemSpecsHandler,
) -> None:
    """Initializes the HoeffdingTree trainer.

    Args:
        inputs: Original training dataset including target column.
        seed: Random seed for reproducible synthetic sample generation.
        model_handler: Object used to generate predictions from
            the Flowcean model.
        specs_handler: Object containing feature specifications
            and metadata.
    """
    # Remove target column to isolate input features
    inputs = inputs.drop(inputs.columns[-1])
    self.datamodel = DataModel(inputs, seed, model_handler, specs_handler)

    # Generate River-compatible samples using original data
    self.samples = self.datamodel.generate_dataset(original_data=True)
    self.nominal_attributes = specs_handler.get_nominal_features()

train_tree(performance_threshold, sample_limit, n_predictions, *, classification, **kwargs)

Train a Hoeffding Tree using synthetic samples.

Continue until performance criteria are met.

Parameters:

- performance_threshold (float): Minimum performance required to finalize the model. Required.
- sample_limit (int): Maximum number of samples to use during training. Required.
- n_predictions (int): Number of consecutive correct predictions required to stop training. Required.
- classification (bool): Indicates whether the task is classification or regression. Required.
- **kwargs (Any): Additional hyperparameters for the Hoeffding Tree model. Default: {}.

Returns:

HoeffdingTreeRegressor | HoeffdingTreeClassifier: Trained Hoeffding Tree model.

Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/mut/hoeffding_tree.py
def train_tree(
    self,
    performance_threshold: float,
    sample_limit: int,
    n_predictions: int,
    *,
    classification: bool,
    **kwargs: Any,
) -> HoeffdingTreeRegressor | HoeffdingTreeClassifier:
    """Train a Hoeffding Tree using synthetic samples.

    Continue until performance criteria are met.

    Args:
        performance_threshold: Minimum performance required to
            finalize the model.
        sample_limit: Maximum number of samples to use during training.
        n_predictions: Number of consecutive correct predictions
            required to stop training.
        classification: Indicates whether the task is
            classification or regression.
        **kwargs: Additional hyperparameters for the Hoeffding Tree model.

    Returns:
        Trained Hoeffding Tree model.
    """
    metric, model = self._create_model_and_metric(
        classification=classification,
        **kwargs,
    )

    # Pre-train
    for x, y in self.samples:
        y_true = self._normalize_target(y, classification=classification)
        model.learn_one(x, y_true)

    self._run_training_loop(
        model=model,
        metric=metric,
        performance_threshold=performance_threshold,
        n_predictions=n_predictions,
        sample_limit=sample_limit,
        classification=classification,
    )

    logger.info("Hoeffding Tree training completed successfully.")
    return model
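The stopping rule (n_predictions consecutive correct predictions once performance passes performance_threshold) can be illustrated without River; this helper is a hypothetical stand-in for the check inside the training loop, not the library's implementation:

```python
def should_stop(
    recent_correct: list[bool],
    n_predictions: int,
    performance: float,
    performance_threshold: float,
) -> bool:
    """Hypothetical stand-in for the training loop's stopping criterion."""
    # Count the trailing streak of correct predictions.
    streak = 0
    for ok in reversed(recent_correct):
        if not ok:
            break
        streak += 1
    return performance >= performance_threshold and streak >= n_predictions


# Three consecutive correct predictions and sufficient performance -> stop.
stop = should_stop([False, True, True, True], 3, 0.95, 0.9)
```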

EquivalenceClassesHandler(test_tree, minmax_values_specs, n_features)

A class used to extract equivalence classes from a decision tree.

Attributes:

test_tree (TestTree): The decision tree structure.

minmax_values_specs (dict): Dictionary storing min/max values for each feature from specifications.

n_samples (int): Number of samples used to train the tree.

n_features (int): Number of features in the dataset.

Methods:

get_equivalence_classes() Extracts and formats equivalence classes from the decision tree.

to_str(eqclass) Converts a single equivalence class to a string.

to_strs(eqclasses, feature_names) Converts a list of equivalence classes to a readable string format.

Initializes the EquivalenceClassesHandler.

Parameters:

- test_tree (Any): The decision tree used for extracting equivalence classes. Required.
- minmax_values_specs (dict): Dictionary containing min/max values for each feature. Required.
- n_features (int): Number of features in the dataset. Required.
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/eqclass_handler.py
def __init__(
    self,
    test_tree: Any,
    minmax_values_specs: dict,
    n_features: int,
) -> None:
    """Initializes the EquivalenceClassesHandler.

    Args:
        test_tree: The decision tree used for extracting
            equivalence classes.
        minmax_values_specs: Dictionary containing min/max
            values for each feature.
        n_features: Number of features in the dataset.
    """
    self.test_tree = test_tree.test_tree
    self.n_samples = test_tree.get_n_samples()
    self.minmax_values_specs = minmax_values_specs
    self.n_features = n_features
    self.eqclass_prio = []

get_equivalence_classes()

Extracts and formats equivalence classes from the decision tree.

Returns:

list: List of formatted equivalence classes.

Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/eqclass_handler.py
def get_equivalence_classes(self) -> list:
    """Extracts and formats equivalence classes from the decision tree.

    Returns:
        List of formatted equivalence classes.
    """
    self.eqclass_prio = []
    paths = self._collect_all_paths(self.ROOT_INDEX)
    equivalence_classes = self._extract_equivalence_classes(paths)
    equivalence_classes_formatted = self._format_equivalence_classes(
        equivalence_classes,
    )
    logger.info("Extracted equivalence classes successfully.")
    return equivalence_classes_formatted

to_str(eqclass) staticmethod

Converts a single equivalence class to a string.

Parameters:

- eqclass (tuple): A tuple of Interval objects. Required.

Returns:

str: String representation of the equivalence class.

Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/eqclass_handler.py
@staticmethod
def to_str(eqclass: tuple) -> str:
    """Converts a single equivalence class to a string.

    Args:
        eqclass: A tuple of Interval objects.

    Returns:
        String representation of the equivalence class.
    """
    return "(" + ", ".join(str(interval) for interval in eqclass) + ")"
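The string form produced by to_str can be reproduced with plain strings standing in for Interval objects:

```python
# Stand-ins for str(Interval) results (illustrative values).
intervals = ["[0.0,1.5)", "(2.0,3.0]"]

# Same shape as to_str: comma-separated intervals wrapped in parentheses.
eqclass_str = "(" + ", ".join(intervals) + ")"
```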

to_strs(eqclasses, feature_names) staticmethod

Converts a list of equivalence classes to a readable string format.

Parameters:

- eqclasses (list): List of equivalence classes. Required.
- feature_names (list): List of feature names. Required.

Returns:

str: Formatted string of all equivalence classes.

Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/eqclass_handler.py
@staticmethod
def to_strs(eqclasses: list, feature_names: list) -> str:
    """Converts a list of equivalence classes to a readable string format.

    Args:
        eqclasses: List of equivalence classes.
        feature_names: List of feature names.

    Returns:
        Formatted string of all equivalence classes.
    """
    eqclasses_str = ""
    for i, eqclass in enumerate(eqclasses):
        intervals = ", ".join(
            f"{name}: {interval}"
            for name, interval in zip(feature_names, eqclass, strict=False)
        )
        eqclasses_str += f"Equivalence class {i}:\n{{{intervals}}}\n"
    return eqclasses_str

is_subset(eqclass1, eqclass2) staticmethod

Compares interval ranges of features between classes.

Compares the interval ranges of all features between two equivalence classes and determines which one is a subset of the other.

An equivalence class is considered a subset only if all its intervals are strictly contained within the corresponding intervals of the other class. The method returns the superset equivalence class if such a relationship exists.

Parameters:

- eqclass1 (tuple): First equivalence class (tuple of Interval objects). Required.
- eqclass2 (tuple): Second equivalence class (tuple of Interval objects). Required.

Returns:

tuple | None: The equivalence class that is the superset, or None if neither is a subset of the other.

Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/eqclass_handler.py
@staticmethod
def is_subset(eqclass1: tuple, eqclass2: tuple) -> tuple | None:
    """Compares interval ranges of features between classes.

    Compares the interval ranges of all features between two equivalence
    classes and determines which one is a subset of the other.

    An equivalence class is considered a subset only if all its
    intervals are strictly contained within the corresponding
    intervals of the other class. The method returns the
    superset equivalence class if such a relationship exists.

    Args:
        eqclass1: First equivalence class (tuple of Interval objects).
        eqclass2: Second equivalence class (tuple of Interval objects).

    Returns:
        The equivalence class that is the superset,
        or None if neither is a subset of the other.
    """
    from flowcean.testing.generator.ddtig.domain import Interval

    # Compare the first interval to determine initial superset
    interval_a = eqclass1[0]
    interval_b = eqclass2[0]
    interval_res = Interval.is_subset(interval_a, interval_b)

    if interval_res is None:
        return None

    # Identify which equivalence class contains the superset interval
    eqclass_large = eqclass1 if interval_res == interval_a else eqclass2

    # Check consistency across all remaining intervals
    for idx in range(1, len(eqclass1)):
        interval_a = eqclass1[idx]
        interval_b = eqclass2[idx]
        interval_res = Interval.is_subset(interval_a, interval_b)

        if interval_res is None or (
            ((interval_res == interval_a) and (eqclass_large != eqclass1))
            or (
                (interval_res == interval_b)
                and (eqclass_large != eqclass2)
            )
        ):
            return None

    return eqclass_large

Interval(feature, left_endpoint, right_endpoint, min_value, max_value)

Represents an interval for a specific feature.

The interval belongs to one equivalence class.

Attributes:

feature (int): Index of the feature to which the interval belongs.

left_endpoint (IntervalEndpoint): Indicates whether the interval is left-open or left-closed.

right_endpoint (IntervalEndpoint): Indicates whether the interval is right-open or right-closed.

min_value (int | float): Lower bound of the interval.

max_value (int | float): Upper bound of the interval.

Initializes an Interval object.

Parameters:

- feature (int): Index of the feature to which the interval belongs. Required.
- left_endpoint (IntervalEndpoint): Left endpoint type ('(' for open, '[' for closed). Required.
- right_endpoint (IntervalEndpoint): Right endpoint type (')' for open, ']' for closed). Required.
- min_value (float): Lower bound of the interval. Required.
- max_value (float): Upper bound of the interval. Required.
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/interval.py
def __init__(
    self,
    feature: int,
    left_endpoint: IntervalEndpoint,
    right_endpoint: IntervalEndpoint,
    min_value: float,
    max_value: float,
) -> None:
    """Initializes an Interval object.

    Args:
        feature: Index of the feature to which the interval belongs.
        left_endpoint: Left endpoint type ('(' for open, '[' for closed).
        right_endpoint: Right endpoint type
            (')' for open, ']' for closed).
        min_value: Lower bound of the interval.
        max_value: Upper bound of the interval.
    """
    self.feature = feature
    self.left_endpoint = left_endpoint
    self.right_endpoint = right_endpoint
    self.min_value = min_value
    self.max_value = max_value

__str__()

Returns a string representation of the interval.

Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/interval.py
def __str__(self) -> str:
    """Returns a string representation of the interval."""
    return (
        f"{self.left_endpoint.value}{self.min_value},"
        f"{self.max_value}{self.right_endpoint.value}"
    )
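Assuming the IntervalEndpoint enum values are the bracket characters themselves (an assumption consistent with the parameter descriptions above), the string form can be reproduced with a minimal stand-in enum:

```python
from enum import Enum


class EP(Enum):
    """Hypothetical stand-in for IntervalEndpoint."""

    LEFT_CLOSED = "["
    RIGHT_OPEN = ")"


left, right = EP.LEFT_CLOSED, EP.RIGHT_OPEN
# Same concatenation as Interval.__str__: bracket, min, comma, max, bracket.
s = left.value + str(0.5) + "," + str(2.0) + right.value
```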

is_closed()

Checks if the interval is fully closed [a, b].

Returns:

bool: True if both endpoints are closed.

Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/interval.py
def is_closed(self) -> bool:
    """Checks if the interval is fully closed [a, b].

    Returns:
        True if both endpoints are closed.
    """
    return (
        self.left_endpoint == IntervalEndpoint.LEFT_CLOSED
        and self.right_endpoint == IntervalEndpoint.RIGHT_CLOSED
    )

is_open()

Checks if the interval is fully open (a, b).

Returns:

bool: True if both endpoints are open.

Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/interval.py
def is_open(self) -> bool:
    """Checks if the interval is fully open (a, b).

    Returns:
        True if both endpoints are open.
    """
    return (
        self.left_endpoint == IntervalEndpoint.LEFT_OPEN
        and self.right_endpoint == IntervalEndpoint.RIGHT_OPEN
    )

is_right_open()

Checks if the interval is right-open [a, b).

Returns:

bool: True if left is closed and right is open.

Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/interval.py
def is_right_open(self) -> bool:
    """Checks if the interval is right-open [a, b).

    Returns:
        True if left is closed and right is open.
    """
    return (
        self.left_endpoint == IntervalEndpoint.LEFT_CLOSED
        and self.right_endpoint == IntervalEndpoint.RIGHT_OPEN
    )

is_left_open()

Checks if the interval is left-open (a, b].

Returns:

bool: True if left is open and right is closed.

Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/interval.py
def is_left_open(self) -> bool:
    """Checks if the interval is left-open (a, b].

    Returns:
        True if left is open and right is closed.
    """
    return (
        self.left_endpoint == IntervalEndpoint.LEFT_OPEN
        and self.right_endpoint == IntervalEndpoint.RIGHT_CLOSED
    )

is_subset(interval_a, interval_b) staticmethod

Determines which interval is a subset of the other.

Parameters:

- interval_a (Interval): First interval to compare. Required.
- interval_b (Interval): Second interval to compare. Required.

Returns:

Interval | None: The superset interval if one contains the other, otherwise None.

Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/interval.py
@staticmethod
def is_subset(
    interval_a: Interval,
    interval_b: Interval,
) -> Interval | None:
    """Determines which interval is a subset of the other.

    Args:
        interval_a: First interval to compare.
        interval_b: Second interval to compare.

    Returns:
        The superset interval if one contains the other,
        otherwise None.
    """
    ordered = Interval._order_by_bounds(interval_a, interval_b)
    if ordered is None:
        return None

    interval_large, interval_small = ordered

    # Case 1: Strict containment
    if (
        interval_large.min_value < interval_small.min_value
        and interval_large.max_value > interval_small.max_value
    ):
        return interval_large

    # Case 2: Same lower bound, larger upper bound
    if (
        interval_large.min_value == interval_small.min_value
        and interval_large.max_value > interval_small.max_value
    ):
        left_ok = (
            interval_small.left_endpoint == IntervalEndpoint.LEFT_OPEN
            or interval_large.left_endpoint == IntervalEndpoint.LEFT_CLOSED
        )
        return interval_large if left_ok else None

    # Case 3: Smaller lower bound, same upper bound
    if (
        interval_large.min_value < interval_small.min_value
        and interval_large.max_value == interval_small.max_value
    ):
        right_ok = (
            interval_small.right_endpoint == IntervalEndpoint.RIGHT_OPEN
            or interval_large.right_endpoint
            == IntervalEndpoint.RIGHT_CLOSED
        )
        return interval_large if right_ok else None

    # Case 4: Same bounds
    if (
        interval_a.min_value == interval_b.min_value
        and interval_a.max_value == interval_b.max_value
    ):
        return Interval._same_bounds_superset(interval_a, interval_b)
    return None

IntervalEndpoint

Bases: Enum

Enum representing the types of interval endpoints.

TestCompiler(n_features, testinputs)

Transforms abstract test inputs into executable test inputs.

Compatible with Flowcean models.

Attributes:

n_features (int): Number of features in the dataset.

abst_testinputs (list): List of abstract test inputs.

Methods:

compute_executable_testinputs() Converts abstract test inputs into a polars DataFrame for execution.

Initializes the TestCompiler.

Parameters:

- n_features (int): Number of features in the dataset. Required.
- testinputs (list): List of abstract test inputs. Required.
Source code in src/flowcean/testing/generator/ddtig/domain/test_generator/testcomp.py
def __init__(
    self,
    n_features: int,
    testinputs: list,
) -> None:
    """Initializes the TestCompiler.

    Args:
        n_features: Number of features in the dataset.
        testinputs: List of abstract test inputs.
    """
    self.n_features = n_features
    self.abst_testinputs = testinputs

compute_executable_testinputs(feature_names)

Convert abstract test inputs into a Polars DataFrame.

The resulting DataFrame can then be executed on Flowcean models.

Parameters:

- feature_names (list): List of feature names in order of their indices. Required.

Returns:

DataFrame: DataFrame where each column represents a feature and each row represents a test input.

Source code in src/flowcean/testing/generator/ddtig/domain/test_generator/testcomp.py
def compute_executable_testinputs(
    self,
    feature_names: list,
) -> pl.DataFrame:
    """Convert abstract test inputs into a Polars DataFrame.

    Thus, the result can be executed on Flowcean models.

    Args:
        feature_names: List of feature names in order of their indices.

    Returns:
        DataFrame where each column represents a feature
        and each row represents a test input.
    """
    input_dict = self._init_input_dict()

    # Populate input dictionary with values from abstract test inputs
    for ati in self.abst_testinputs:
        for feature, value in enumerate(ati):
            input_dict[str(feature)].append(value)
    input_dict = dict(
        zip(feature_names, list(input_dict.values()), strict=False),
    )

    # Convert to polars DataFrame (Flowcean-compatible format)
    return pl.from_dict(input_dict, strict=False)

TestGenerator(equivalence_classes, seed, type_specs)

A class that generates abstract test inputs for binary decision trees.

Attributes:

equivalence_classes (list): Equivalence classes extracted from the decision tree.

type_specs (dict): Input types for each feature as defined in the specifications.

testplans (list): List of test plans used to sample test inputs.

n_testinputs_lst (list): Number of test inputs to generate for each equivalence class.

Methods:

generate_testinputs() Generates abstract test inputs based on the selected coverage strategy.

Initializes the Test Generator.

Parameters:

- equivalence_classes (list): Equivalence classes extracted from the decision tree. Required.
- seed (int): The random seed to use for reproducible test input generation. Required.
- type_specs (dict): Input types for each feature from the specifications. Required.
Source code in src/flowcean/testing/generator/ddtig/domain/test_generator/testgen.py
def __init__(
    self,
    equivalence_classes: list,
    seed: int,
    type_specs: dict,
) -> None:
    """Initializes the Test Generator.

    Args:
        equivalence_classes: Equivalence classes extracted from
            the decision tree.
        seed: The random seed to use for reproducible
            test input generation.
        type_specs: Input types for each feature from the specifications.
    """
    self.equivalence_classes = equivalence_classes
    self.type_specs = type_specs
    self.testplans = []
    random.seed(seed)

generate_testinputs(test_coverage_criterium, eqclass_prio, n_testinputs, *, inverse_alloc, epsilon)

Generates abstract test inputs for all equivalence classes.

Parameters:

- test_coverage_criterium (str): Coverage strategy ("bva" or "dtc"). Required.
- eqclass_prio (list): Importance scores for each equivalence class. Required.
- n_testinputs (int): Total number of test inputs to generate. Required.
- inverse_alloc (bool): If True, allocate more inputs to less important classes. Required.
- epsilon (float): Offset for BVA sampling. Required.

Returns:

list: List of abstract test inputs. Each test input is a tuple of feature values. E.g.: [(1,2,3), (11,22,33), (87,29,38)]

Source code in src/flowcean/testing/generator/ddtig/domain/test_generator/testgen.py
def generate_testinputs(
    self,
    test_coverage_criterium: str,
    eqclass_prio: list,
    n_testinputs: int,
    *,
    inverse_alloc: bool,
    epsilon: float,
) -> list:
    """Generates abstract test inputs for all equivalence classes.

    Args:
        test_coverage_criterium: Coverage strategy ("bva" or "dtc").
        eqclass_prio: Importance scores for each equivalence class.
        n_testinputs: Total number of test inputs to generate.
        inverse_alloc: If True, allocate more inputs to less
            important classes.
        epsilon: Offset for BVA sampling.

    Returns:
        List of abstract test inputs.
        Each test input is a tuple of feature values.
        E.g.: [(1,2,3), (11,22,33), (87,29,38)]
    """
    testinputs = []
    self.n_testinputs_lst = self._generate_n_testinputs_list(
        n_testinputs,
        eqclass_prio,
        inverse_alloc=inverse_alloc,
    )
    for eqclass, n_eq_testinputs in zip(
        self.equivalence_classes,
        self.n_testinputs_lst,
        strict=False,
    ):
        testinputs_eqclass = self._generate_testinputs_eqclass(
            n_eq_testinputs,
            test_coverage_criterium,
            eqclass,
            epsilon,
        )
        testinputs += testinputs_eqclass
    logger.info(
        "Generated test inputs for all equivalence classes successfully.",
    )
    return testinputs
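The per-class allocation (more inputs for more important classes, or the inverse) can be sketched as a proportional split; this helper is hypothetical and not the library's _generate_n_testinputs_list:

```python
def allocate(
    n_testinputs: int,
    prios: list[float],
    *,
    inverse_alloc: bool = False,
) -> list[int]:
    """Split n_testinputs across classes proportionally to priority.

    With inverse_alloc=True, less important classes receive more inputs.
    """
    weights = [1.0 / p if inverse_alloc else p for p in prios]
    total = sum(weights)
    counts = [int(n_testinputs * w / total) for w in weights]
    # Give any rounding remainder to the heaviest class so the sum is exact.
    counts[weights.index(max(weights))] += n_testinputs - sum(counts)
    return counts


# 10 inputs over priorities [1, 3, 1]: the middle class gets the most.
counts = allocate(10, [1.0, 3.0, 1.0])
```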