test_pipeline

`TestPipeline(model, n_testinputs, test_coverage_criterium, dataset=None, specs_file=None, *, classification=False, inverse_alloc=False, epsilon=0.5, seed=42, performance_threshold=0.3, sample_limit=50000, n_predictions=50, max_depth=5, hoeffding_tree_extra_params=None)`

Workflow for test input generation in Flowcean.

Attributes:

`model_handler` (`ModelHandler`): Handles the Flowcean model and its predictions.

`model` (Decision Tree | Black-box Model): Underlying machine learning model extracted from the Flowcean model.

`dataset` (`pl.DataFrame`): The original training dataset.

`specs_handler` (`SystemSpecsHandler`): Extracts system specifications and feature information.

`dict`: Test requirements provided by the user.

`classification` (`bool`): Indicates whether the task is classification.

`list`: List of all equivalence classes.

`list`: List of all test plans (intervals used to sample test inputs).

`list`: List of all generated test inputs.

`list`: Number of test inputs to generate per equivalence class.

`pl.DataFrame`: Executable test inputs formatted for Flowcean.

`feature_names` (`list`): Names of all input features.

`hoeffding_tree` (`HoeffdingTreeRegressor`): Hoeffding tree used to approximate complex black-box models.

Methods:

`execute()`: Executes the full test input generation workflow.

`save_hoeffding_tree()`: Saves the generated Hoeffding tree to a file.

`save_test_overview()`: Saves intermediate results and generated outputs.

Initializes the TestPipeline.

Parameters:

Name	Type	Description	Default
`model`	`Model`	The trained Flowcean model.	required
`n_testinputs`	`int`	Total number of test inputs to generate.	required
`test_coverage_criterium`	`str`	Coverage strategy, either `"bva"` or `"dtc"`. Any other value raises `ValueError`.	required
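`dataset`	`DataFrame \| None`	Original training dataset. Required if `specs_file` is not provided, and always required when the extracted model is not a `DecisionTreeRegressor` or `DecisionTreeClassifier`.	`None`
`specs_file`	`Path \| None`	File containing system specifications. Required if `dataset` is not provided.	`None`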
`classification`	`bool`	Whether the task is classification.	`False`
`inverse_alloc`	`bool`	Whether to use inverse test allocation.	`False`
`epsilon`	`float`	Boundary offset used for bva.	`0.5`
`seed`	`int`	Random seed for reproducibility.	`42`
`performance_threshold`	`float`	Minimum surrogate performance.	`0.3`
`sample_limit`	`int`	Maximum number of surrogate samples.	`50000`
`n_predictions`	`int`	Consecutive correct predictions needed.	`50`
`max_depth`	`int`	Maximum Hoeffding tree depth.	`5`
`hoeffding_tree_extra_params`	`dict[str, Any] \| None`	Extra surrogate hyperparameters.	`None`
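
The bare `*` in the signature makes every option from `classification` onward keyword-only. The sketch below illustrates that calling convention with a hypothetical stand-in function, `pipeline_args`, which copies only the first few documented parameters and returns them as a dictionary. It is not part of Flowcean, and the string values passed for `model` and `specs_file` are placeholders.

```python
from typing import Any


def pipeline_args(
    model: Any,
    n_testinputs: int,
    test_coverage_criterium: str,
    dataset: Any = None,
    specs_file: Any = None,
    *,
    classification: bool = False,
    inverse_alloc: bool = False,
    epsilon: float = 0.5,
    seed: int = 42,
) -> dict[str, Any]:
    """Hypothetical stand-in mirroring the start of the documented signature."""
    return {
        "n_testinputs": n_testinputs,
        "test_coverage_criterium": test_coverage_criterium,
        "classification": classification,
        "inverse_alloc": inverse_alloc,
        "epsilon": epsilon,
        "seed": seed,
    }


# Options after `*` are accepted when passed by keyword.
ok = pipeline_args(
    "model", 100, "bva", None, "specs.json", classification=True, seed=7
)

# A sixth positional argument is rejected by Python before any body code runs.
try:
    pipeline_args("model", 100, "bva", None, "specs.json", True)
except TypeError as exc:
    positional_error = str(exc)
```

Defaults that are not overridden, such as `epsilon`, keep the values listed in the table above.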

Source code in src/flowcean/testing/generator/ddtig/application/test_pipeline.py

def __init__(
    self,
    model: Model,
    n_testinputs: int,
    test_coverage_criterium: str,
    dataset: pl.DataFrame | None = None,
    specs_file: Path | None = None,
    *,
    classification: bool = False,
    inverse_alloc: bool = False,
    epsilon: float = 0.5,
    seed: int = 42,
    performance_threshold: float = 0.3,
    sample_limit: int = 50000,
    n_predictions: int = 50,
    max_depth: int = 5,
    hoeffding_tree_extra_params: dict[str, Any] | None = None,
) -> None:
    """Initializes the TestPipeline.

    Args:
        model: The trained Flowcean model.
        n_testinputs: Total number of test inputs to generate.
        test_coverage_criterium: Coverage strategy, either bva or dtc.
        dataset: Original training dataset.
            Required if specs_file is not provided.
        specs_file: File containing system specifications.
            Required if dataset is not provided.
        classification: Whether the task is classification.
        inverse_alloc: Whether to use inverse test allocation.
        epsilon: Boundary offset used for bva.
        seed: Random seed for reproducibility.
        performance_threshold: Minimum surrogate performance.
        sample_limit: Maximum number of surrogate samples.
        n_predictions: Consecutive correct predictions needed.
        max_depth: Maximum Hoeffding tree depth.
        hoeffding_tree_extra_params: Extra surrogate hyperparameters.
    """
    self.model_handler = ModelHandler(model)
    self.model = self.model_handler.get_ml_model()
    if test_coverage_criterium not in ["bva", "dtc"]:
        msg = "Invalid test coverage criterium. Expected 'bva' or 'dtc'."
        raise ValueError(msg)

    if (
        type(self.model) is not DecisionTreeRegressor
        and type(self.model) is not DecisionTreeClassifier
        and dataset is None
    ):
        msg = "Missing required parameter: 'dataset'"
        raise ValueError(msg)
    if dataset is None and specs_file is None:
        msg = "Missing required parameter: 'dataset' or 'specs_file'"
        raise ValueError(msg)
    self.n_testinputs = n_testinputs
    self.test_coverage_criterium = test_coverage_criterium
    self.dataset = dataset
    self.specs_handler = SystemSpecsHandler(
        data=dataset,
        specs_file=specs_file,
    )
    self.feature_names = self.specs_handler.extract_feature_names()
    self.hoeffding_tree = None
    self.classification = classification
    self.inverse_alloc = inverse_alloc
    self.seed = seed
    self.epsilon = epsilon
    self.performance_threshold = performance_threshold
    self.sample_limit = sample_limit
    self.n_predictions = n_predictions
    self.max_depth = max_depth
    self.hoeffding_tree_extra_params = (
        hoeffding_tree_extra_params
        if hoeffding_tree_extra_params is not None
        else {}
    )
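
The order of the three argument checks in the constructor determines which error a caller sees first. As a minimal sketch, the hypothetical helper `check_arguments` below replays those checks on plain booleans and returns the message instead of raising. The flag `is_decision_tree` stands for the exact type comparison against `DecisionTreeRegressor` and `DecisionTreeClassifier`; the helper is an illustration and not part of Flowcean.

```python
from typing import Optional


def check_arguments(
    is_decision_tree: bool,
    test_coverage_criterium: str,
    has_dataset: bool,
    has_specs_file: bool,
) -> Optional[str]:
    """Return the message the constructor would raise, or None if all pass.

    Hypothetical helper: it mirrors the checks in __init__ in the same order.
    """
    # Check 1: only the two documented coverage criteria are accepted.
    if test_coverage_criterium not in ("bva", "dtc"):
        return "Invalid test coverage criterium. Expected 'bva' or 'dtc'."
    # Check 2: anything other than a plain decision tree needs the dataset,
    # even when a specs file is supplied.
    if not is_decision_tree and not has_dataset:
        return "Missing required parameter: 'dataset'"
    # Check 3: a decision tree still needs at least one source of specs.
    if not has_dataset and not has_specs_file:
        return "Missing required parameter: 'dataset' or 'specs_file'"
    return None


tree_with_specs = check_arguments(True, "bva", False, True)
black_box_with_specs = check_arguments(False, "dtc", False, True)
tree_with_nothing = check_arguments(True, "dtc", False, False)
unknown_criterium = check_arguments(True, "mcdc", True, True)
```

Here `tree_with_specs` is `None`, while `black_box_with_specs` reports the missing `dataset`: a specs file alone is enough only for decision tree models.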

`execute()`

Run test input generation with the initialized parameters.

Returns:

Type	Description
`DataFrame`	Executable test inputs formatted for Flowcean.

Source code in src/flowcean/testing/generator/ddtig/application/test_pipeline.py

def execute(self) -> pl.DataFrame:
    """Run test input generation with the initialized parameters.

    Returns:
        Executable test inputs formatted for Flowcean.
    """
    return self._execute(
        test_coverage_criterium=self.test_coverage_criterium,
        n_testinputs=self.n_testinputs,
        inverse_alloc=self.inverse_alloc,
        epsilon=self.epsilon,
        performance_threshold=self.performance_threshold,
        sample_limit=self.sample_limit,
        n_predictions=self.n_predictions,
        max_depth=self.max_depth,
        **self.hoeffding_tree_extra_params,
    )
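
Because `execute()` unpacks `hoeffding_tree_extra_params` with `**` next to explicit keywords such as `max_depth`, a key in the extra dictionary that repeats one of those keywords makes the call fail with a `TypeError`. The sketch below reproduces this standard Python behavior with a hypothetical stand-in for `_execute` that only echoes its arguments. The real method's signature is not shown in the source, and `grace_period` is an illustrative key, not taken from it.

```python
from typing import Any, Optional


def _execute(*, max_depth: int, **tree_params: Any) -> dict[str, Any]:
    """Hypothetical stand-in for the private method; it echoes its input."""
    return {"max_depth": max_depth, **tree_params}


def run(
    max_depth: int = 5,
    extra: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    # Same None-to-empty-dict normalization as in the constructor.
    extra = extra if extra is not None else {}
    # Explicit keywords and unpacked extras are merged into a single call.
    return _execute(max_depth=max_depth, **extra)


defaults_only = run()
with_extra = run(extra={"grace_period": 100})

# Repeating an explicit keyword inside the extras is an error.
try:
    run(extra={"max_depth": 3})
except TypeError as exc:
    duplicate_error = str(exc)
```

Under this reading, settings that already have a dedicated constructor parameter belong in that parameter, and the extra dictionary is for everything else.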