domain
TestTree(model_tree, specs_handler)
dataclass
Represents a tree structure used for generating test inputs.
Attributes:
test_tree: dict Dictionary representing the structure of a River or scikit-learn tree.
Methods:
get_n_samples() Returns the total number of samples used to train the tree.
Initializes the TestTree from a model tree.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_tree | HoeffdingTreeRegressor \| HoeffdingTreeClassifier \| Tree | A River or scikit-learn decision tree. | required |
| specs_handler | SystemSpecsHandler | Object for accessing feature specifications. | required |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/base/tree.py
get_n_samples()
Returns the total number of samples used to train the tree.
Returns:
| Type | Description |
|---|---|
| int | Total number of samples. |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/base/tree.py
DataModel(data, seed, model_handler, specs_handler)
Generate synthetic samples from the training data distribution.
Attributes:
data: pl.DataFrame Original training data used in the Flowcean model.
list
Names of the columns in the training data.
int
Number of features in the dataset.
ModelHandler
ModelHandler object used to produce predictions.
list
List of indices for features of type int.
Methods:
generate_dataset() Generate random samples based on data distribution, or use original data.
Initializes the DataModel.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | Original training data used in the Flowcean model. | required |
| seed | int | Random seed for reproducibility. | required |
| model_handler | ModelHandler | ModelHandler object used to produce predictions. | required |
| specs_handler | SystemSpecsHandler | SystemSpecsHandler object storing system specifications. | required |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/mut/data_model.py
generate_dataset(*, original_data=False, n_samples=0)
Generates a dataset of inputs and corresponding model predictions.
If original_data is True, the original training data is used. Otherwise, n_samples synthetic samples are drawn using kernel density estimation (KDE).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| original_data | bool | Whether to use original training data or generate synthetic samples. | False |
| n_samples | int | Number of synthetic samples to generate. | 0 |
Returns:
| Type | Description |
|---|---|
| list | List of tuples containing input dictionaries and model outputs. Example for n_samples = 1: [({'Length': 0.5093, 'Diameter': 0.3886, 'Height': 0.1106, 'M': 0}, 8.6006)] |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/mut/data_model.py
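The KDE-based sampling described for generate_dataset can be illustrated with a small standard-library sketch. This is not the Flowcean implementation: the helper names (sample_gaussian_kde, sketch_generate_dataset), the per-column univariate Gaussian kernel, the normal-reference bandwidth rule, and the predict callable are all assumptions made for illustration.

```python
import random
import statistics


def sample_gaussian_kde(column, n_samples, seed):
    """Draw n_samples values from a 1-D Gaussian KDE fitted to `column`."""
    rng = random.Random(seed)
    # Assumed bandwidth choice: normal-reference rule of thumb.
    bandwidth = 1.06 * statistics.stdev(column) * len(column) ** (-1 / 5)
    # Sampling from a Gaussian KDE: pick an observed value uniformly at
    # random, then perturb it with Gaussian noise of scale `bandwidth`.
    return [rng.gauss(rng.choice(column), bandwidth) for _ in range(n_samples)]


def sketch_generate_dataset(columns, predict, n_samples, seed):
    """Return (input dict, prediction) tuples, mirroring the documented output shape."""
    sampled = {
        name: sample_gaussian_kde(values, n_samples, seed + offset)
        for offset, (name, values) in enumerate(columns.items())
    }
    rows = [{name: sampled[name][k] for name in columns} for k in range(n_samples)]
    return [(row, predict(row)) for row in rows]


# Hypothetical training columns and a stand-in for the model under test.
training = {
    "Length": [0.455, 0.35, 0.53, 0.44, 0.33, 0.425],
    "Diameter": [0.365, 0.265, 0.42, 0.365, 0.255, 0.3],
}
dataset = sketch_generate_dataset(
    training, predict=lambda row: 10 * row["Length"], n_samples=3, seed=42
)
```

Sampling each column independently ignores correlations between features; a multivariate KDE would preserve them at the cost of a more involved bandwidth choice.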
HoeffdingTree(inputs, seed, model_handler, specs_handler)
Train a Hoeffding Tree on synthetic samples.
Samples are generated from another model.
Attributes:
datamodel: DataModel Object used to generate synthetic training inputs based on the original dataset.
list
Original training inputs transformed to River-compatible format with predictions.
list
List of indices for nominal features.
Methods:
train_tree() Trains a Hoeffding Tree and returns the trained model.
Initializes the HoeffdingTree trainer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| inputs | DataFrame | Original training dataset including target column. | required |
| seed | int | Random seed for reproducible synthetic sample generation. | required |
| model_handler | ModelHandler | Object used to generate predictions from the Flowcean model. | required |
| specs_handler | SystemSpecsHandler | Object containing feature specifications and metadata. | required |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/mut/hoeffding_tree.py
train_tree(performance_threshold, sample_limit, n_predictions, *, classification, **kwargs)
Train a Hoeffding Tree using synthetic samples.
Continue until performance criteria are met.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| performance_threshold | float | Minimum performance required to finalize the model. | required |
| sample_limit | int | Maximum number of samples to use during training. | required |
| n_predictions | int | Number of consecutive correct predictions required to stop training. | required |
| classification | bool | Indicates whether the task is classification or regression. | required |
| **kwargs | Any | Additional hyperparameters for the Hoeffding Tree model. | {} |
Returns:
| Type | Description |
|---|---|
| HoeffdingTreeRegressor \| HoeffdingTreeClassifier | Trained Hoeffding Tree model. |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/mut/hoeffding_tree.py
EquivalenceClassesHandler(test_tree, minmax_values_specs, n_features)
A class used to extract equivalence classes from a decision tree.
Attributes:
test_tree: TestTree The decision tree structure.
dict
Dictionary storing min/max values for each feature from specifications.
int
Number of samples used to train the tree.
int
Number of features in the dataset.
Methods:
get_equivalence_classes() Extracts and formats equivalence classes from the decision tree.
to_str(eqclass) Converts a single equivalence class to a string.
to_strs(eqclasses, feature_names) Converts a list of equivalence classes to a readable string format.
Initializes the EquivalenceClassesHandler.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| test_tree | Any | The decision tree used for extracting equivalence classes. | required |
| minmax_values_specs | dict | Dictionary containing min/max values for each feature. | required |
| n_features | int | Number of features in the dataset. | required |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/eqclass_handler.py
get_equivalence_classes()
Extracts and formats equivalence classes from the decision tree.
Returns:
| Type | Description |
|---|---|
| list | List of formatted equivalence classes. |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/eqclass_handler.py
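As a rough illustration of how equivalence classes can be read off a binary decision tree, the sketch below walks every root-to-leaf path and narrows per-feature bounds at each split. The nested-dict tree layout, the function name, and the "value <= threshold goes left" convention are assumptions; the actual TestTree structure may differ.

```python
def extract_equivalence_classes(node, bounds):
    """Return one {feature: (low, high)} mapping per leaf of a binary tree.

    Each pair is read as the half-open interval (low, high]; treating the
    outermost lower bound the same way is a simplification of this sketch.
    """
    if "feature" not in node:  # leaf: the accumulated bounds form one class
        return [dict(bounds)]
    feature, threshold = node["feature"], node["threshold"]
    low, high = bounds[feature]
    # Left branch: feature <= threshold, so the upper bound shrinks.
    left = extract_equivalence_classes(
        node["left"], {**bounds, feature: (low, min(high, threshold))}
    )
    # Right branch: feature > threshold, so the lower bound grows.
    right = extract_equivalence_classes(
        node["right"], {**bounds, feature: (max(low, threshold), high)}
    )
    return left + right


# Hypothetical tree with two splits and three leaves.
tree = {
    "feature": 0,
    "threshold": 0.5,
    "left": {"value": 7.0},
    "right": {
        "feature": 1,
        "threshold": 0.2,
        "left": {"value": 9.0},
        "right": {"value": 11.0},
    },
}
# Min/max values per feature index, as they might come from the specifications.
minmax_values = {0: (0.0, 1.0), 1: (0.0, 0.4)}
classes = extract_equivalence_classes(tree, minmax_values)
```

Each leaf yields exactly one equivalence class, so the number of classes equals the number of leaves.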
to_str(eqclass)
staticmethod
Converts a single equivalence class to a string.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| eqclass | tuple | A tuple of Interval objects. | required |
Returns:
| Type | Description |
|---|---|
| str | String representation of the equivalence class. |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/eqclass_handler.py
to_strs(eqclasses, feature_names)
staticmethod
Converts a list of equivalence classes to a readable string format.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| eqclasses | list | List of equivalence classes. | required |
| feature_names | list | List of feature names. | required |
Returns:
| Type | Description |
|---|---|
| str | Formatted string of all equivalence classes. |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/eqclass_handler.py
is_subset(eqclass1, eqclass2)
staticmethod
Compares interval ranges of features between classes.
Compares the interval ranges of all features between two equivalence classes and determines which one is a subset of the other.
An equivalence class is considered a subset only if all its intervals are strictly contained within the corresponding intervals of the other class. The method returns the superset equivalence class if such a relationship exists.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| eqclass1 | tuple | First equivalence class (tuple of Interval objects). | required |
| eqclass2 | tuple | Second equivalence class (tuple of Interval objects). | required |
Returns:
| Type | Description |
|---|---|
| tuple \| None | The equivalence class that is the superset, or None if neither is a subset of the other. |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/eqclass_handler.py
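The subset comparison between two equivalence classes can be sketched as follows. For brevity, each interval is a plain (low, high) pair and containment is checked on the bounds only, non-strictly and without regard to open or closed endpoints; the real method works on Interval objects and is documented as requiring strict containment. The function names are hypothetical.

```python
def contains(outer, inner):
    """True if the (low, high) pair `inner` lies within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def superset_class(eqclass1, eqclass2):
    """Return whichever class contains the other on every feature, else None."""
    pairs = list(zip(eqclass1, eqclass2))
    if all(contains(second, first) for first, second in pairs):
        return eqclass2  # eqclass1 is the subset
    if all(contains(first, second) for first, second in pairs):
        return eqclass1  # eqclass2 is the subset
    return None


narrow = ((0.2, 0.4), (1, 2))
wide = ((0.0, 0.5), (0, 3))
mixed = ((0.0, 0.5), (1.5, 5))  # wider on feature 0, not comparable on feature 1
```

The relationship must hold for every feature at once: mixed contains narrow on the first feature but not on the second, so the two classes are not comparable.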
Interval(feature, left_endpoint, right_endpoint, min_value, max_value)
Represents an interval for a specific feature.
The interval belongs to one equivalence class.
Attributes:
feature: int Index of the feature to which the interval belongs.
IntervalEndpoint
Indicates whether the interval is left-open or left-closed.
IntervalEndpoint
Indicates whether the interval is right-open or right-closed.
int | float
Lower bound of the interval.
int | float
Upper bound of the interval.
Initializes an Interval object.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| feature | int | Index of the feature to which the interval belongs. | required |
| left_endpoint | IntervalEndpoint | Left endpoint type ('(' for open, '[' for closed). | required |
| right_endpoint | IntervalEndpoint | Right endpoint type (')' for open, ']' for closed). | required |
| min_value | float | Lower bound of the interval. | required |
| max_value | float | Upper bound of the interval. | required |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/interval.py
__str__()
Returns a string representation of the interval.
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/interval.py
is_closed()
Checks if the interval is fully closed [a, b].
Returns:
| Type | Description |
|---|---|
| bool | True if both endpoints are closed. |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/interval.py
is_open()
Checks if the interval is fully open (a, b).
Returns:
| Type | Description |
|---|---|
| bool | True if both endpoints are open. |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/interval.py
is_right_open()
Checks if the interval is right-open [a, b).
Returns:
| Type | Description |
|---|---|
| bool | True if left is closed and right is open. |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/interval.py
is_left_open()
Checks if the interval is left-open (a, b].
Returns:
| Type | Description |
|---|---|
| bool | True if left is open and right is closed. |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/interval.py
is_subset(interval_a, interval_b)
staticmethod
Determines which interval is a subset of the other.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| interval_a | Interval | First interval to compare. | required |
| interval_b | Interval | Second interval to compare. | required |
Returns:
| Type | Description |
|---|---|
| Interval \| None | The superset interval if one contains the other, otherwise None. |
Source code in src/flowcean/testing/generator/ddtig/domain/model_analyser/surrogate/interval.py
IntervalEndpoint
Bases: Enum
Enum representing the types of interval endpoints.
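To make the endpoint predicates concrete, here is a minimal stand-alone sketch of an interval with typed endpoints. The enum member names, the class name SketchInterval, and the string format are assumptions; only the bracket symbols and the meaning of each predicate come from the descriptions above.

```python
from dataclasses import dataclass
from enum import Enum


class Endpoint(Enum):
    """Hypothetical stand-in for IntervalEndpoint; member names are assumed."""

    LEFT_OPEN = "("
    LEFT_CLOSED = "["
    RIGHT_OPEN = ")"
    RIGHT_CLOSED = "]"


@dataclass(frozen=True)
class SketchInterval:
    left: Endpoint
    right: Endpoint
    min_value: float
    max_value: float

    def is_closed(self) -> bool:  # [a, b]
        return self.left is Endpoint.LEFT_CLOSED and self.right is Endpoint.RIGHT_CLOSED

    def is_open(self) -> bool:  # (a, b)
        return self.left is Endpoint.LEFT_OPEN and self.right is Endpoint.RIGHT_OPEN

    def is_right_open(self) -> bool:  # [a, b)
        return self.left is Endpoint.LEFT_CLOSED and self.right is Endpoint.RIGHT_OPEN

    def is_left_open(self) -> bool:  # (a, b]
        return self.left is Endpoint.LEFT_OPEN and self.right is Endpoint.RIGHT_CLOSED

    def __str__(self) -> str:
        return f"{self.left.value}{self.min_value}, {self.max_value}{self.right.value}"


half_open = SketchInterval(Endpoint.LEFT_CLOSED, Endpoint.RIGHT_OPEN, 0.0, 1.5)
```

Exactly one of the four predicates holds for any interval, since each corresponds to one combination of left and right endpoint types.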
TestCompiler(n_features, testinputs)
Transforms abstract test inputs into executable test inputs.
Compatible with Flowcean models.
Attributes:
n_features: int Number of features in the dataset.
list
List of abstract test inputs.
Methods:
compute_executable_testinputs() Converts abstract test inputs into a polars DataFrame for execution.
Initializes the TestCompiler.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| n_features | int | Number of features in the dataset. | required |
| testinputs | list | List of abstract test inputs. | required |
Source code in src/flowcean/testing/generator/ddtig/domain/test_generator/testcomp.py
compute_executable_testinputs(feature_names)
Convert abstract test inputs into a Polars DataFrame.
The resulting DataFrame can be executed on Flowcean models.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| feature_names | list | List of feature names in order of their indices. | required |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame where each column represents a feature and each row represents a test input. |
Source code in src/flowcean/testing/generator/ddtig/domain/test_generator/testcomp.py
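The row-to-column conversion behind this step can be sketched without any dependencies. The helper name to_columns is hypothetical; it produces a mapping from feature name to column values, which is a shape the polars DataFrame constructor accepts.

```python
def to_columns(testinputs, feature_names):
    """Turn a list of row tuples into a {feature name: column values} mapping."""
    if any(len(row) != len(feature_names) for row in testinputs):
        raise ValueError("every test input needs one value per feature")
    return {
        name: [row[index] for row in testinputs]
        for index, name in enumerate(feature_names)
    }


# Abstract test inputs: one tuple per test input, ordered by feature index.
abstract_inputs = [(1, 2, 3), (11, 22, 33), (87, 29, 38)]
columns = to_columns(abstract_inputs, ["Length", "Diameter", "Height"])
```

The order of feature_names matters: the value at position i of each tuple ends up in the column named by feature_names[i].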
TestGenerator(equivalence_classes, seed, type_specs)
A class that generates abstract test inputs for binary decision trees.
Attributes:
equivalence_classes: list Equivalence classes extracted from the decision tree.
dict
Input types for each feature as defined in the specifications.
list
List of test plans used to sample test inputs.
list
Number of test inputs to generate for each equivalence class.
Methods:
generate_testinputs() Generates abstract test inputs based on the selected coverage strategy.
Initializes the TestGenerator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| equivalence_classes | list | Equivalence classes extracted from the decision tree. | required |
| seed | int | The random seed to use for reproducible test input generation. | required |
| type_specs | dict | Input types for each feature from the specifications. | required |
Source code in src/flowcean/testing/generator/ddtig/domain/test_generator/testgen.py
generate_testinputs(test_coverage_criterium, eqclass_prio, n_testinputs, *, inverse_alloc, epsilon)
Generates abstract test inputs for all equivalence classes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| test_coverage_criterium | str | Coverage strategy ("bva" or "dtc"). | required |
| eqclass_prio | list | Importance scores for each equivalence class. | required |
| n_testinputs | int | Total number of test inputs to generate. | required |
| inverse_alloc | bool | If True, allocate more inputs to less important classes. | required |
| epsilon | float | Offset for BVA sampling. | required |
Returns:
| Type | Description |
|---|---|
| list | List of abstract test inputs. Each test input is a tuple of feature values. E.g.: [(1,2,3), (11,22,33), (87,29,38)] |
Source code in src/flowcean/testing/generator/ddtig/domain/test_generator/testgen.py
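One plausible way to split n_testinputs across equivalence classes according to eqclass_prio and inverse_alloc is proportional allocation with largest-remainder rounding, sketched below. The documentation does not state which allocation scheme TestGenerator uses, so the function name, the reciprocal weighting for inverse_alloc, and the rounding rule are all assumptions.

```python
def allocate_testinputs(eqclass_prio, n_testinputs, *, inverse_alloc=False):
    """Split n_testinputs across classes in proportion to their priority.

    With inverse_alloc=True the reciprocal of each priority is used, so less
    important classes receive more inputs. Priorities must be positive.
    """
    weights = [1.0 / p for p in eqclass_prio] if inverse_alloc else list(eqclass_prio)
    total = sum(weights)
    exact = [n_testinputs * w / total for w in weights]
    counts = [int(share) for share in exact]  # round every share down first
    # Hand the remaining inputs to the classes with the largest fractional part.
    leftover = n_testinputs - sum(counts)
    by_remainder = sorted(
        range(len(exact)), key=lambda i: exact[i] - counts[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


direct = allocate_testinputs([5, 3, 2], 10)
inverse = allocate_testinputs([5, 3, 2], 10, inverse_alloc=True)
```

Largest-remainder rounding guarantees that the per-class counts always add up to exactly n_testinputs, which plain rounding of each share does not.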