sklearn
Accuracy(features=None)
Bases: SelectMixin, LazyMixin, Metric
Accuracy classification score.
As defined by scikit-learn.
Initialize metric.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `features` | `list[str] \| None` | The features to calculate the metric for. If `None`, the metric uses all features in the data. | `None` |
Source code in src/flowcean/sklearn/metrics/classification.py
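As a rough illustration of what this score measures, the sketch below computes accuracy as the fraction of predictions that match the true labels, which is how scikit-learn defines its normalized accuracy score. The helper `accuracy` is a hypothetical plain-Python stand-in for illustration only; it is not part of flowcean or scikit-learn.

```python
def accuracy(y_true: list, y_pred: list) -> float:
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Count exact label matches and normalize by the number of samples.
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Three of the four labels match.
score = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
```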
ClassificationReport(features=None)
Bases: SelectMixin, LazyMixin, Metric
Build a text report showing the main classification metrics.
As defined by scikit-learn.
Initialize metric.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `features` | `list[str] \| None` | The features to calculate the metric for. If `None`, the metric uses all features in the data. | `None` |
Source code in src/flowcean/sklearn/metrics/classification.py
FBetaScore(*, beta=1.0, features=None)
Bases: SelectMixin, LazyMixin, Metric
F-beta score.
As defined by scikit-learn.
Initialize metric.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `beta` | `float` | The beta parameter, which weights recall relative to precision in the combined score. | `1.0` |
| `features` | `list[str] \| None` | The features to calculate the metric for. If `None`, the metric uses all features in the data. | `None` |
Source code in src/flowcean/sklearn/metrics/classification.py
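To make the role of `beta` concrete, the following sketch evaluates the standard F-beta formula, `(1 + beta^2) * P * R / (beta^2 * P + R)`, from a given precision `P` and recall `R`. The function `fbeta` is a hypothetical helper written for this illustration, and returning `0.0` when the denominator is zero is a choice made for the sketch, not a statement about how flowcean handles that case.

```python
def fbeta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Combine precision and recall into the F-beta score."""
    beta_sq = beta**2
    denominator = beta_sq * precision + recall
    if denominator == 0.0:
        # Assumption for this sketch: report 0.0 when the score is undefined.
        return 0.0
    return (1.0 + beta_sq) * precision * recall / denominator


# With beta=1 the score is the harmonic mean of precision and recall.
f1 = fbeta(precision=0.5, recall=1.0)
# With beta=2 recall counts more, so high recall lifts the score.
f2 = fbeta(precision=0.5, recall=1.0, beta=2.0)
```

Values of `beta` above 1 favor recall, and values below 1 favor precision.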
PrecisionScore(features=None)
Bases: SelectMixin, LazyMixin, Metric
Precision classification score.
As defined by scikit-learn.
Initialize metric.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `features` | `list[str] \| None` | The features to calculate the metric for. If `None`, the metric uses all features in the data. | `None` |
Source code in src/flowcean/sklearn/metrics/classification.py
Recall(features=None)
Bases: SelectMixin, LazyMixin, Metric
Recall classification score.
As defined by scikit-learn.
Initialize metric.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `features` | `list[str] \| None` | The features to calculate the metric for. If `None`, the metric uses all features in the data. | `None` |
Source code in src/flowcean/sklearn/metrics/classification.py
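For the binary case, precision and recall can be sketched from true-positive, false-positive, and false-negative counts. The helper `precision_recall` and its `positive` argument are hypothetical names used only for this illustration; the sketch assumes two classes and does not cover the averaging options scikit-learn offers for multiclass data.

```python
def precision_recall(y_true: list, y_pred: list, positive: int = 1) -> tuple:
    """Return (precision, recall) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of true positives that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


precision, recall = precision_recall([1, 1, 0, 0, 1], [1, 0, 1, 0, 1])
```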
MaxError(feature=None)
Bases: SelectMixin, LazyMixin, Metric
Max error regression loss.
As defined by scikit-learn.
Initialize MaxError metric.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `feature` | `str \| None` | The feature to calculate the metric for. If `None`, the metric expects a single feature in the data. | `None` |
Source code in src/flowcean/sklearn/metrics/regression.py
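Max error is the largest absolute difference between a true value and its prediction, which is why it applies to a single feature. A minimal plain-Python sketch, using a hypothetical helper `max_error` that is not the library's implementation:

```python
def max_error(y_true: list, y_pred: list) -> float:
    """Largest absolute residual between true and predicted values."""
    return max(abs(t - p) for t, p in zip(y_true, y_pred))


# Residuals are 1.0, 0.0, 0.0 and 3.0, so the worst case is 3.0.
worst = max_error([3.0, 2.0, 7.0, 1.0], [4.0, 2.0, 7.0, 4.0])
```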
MeanAbsoluteError(features=None, multioutput='raw_values')
Bases: SelectMixin, LazyMixin, MultiOutputMixin, Metric
Mean absolute error (MAE) regression loss.
As defined by scikit-learn.
Initialize metric.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `features` | `list[str] \| None` | The features to calculate the metric for. If `None`, the metric uses all features in the data. | `None` |
| `multioutput` | `Literal['raw_values', 'uniform_average']` | Defines how to aggregate multiple output values. See scikit-learn documentation for details. | `'raw_values'` |
Source code in src/flowcean/sklearn/metrics/regression.py
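The difference between the two `multioutput` settings can be sketched in plain Python: `'raw_values'` keeps one error per output column, while `'uniform_average'` averages those errors with equal weight. The function below is a hypothetical illustration of that aggregation, not flowcean's code, and the format in which flowcean returns the per-output values is not shown here.

```python
from statistics import fmean


def mean_absolute_error(y_true, y_pred, multioutput="raw_values"):
    """MAE over rows; each row is a list with one value per output."""
    n_outputs = len(y_true[0])
    # One MAE per output column.
    per_output = [
        fmean(abs(t[j] - p[j]) for t, p in zip(y_true, y_pred))
        for j in range(n_outputs)
    ]
    if multioutput == "raw_values":
        return per_output
    if multioutput == "uniform_average":
        # Equal-weight mean of the per-output errors.
        return fmean(per_output)
    raise ValueError(f"unsupported multioutput: {multioutput!r}")


y_true = [[0.5, 1.0], [-1.0, 1.0], [7.0, -6.0]]
y_pred = [[0.0, 2.0], [-1.0, 2.0], [8.0, -5.0]]
raw = mean_absolute_error(y_true, y_pred)
averaged = mean_absolute_error(y_true, y_pred, multioutput="uniform_average")
```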
MeanAbsolutePercentageError(features=None, multioutput='raw_values')
Bases: SelectMixin, LazyMixin, MultiOutputMixin, Metric
Mean absolute percentage error (MAPE) regression loss.
As defined by scikit-learn.
Initialize metric.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `features` | `list[str] \| None` | The features to calculate the metric for. If `None`, the metric uses all features in the data. | `None` |
| `multioutput` | `Literal['raw_values', 'uniform_average']` | Defines how to aggregate multiple output values. See scikit-learn documentation for details. | `'raw_values'` |
Source code in src/flowcean/sklearn/metrics/regression.py
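In scikit-learn's definition, MAPE is returned as a fraction rather than a value on a 0 to 100 scale, and the denominator is bounded below by machine epsilon so that true values of zero yield a very large number instead of a division error. The single-output sketch below follows that definition; `mape` is a hypothetical helper for illustration.

```python
import sys


def mape(y_true: list, y_pred: list) -> float:
    """Mean absolute percentage error as a fraction (0.1 means 10 %)."""
    eps = sys.float_info.epsilon  # guards against division by zero
    ratios = [abs(t - p) / max(abs(t), eps) for t, p in zip(y_true, y_pred)]
    return sum(ratios) / len(ratios)


# Both predictions are off by 10 % of the true value.
value = mape([100.0, 200.0], [110.0, 180.0])
```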
MeanSquaredError(features=None, multioutput='raw_values')
Bases: SelectMixin, LazyMixin, MultiOutputMixin, Metric
Mean squared error (MSE) regression loss.
As defined by scikit-learn.
Initialize metric.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `features` | `list[str] \| None` | The features to calculate the metric for. If `None`, the metric uses all features in the data. | `None` |
| `multioutput` | `Literal['raw_values', 'uniform_average']` | Defines how to aggregate multiple output values. See scikit-learn documentation for details. | `'raw_values'` |
Source code in src/flowcean/sklearn/metrics/regression.py
R2Score(features=None, multioutput='raw_values')
Bases: SelectMixin, LazyMixin, MultiOutputMixin, Metric
R^2 (coefficient of determination) regression score.
As defined by scikit-learn.
Initialize metric.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `features` | `list[str] \| None` | The features to calculate the metric for. If `None`, the metric uses all features in the data. | `None` |
| `multioutput` | `Literal['raw_values', 'uniform_average']` | Defines how to aggregate multiple output values. See scikit-learn documentation for details. | `'raw_values'` |
Source code in src/flowcean/sklearn/metrics/regression.py
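The coefficient of determination compares the squared residuals of the predictions with the squared deviations of the targets from their mean: `R^2 = 1 - SS_res / SS_tot`. The single-output sketch below uses a hypothetical helper `r2_score_single` and assumes the targets are not all identical, since `SS_tot` would otherwise be zero.

```python
from statistics import fmean


def r2_score_single(y_true: list, y_pred: list) -> float:
    """R^2 for one output: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        # Assumption for this sketch: constant targets are rejected.
        raise ValueError("R^2 is undefined for constant targets in this sketch")
    return 1.0 - ss_res / ss_tot


r2 = r2_score_single([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0])
```

A perfect prediction gives 1.0, predicting the mean of the targets gives 0.0, and worse predictions can give negative values.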
SciKitClassifierModel(estimator, *, output_names, threshold=0.5, name=None)
Bases: SciKitModel
A SciKit model for classifiers with probability predictions.
Supports threshold-based predictions via the threshold attribute and
exposes class probabilities via predict_proba. The estimator must
implement predict_proba.
Initialize the classifier model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `estimator` | `SupportsPredict` | The scikit-learn classifier (must support | *required* |
| `output_names` | `Iterable[str]` | The names of the output columns. | *required* |
| `threshold` | `float` | Decision threshold for the positive class (default: 0.5). | `0.5` |
| `name` | `str \| None` | The name of the model. | `None` |
Source code in src/flowcean/sklearn/model.py
predict_proba(input_features)
Predict class probabilities, applying preprocessing transforms.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `input_features` | `DataFrame \| LazyFrame` | The inputs for which to predict probabilities. | *required* |
Returns:
| Type | Description |
|---|---|
| `LazyFrame` | The predicted probabilities for the positive class. |
Source code in src/flowcean/sklearn/model.py
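The relationship between positive-class probabilities and threshold-based predictions can be sketched without any library. The helper `apply_threshold` is hypothetical, and two details are assumptions made for this illustration: that a probability equal to the threshold counts as positive, and that the classes are encoded as `0` and `1`. The actual model may differ on both points.

```python
def apply_threshold(positive_proba: list, threshold: float = 0.5) -> list:
    """Turn positive-class probabilities into 0/1 labels."""
    # Assumption: ties (probability == threshold) are labeled positive.
    return [1 if p >= threshold else 0 for p in positive_proba]


probabilities = [0.1, 0.4, 0.6, 0.9]
default_labels = apply_threshold(probabilities)
strict_labels = apply_threshold(probabilities, threshold=0.8)
```

Raising the threshold trades recall for precision, because fewer samples are labeled positive.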
SciKitModel(estimator, *, output_names, name=None)
Bases: Model
A model that wraps a scikit-learn estimator.
Initialize the model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `estimator` | `SupportsPredict` | The scikit-learn estimator. | *required* |
| `output_names` | `Iterable[str]` | The names of the output columns. | *required* |
| `name` | `str \| None` | The name of the model. | `None` |
Source code in src/flowcean/sklearn/model.py
RandomForestRegressorLearner(n_estimators=100, *, criterion='squared_error', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=1.0, max_leaf_nodes=None, min_impurity_decrease=0.0, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, ccp_alpha=0.0, max_samples=None, monotonic_cst=None, callbacks=None)
Bases: SupervisedLearner
Wrapper class for sklearn's RandomForestRegressor.
Reference: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html
Initialize the random forest learner.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n_estimators` | `int` | Number of trees in the forest. | `100` |
| `criterion` | `str` | Function to measure the quality of a split. | `'squared_error'` |
| `max_depth` | `int \| None` | Maximum depth of the tree. | `None` |
| `min_samples_split` | `int` | Minimum number of samples required to split an internal node. | `2` |
| `min_samples_leaf` | `int` | Minimum number of samples required to be at a leaf node. | `1` |
| `min_weight_fraction_leaf` | `float` | Minimum weighted fraction of the sum total of weights required to be at a leaf node. | `0.0` |
| `max_features` | `float` | Number of features to consider when looking for the best split. | `1.0` |
| `max_leaf_nodes` | `int \| None` | Grow trees with `max_leaf_nodes` in best-first fashion. | `None` |
| `min_impurity_decrease` | `float` | A node will be split if this split induces a decrease of the impurity greater than or equal to this value. | `0.0` |
| `bootstrap` | `bool` | Whether bootstrap samples are used when building trees. | `True` |
| `oob_score` | `bool` | Whether to use out-of-bag samples to estimate the R^2 on unseen data. | `False` |
| `n_jobs` | `int \| None` | Number of jobs to run in parallel. | `None` |
| `random_state` | `int \| None` | Controls the randomness of the estimator. | `None` |
| `verbose` | `int` | Controls the verbosity when fitting and predicting. | `0` |
| `warm_start` | `bool` | When set to `True`, reuse the solution of the previous call to fit. | `False` |
| `ccp_alpha` | `float` | Complexity parameter used for Minimal Cost-Complexity Pruning. | `0.0` |
| `max_samples` | `int \| float \| None` | If `bootstrap` is `True`, the number of samples to draw from X to train each base estimator. | `None` |
| `monotonic_cst` | `NDArray \| None` | Monotonicity constraints. | `None` |
| `callbacks` | `list[LearnerCallback] \| LearnerCallback \| None` | Optional callbacks for progress feedback. Use | `None` |
Source code in src/flowcean/sklearn/random_forest.py
learn(inputs, outputs)
Fit the random forest regressor on the given inputs and outputs.
Source code in src/flowcean/sklearn/random_forest.py
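The interplay of `bootstrap` and `max_samples` can be illustrated with a small sampling sketch. It assumes the rule scikit-learn is understood to apply: `None` draws as many rows as the training set has, an integer draws exactly that many, and a float draws that fraction of the rows (at least one). The helper names are hypothetical and the code is a plain-Python illustration, not the forest implementation.

```python
import random
from typing import Optional, Union


def bootstrap_sample_size(
    n_samples: int, max_samples: Optional[Union[int, float]]
) -> int:
    """Rows drawn per tree (assumed to mirror scikit-learn's rule)."""
    if max_samples is None:
        return n_samples
    if isinstance(max_samples, int):
        return max_samples
    # A float is read as a fraction of the training set, at least 1 row.
    return max(round(n_samples * max_samples), 1)


def bootstrap_indices(n_samples, max_samples, seed):
    """Draw row indices with replacement for one tree."""
    rng = random.Random(seed)
    size = bootstrap_sample_size(n_samples, max_samples)
    return rng.choices(range(n_samples), k=size)


indices = bootstrap_indices(n_samples=10, max_samples=0.5, seed=0)
```

Because rows are drawn with replacement, some rows typically appear more than once and others not at all; the rows left out are the out-of-bag samples that `oob_score` refers to.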
RegressionTree(*, dot_graph_export_path=None, criterion='squared_error', splitter='best', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, random_state=None, max_leaf_nodes=None, min_impurity_decrease=0.0, ccp_alpha=0.0, monotonic_cst=None, callbacks=None)
Bases: SupervisedLearner
Wrapper class for sklearn's DecisionTreeRegressor.
Reference: https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html
Initialize the regression tree learner.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dot_graph_export_path` | `None \| str` | Path to export the decision tree graph in Graphviz DOT format. | `None` |
| `criterion` | `str` | Function to measure the quality of a split. | `'squared_error'` |
| `splitter` | `str` | Strategy used to choose the split at each node. | `'best'` |
| `max_depth` | `int \| None` | Maximum depth of the tree. | `None` |
| `min_samples_split` | `int` | Minimum number of samples required to split an internal node. | `2` |
| `min_samples_leaf` | `int` | Minimum number of samples required to be at a leaf node. | `1` |
| `min_weight_fraction_leaf` | `float` | Minimum weighted fraction of the sum total of weights required to be at a leaf node. | `0.0` |
| `max_features` | `float \| None` | Number of features to consider when looking for the best split. | `None` |
| `random_state` | `int \| None` | Controls the randomness of the estimator. | `None` |
| `max_leaf_nodes` | `int \| None` | Grow a tree with `max_leaf_nodes` in best-first fashion. | `None` |
| `min_impurity_decrease` | `float` | A node will be split if this split induces a decrease of the impurity greater than or equal to this value. | `0.0` |
| `ccp_alpha` | `float` | Complexity parameter used for Minimal Cost-Complexity Pruning. | `0.0` |
| `monotonic_cst` | `NDArray \| None` | Monotonicity constraints. | `None` |
| `callbacks` | `list[LearnerCallback] \| LearnerCallback \| None` | Optional callbacks for progress feedback. Use | `None` |
Source code in src/flowcean/sklearn/regression_tree.py