One-Tank Example
This example shows how to train two different models to predict the time-series behavior of a single water tank with a height-dependent outflow and a time-varying inflow. It is inspired by the one-tank simulink model used for control design 1. A schematic drawing of the system is shown in the following figure.
The complete source code for this example can be found in the repository. See also Run this example on how to run this example locally.
The dynamics of the system follow an ordinary differential equation (ODE). ODEs describe the derivative of a variable (e.g. it's change over time) as a function of the variable itself.
For this example the system is described by the equation
where \(x\) is the water level in the tank, \(\dot{x}\) is the change of the water level over time, \(V(t)\) is the time-dependent inflow, \(A\) is the tank area, and \(a\) and \(b\) are scaling constants for the equation. The solution of an ODE is not a single value, but a function (here \(x(t)\)) or a series of it's values for different times \(t\). As solving this ODE analytically is quite complicated, a numerical solver is used which compute solution points starting from an initial value. In this example, the initial value is the initial level of the liquid \(x(0) = x_0\) in the tank.
The free parameters from the above equation are set to
\(A\) | \(b\) | \(a\) |
---|---|---|
\(5\) | \(2\) | \(0.5\) |
The inflow is given by \(V(t) = \mathrm{max}\left(0, \sin\left( 2 \pi \frac{1}{10} t \right)\right)\) and the initial condition is \(x_0 = 1\). Using a suitable numerical solution algorithm, the equation can be solved for the level \(x_n\). Since the solution is not continuous, the level is not a function of time, but a discrete function of the sample number \(n\). The corresponding time can be calculated by multiplying the sample number \(n\) by the step size \(T\) between two samples. The graph below shows the development of the water level \(x\) from zero to ten seconds.
Learning Models
After setting up the simulation, we want to use two different learners to predict the level of the tank \(x[n]\) given the current input \(V[n]\) and the level and input in the previous two time steps. The unknown function we are looking for and that we want to learn is
To do this, we first need data to learn the functions representation in Flowcean.
Normally this data would be recorded from a real CPS and imported into the framework as a CSV, ROS bag or something similar.
However, since we know the differential equation describing the system behavior, we can also use this equation to generate data.
We can do this by using an ODEEnvironment
to model the ODE as an IncrementalEnvironment
within the framework.
To do so, a special OneTank
class is created which inherits from the general flowcean OdeSystem
class.
class OneTank(OdeSystem[TankState]):
def __init__(
self,
*,
area: float,
outflow_rate: float,
inflow_rate: float,
initial_t: float = 0,
initial_state: TankState,
) -> None:
super().__init__(
initial_t,
initial_state,
)
self.area = area
self.outflow_rate = outflow_rate
self.inflow_rate = inflow_rate
@override
def flow(
self,
t: float,
state: NDArray[np.float64],
) -> NDArray[np.float64]:
pump_voltage = np.max([0, np.sin(2 * np.pi * 1 / 10 * t)])
tank = TankState.from_numpy(state)
d_level = (
self.inflow_rate * pump_voltage
- self.outflow_rate * np.sqrt(tank.water_level)
) / self.area
return np.array([d_level])
The OneTank
class describes the differential equation in the flow
as well as all parameters needed to evaluate it.
The type parameter TankState
is used to map the general numpy array holding the current state of the simulation to a more tangible representation.
The state of an ODESystem
can also be used for systems with multiple states, where the behavior might change when certain conditions are met.
For this example with only a single state the TankState
class simply maps the water level in the tank to the first entry in the state vector
class TankState(State):
water_level: float
@override
def as_numpy(self) -> NDArray[np.float64]:
return np.array([self.water_level])
@classmethod
@override
def from_numpy(cls, state: NDArray[np.float64]) -> Self:
return cls(state[0])
The ODESystem
can now be constructed by creating an instance of the OneTank
class with the parameters given above
system = OneTank(
area=5,
outflow_rate=0.5,
inflow_rate=2,
initial_state=TankState(water_level=1),
)
From the system, an OdeEnvironment
can be constructed
data_incremental = OdeEnvironment(
system,
dt=0.1,
map_to_dataframe=lambda ts, xs: pl.DataFrame(
{
"t": ts,
"h": [x.water_level for x in xs],
},
),
).load()
Beside the ODE, the time resolution dt
and the mapping function map_to_dataframe
are passed to the constructor.
The mapping function describes how the generated solutions for different points in time can be mapped into a data frame for further processing within Flowcean.
The generated output of the OdeEnvironment
environment has the form
\(x\) | \(V\) |
---|---|
\(x[0]\) | \(V[0]\) |
\(x[1]\) | \(V[1]\) |
\(\dots\) | \(\dots\) |
\(x[N]\) | \(V[N]\) |
Since the learners we will use later only support learning on a fixed amount of data (called "offline learners" in the framework), we need to convert the incremental dataset into a fixed size dataset.
This can be done by calling the collect(N)
method on any IncrementalEnvironment
to get \(N\) samples and feed those into a Dataset
.
data = Dataset(data_incremental.load().take(250))
Until now, the data is in a time series format with each row representing a sampled value at the step \(n\).
However, for our prediction of the current fill level \(x[n]\), as described by the equation above, we need the current input \(V[n]\) and the values of the two previous time steps as a single sample.
To achieve this we use a SlidingWindow
transform.
See the linked documentation for a more detailed explanation of how the transform works.
data = data.with_transform(SlidingWindow(window_size=3))
Now that the data is in the correct format, it can be split into a test set with 80% of the samples and a training set with the remaining 20%.
This is done by using a TrainTestSplit
operation and helps with evaluating the learned models performance after training.
To make the learning less biased, the samples are shuffled before splitting.
train, test = TrainTestSplit(ratio=0.8, shuffle=True).split(data)
With the training data generated, fully transformed and split it's time to use learning algorithms to learn the prediction function from the beginning of this section. We use two different learners which both belong the category of offline learners.
First, a regression tree is used to learn a model.
The implementation of this learner is part of the scikit-learn library.
The learned model consists of a sequence of binary questions / comparisons that lead to the model result.
The maximum depth, i.e. the number of questions asked on each path, is limited to five.
The learner class is created and the helper method learn_offline
called to start the training process
regression_learner = RegressionTree(max_depth=5)
regression_model = learn_offline(
train,
regression_learner,
inputs,
outputs,
)
The inputs
and outputs
variables contain the names of the input and output fields in the `train' dataframe.
Secondly a multi-layer perceptron is used to create a model. This type of model consists of a set of neurones arranged in layers which are connected with the previous layer. The value of each neuron is calculated by weighting and summing up the values of the neurons in the previous layer and applying a non-linear function; in this case a leaky ReLU function. The result can be read from the neurons of the last layer. The implementation of this learner uses the lightning framework which is a high-level wrapper around the well known PyTorch library.
perceptron_learner = LightningLearner(
module=MultilayerPerceptron(
learning_rate=1e-3,
input_size=len(inputs),
output_size=len(outputs),
hidden_dimensions=[10, 10],
),
max_epochs=100,
)
perceptron_model = learn_offline(
train,
perceptron_learner,
inputs,
outputs,
)
The final step is to evaluate the obtained models.
This is done to estimate how well they are able to describe the unknown function as described above.
Flowcean ships with a couple of different metrics which can be used for this purpose.
Depending on the underlying problem, different metrics can be reasonable to apply.
For this example the MeanAbsoluteError
and MeanSquaredError
error are used.
These metrics are useful when the output of the learned function is a (more or less) continuous value and the deviation from the actual value is of interest.
The helper method evaluate_offline
allows for easy evaluation of multiple metrics for a learned model.
regression_report = evaluate_offline(
regression_model,
test,
inputs,
outputs,
[MeanAbsoluteError(), MeanSquaredError()],
)
perceptron_report = evaluate_offline(
perceptron_model,
test,
inputs,
outputs,
[MeanAbsoluteError(), MeanSquaredError()],
)
For this example, the resulting metrics are about
Learner typ | Runtime | Mean Absolute Error | Mean Squared Error |
---|---|---|---|
Regression Tree | \(15.5\: \mathrm{ms}\) | \(0.0206\) | \(0.0006\) |
Multi-layer Perceptron | \(813\: \mathrm{ms}\) | \(0.0639\) | \(0.00054\) |
Depending on the size of the dataset, the way the train and test set are split and shuffled, the learners configuration and other random facts, these values may vary. However, it is clear, that both learners produced models with relative small errors (\(\sim 2\%\) and \(\sim 6\%\)) which could be used for tasks such as prediction.
Run this example
To run this example first make sure you followed the installation instructions to setup python and just
.
Afterwards you can either use just
or run the examples from source.
Just
The easiest way to run this example is using just
.
Follow the installation guide to clone flowcean but stop before installing it or any of its dependencies.
Now you can run the example using
just examples-one_tank
This command will take care of installing any required dependencies in a separate environment. After a short moment you should see the learning results from both methods and the achieved metric values.
From source
Follow the installation guide to install flowcean and it's dependencies from source.
Afterwards you can navigate to the examples
folder and run the examples.
cd examples/one_tank
python run.py