dataframe
DataFrame(data, *, data_hash=None)
Bases: OfflineEnvironment
A dataset environment.
This environment represents static tabular datasets.
Attributes:
Name | Type | Description |
---|---|---|
data | LazyFrame | The data to represent. |
Initialize the dataset environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | DataFrame \| LazyFrame | The data to represent. | required |
data_hash | bytes \| None | The hash of the data. If None, it is computed from the dataframe, which can be slow and expensive. | None |
Source code in src/flowcean/polars/environments/dataframe.py
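The `data_hash` parameter lets a caller skip hashing the full dataset. As a rough illustration of that trade-off, the standard-library sketch below hashes rows only when no precomputed hash is supplied. It is not flowcean's implementation: the helper `resolve_hash`, the tuple-based row representation, and the choice of SHA-256 are all assumptions made for this example.

```python
from __future__ import annotations

import hashlib
from collections.abc import Iterable, Sequence


def resolve_hash(
    rows: Iterable[Sequence[object]],
    data_hash: bytes | None = None,
) -> bytes:
    """Return the supplied hash, or compute one by scanning every row."""
    if data_hash is not None:
        # Fast path: trust the caller-provided hash and skip the scan.
        return data_hash
    hasher = hashlib.sha256()
    for row in rows:
        # repr() is a stand-in serialization used only in this sketch.
        hasher.update(repr(tuple(row)).encode("utf-8"))
    return hasher.digest()


rows = [(1, "a"), (2, "b")]
computed = resolve_hash(rows)  # scans all rows
reused = resolve_hash(rows, data_hash=computed)  # no scan needed
```

Passing a known hash is worthwhile when the same data is wrapped repeatedly, since the scan cost grows with the size of the dataset.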
to_incremental(batch_size=1)
Convert the DataFrame to an incremental environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
batch_size | int | The size of each batch. Defaults to 1. | 1 |
Source code in src/flowcean/polars/environments/dataframe.py
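To illustrate what `batch_size` means for an incremental view of a static dataset, here is a minimal standard-library sketch that yields consecutive batches of rows. The helper `iter_batches` is hypothetical and only mirrors the idea. Whether the real incremental environment emits a shorter final batch, as this sketch does, is an assumption.

```python
from __future__ import annotations

from collections.abc import Iterator, Sequence


def iter_batches(
    rows: Sequence[object],
    batch_size: int = 1,
) -> Iterator[list[object]]:
    """Yield consecutive slices of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        # The last slice may be shorter than batch_size (sketch assumption).
        yield list(rows[start : start + batch_size])


batches = list(iter_batches([10, 20, 30, 40, 50], batch_size=2))
```

With the default `batch_size=1`, each step exposes exactly one sample.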
from_csv(path, separator=',')
classmethod
Load a dataset from a CSV file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str \| Path | Path to the CSV file. | required |
separator | str | Value separator. Defaults to ",". | ',' |
Source code in src/flowcean/polars/environments/dataframe.py
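The `separator` parameter matters for files that do not use commas. The sketch below shows the effect of the separator on parsing. It uses Python's built-in `csv` module rather than flowcean, and `parse_rows` is a hypothetical helper introduced only for this illustration.

```python
from __future__ import annotations

import csv
import io


def parse_rows(text: str, separator: str = ",") -> list[dict[str, str]]:
    """Parse delimited text into one dictionary per data row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=separator)
    return list(reader)


semicolon_text = "x;y\n1;2\n3;4\n"

# Correct separator: two columns, "x" and "y".
rows = parse_rows(semicolon_text, separator=";")

# Default separator: each line is read as a single column named "x;y".
wrong = parse_rows(semicolon_text)
```

A mismatched separator does not raise an error here. It silently produces a single wide column, so it is worth checking the column names after loading.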
from_json(path)
classmethod
Load a dataset from a JSON file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str \| Path | Path to the JSON file. | required |
Source code in src/flowcean/polars/environments/dataframe.py
from_parquet(path)
classmethod
Load a dataset from a Parquet file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str \| Path | Path to the Parquet file. | required |
Source code in src/flowcean/polars/environments/dataframe.py
from_yaml(path)
classmethod
Load a dataset from a YAML file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str \| Path | Path to the YAML file. | required |
Source code in src/flowcean/polars/environments/dataframe.py
from_uri(uri)
classmethod
Load a dataset from a URI.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
uri | str | The URI to load the dataset from. | required |
Source code in src/flowcean/polars/environments/dataframe.py
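One plausible way to relate `from_uri` to the format-specific loaders and to the two exception types documented on this page is a scheme check followed by a suffix lookup. The sketch below is an assumption-laden illustration, not flowcean's code: it assumes only the `file` scheme is accepted, assumes the suffix-to-loader mapping shown, defines local stand-ins for `InvalidUriSchemeError` and `UnsupportedFileTypeError`, and uses a hypothetical helper `pick_loader`.

```python
from __future__ import annotations

from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


class InvalidUriSchemeError(Exception):
    """Local stand-in: raised when the URI scheme is not accepted."""

    def __init__(self, scheme: str) -> None:
        super().__init__(f"invalid URI scheme: {scheme!r}")


class UnsupportedFileTypeError(Exception):
    """Local stand-in: raised when no loader matches the file suffix."""

    def __init__(self, suffix: str) -> None:
        super().__init__(f"unsupported file type: {suffix!r}")


# Assumed mapping from file suffix to the loader names in this reference.
LOADERS = {
    ".csv": "from_csv",
    ".json": "from_json",
    ".parquet": "from_parquet",
    ".yaml": "from_yaml",
}


def pick_loader(uri: str) -> tuple[str, str]:
    """Return the loader name and local path for a file URI."""
    parsed = urlparse(uri)
    if parsed.scheme != "file":  # assumption: only file URIs are valid
        raise InvalidUriSchemeError(parsed.scheme)
    path = PurePosixPath(unquote(parsed.path))
    try:
        return LOADERS[path.suffix.lower()], str(path)
    except KeyError:
        raise UnsupportedFileTypeError(path.suffix) from None


loader, path = pick_loader("file:///data/train.csv")
```

Separating the scheme check from the suffix lookup keeps the two failure modes distinct, which matches the presence of two separate exception classes in this module.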
__len__()
Return the number of samples in the dataset.
Source code in src/flowcean/polars/environments/dataframe.py
InvalidUriSchemeError(scheme)
Bases: Exception
Exception raised when a URI scheme is invalid.
Initialize the InvalidUriSchemeError.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
scheme | str | Invalid URI scheme. | required |
Source code in src/flowcean/polars/environments/dataframe.py
UnsupportedFileTypeError(suffix)
Bases: Exception
Exception raised when a file type is not supported.
Initialize the UnsupportedFileTypeError.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
suffix | str | File type suffix. | required |
Source code in src/flowcean/polars/environments/dataframe.py
collect(environment, n=None, *, progress_bar=True)
Collect data from an environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
environment | Iterable[LazyFrame] \| Collection[LazyFrame] | The environment to collect data from. | required |
n | int \| None | Number of samples to collect. If None, all samples are collected. | None |
progress_bar | bool \| dict[str, Any] | Whether to show a progress bar. If a dictionary is provided, it will be passed to the progress bar. | True |
Returns:
Type | Description |
---|---|
DataFrame | The collected dataset. |
Source code in src/flowcean/polars/environments/dataframe.py
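The role of `n` can be sketched with the standard library alone: take at most `n` items from an iterable, or all of them when `n` is None. The helper `collect_items` below is hypothetical. It returns a plain list instead of a dataframe and omits the progress bar, so it illustrates only the sampling behavior described above.

```python
from __future__ import annotations

from collections.abc import Iterable
from itertools import islice
from typing import TypeVar

T = TypeVar("T")


def collect_items(environment: Iterable[T], n: int | None = None) -> list[T]:
    """Gather the first n items, or every item when n is None."""
    # islice treats a stop value of None as "consume the whole iterable".
    return list(islice(environment, n))


first_three = collect_items(iter(range(10)), n=3)
everything = collect_items(range(4))
```

Because the input only needs to be iterable, the same pattern works for sources whose total length is not known in advance, provided `n` is set for unbounded ones.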