Service Orchestrator
⚙️ Workflow Orchestrator
The operational engine that coordinates data acquisition, model execution, and prediction storage in the soft-sensor workflow — reproducibly, event-driven, end-to-end.
What it does
STAMM uses Apache Airflow 3.0.6 (Python 3.12.11), run from its official Docker image, as the operational engine of the soft-sensor workflow. The time-series database holds raw measurements, metadata, and model predictions; Airflow keeps these heterogeneous data flows temporally consistent and fully automated. A dedicated DAG (STAMM_DAG) drives the process in real time, from the health check through to prediction storage.
The four DAG steps
🩺 Health Check
Confirms connectivity with the time-series database and ensures the main buckets — stamm_raw, stamm_predictions, stamm_metadata — are available before anything else runs.
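As an illustration only, the bucket check can be reduced to a small pure function. This sketch assumes the list of available bucket names has already been fetched from the database client; the function names `missing_buckets` and `health_check` are hypothetical and not taken from STAMM.

```python
# Bucket names come from the text above; everything else is illustrative.
REQUIRED_BUCKETS = ("stamm_raw", "stamm_predictions", "stamm_metadata")


def missing_buckets(available, required=REQUIRED_BUCKETS):
    """Return the required bucket names that are absent, in required order."""
    present = set(available)
    return [name for name in required if name not in present]


def health_check(available):
    """Raise if any required bucket is missing so later steps never start."""
    missing = missing_buckets(available)
    if missing:
        raise RuntimeError(f"Missing buckets: {', '.join(missing)}")
    return True
```

Raising an exception is one plausible way to stop the run early: in an orchestrated workflow, a failed first task typically prevents the downstream tasks from executing.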
📡 Data Detection
Monitors stamm_raw for new measurements and assembles wide-format snapshots that capture the latest process state — already shaped for ML model input.
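A wide-format snapshot can be pictured as a pivot from long rows (one row per measurement) to a single record with one column per variable. The sketch below assumes each row is a dict with the hypothetical keys `time`, `variable`, and `value`; the actual STAMM schema and variable names may differ.

```python
from datetime import datetime, timezone


def build_snapshot(rows):
    """Pivot long-format rows into one wide record with the latest value per variable."""
    latest = {}
    for row in rows:
        seen = latest.get(row["variable"])
        # Keep only the most recent measurement of each variable.
        if seen is None or row["time"] > seen["time"]:
            latest[row["variable"]] = row
    if not latest:
        return None  # nothing new to score
    return {
        # Stamp the snapshot with the newest measurement time it contains.
        "time": max(row["time"] for row in latest.values()),
        "values": {name: latest[name]["value"] for name in sorted(latest)},
    }


# Example rows; "temperature" and "pressure" are made-up variable names.
t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
t1 = datetime(2024, 1, 1, 12, 1, tzinfo=timezone.utc)
example_rows = [
    {"time": t0, "variable": "temperature", "value": 36.9},
    {"time": t1, "variable": "temperature", "value": 37.1},
    {"time": t0, "variable": "pressure", "value": 1.02},
]
example_snapshot = build_snapshot(example_rows)
```

Here `example_snapshot["values"]` holds one entry per variable, which is the shape a tabular ML model typically expects as a single input row.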
🗂️ Model Inference
Airflow calls the Model Registry via REST endpoints to score the snapshot — no model code is installed on the platform. Inference happens where the model lives.
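The remote-scoring pattern might look like the following sketch, written with the Python standard library only. The route `/models/<model_id>/predict` and the `{"inputs": ...}` payload shape are assumptions made for illustration; the Model Registry's real endpoints and request format are not described here.

```python
import json
import urllib.request


def build_inference_request(base_url, model_id, snapshot_values):
    """Prepare a POST request asking the registry to score one snapshot."""
    # Hypothetical route; substitute the registry's actual endpoint.
    url = f"{base_url.rstrip('/')}/models/{model_id}/predict"
    body = json.dumps({"inputs": snapshot_values}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score_snapshot(request, timeout=10.0):
    """Send the request and decode the JSON response from the registry."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Because only the snapshot values cross the wire, the orchestrator needs no model dependencies of its own, which matches the "no model code on the platform" design described above.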
💾 Prediction Storage
Predictions land in stamm_predictions, preserving the original snapshot timestamp and linking each output to the experimental observation that produced it. Every entry carries full metadata — model ID, version, source, and predicted property — so cross-bucket queries and dashboards stay coherent.
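One way to picture a stored prediction is as a small immutable record whose timestamp is copied from the snapshot rather than taken from the wall clock. The class and field names below are hypothetical; they mirror the metadata listed above (model ID, version, source, predicted property) but are not STAMM's actual schema.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Prediction:
    time: datetime            # snapshot timestamp, not the time of writing
    model_id: str
    model_version: str
    source: str
    predicted_property: str
    value: float


def make_prediction(snapshot_time, model_id, model_version, source,
                    predicted_property, value):
    """Build a prediction record that stays aligned with its source snapshot."""
    return Prediction(
        time=snapshot_time,
        model_id=model_id,
        model_version=model_version,
        source=source,
        predicted_property=predicted_property,
        value=float(value),
    )
```

Calling `asdict()` on the result yields a plain dict that a database client could write to stamm_predictions. Reusing the snapshot timestamp is what lets a query join a prediction to the raw measurements recorded at the same instant.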
Why it matters
Integrating the database, the Model Registry, and workflow orchestration gives STAMM a reproducible, event-driven, and extensible execution layer. Soft-sensor models stay synchronized with data availability, remain version-controlled through the Model Registry, and write their outputs into the same unified time-series schema as the raw data.