## ⚙️ Service Orchestrator
STAMM uses Apache Airflow 3.0.6 (Python 3.12.11), running in its official Docker image, as the operational engine that coordinates data acquisition, model execution, and prediction storage in the soft-sensor workflow. InfluxDB serves as the time-series database for raw measurements, metadata, and model predictions, while Airflow ensures these heterogeneous data flows run in a temporally consistent and fully automated way. A dedicated DAG (STAMM_DAG) orchestrates this process in real time, consisting of four main steps:
1. **Health Check:** Confirms connectivity with InfluxDB and ensures the main buckets (stamm_raw, stamm_predictions, and stamm_metadata) are available.
2. **Data Detection:** Monitors stamm_raw for new measurements and creates wide-format "snapshots" that capture the latest process state, ready for ML models.
3. **Model Inference:** Airflow interacts with the Model Registry via REST endpoints, avoiding the need to install models directly on the platform.
4. **Prediction Storage:** All predictions are stored in the stamm_predictions bucket, preserving the original snapshot timestamp and linking soft-sensor outputs with their corresponding experimental observations. Each entry includes full metadata (model ID, version, source, and predicted property), ensuring compatibility with cross-bucket queries and dashboards.
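The four steps above can be sketched as plain Python functions wired together in DAG order. This is a framework-agnostic illustration, not the actual STAMM_DAG code: the snapshot shape, the `predict` callable standing in for the Model Registry's REST call, and the prediction record fields are assumptions made for the example; only the bucket names come from the text.

```python
# Illustrative sketch of the four STAMM_DAG steps (not the real implementation).
# Bucket names are from the docs; everything else is a simplifying assumption.

REQUIRED_BUCKETS = {"stamm_raw", "stamm_predictions", "stamm_metadata"}

def health_check(available_buckets):
    """Step 1: confirm InfluxDB reachability and that all main buckets exist."""
    missing = REQUIRED_BUCKETS - set(available_buckets)
    if missing:
        raise RuntimeError(f"Missing buckets: {sorted(missing)}")

def detect_snapshot(raw_rows):
    """Step 2: pivot the newest raw measurements into one wide-format snapshot."""
    snapshot = {}
    for row in sorted(raw_rows, key=lambda r: r["time"]):
        snapshot[row["field"]] = row["value"]  # latest value per field wins
        snapshot["time"] = row["time"]
    return snapshot

def run_inference(snapshot, predict):
    """Step 3: hand the snapshot's features to the registry-hosted model.

    `predict` stands in for a REST call to the Model Registry,
    e.g. requests.post(registry_url, json=features).json().
    """
    features = {k: v for k, v in snapshot.items() if k != "time"}
    return predict(features)

def store_prediction(snapshot, result, predictions_bucket):
    """Step 4: write the prediction, keeping the snapshot timestamp and metadata."""
    predictions_bucket.append({
        "time": snapshot["time"],           # preserve the original timestamp
        "value": result["value"],
        "model_id": result["model_id"],     # metadata for cross-bucket queries
        "model_version": result["version"],
        "property": result["property"],
    })

def run_pipeline(available_buckets, raw_rows, predict, predictions_bucket):
    """Execute the steps in the same order as the DAG."""
    health_check(available_buckets)
    snapshot = detect_snapshot(raw_rows)
    result = run_inference(snapshot, predict)
    store_prediction(snapshot, result, predictions_bucket)
    return predictions_bucket[-1]
```

In the real deployment each function would be an Airflow task, with InfluxDB reads/writes and the Model Registry REST call replacing the in-memory stand-ins.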
This integration of database, model registry, and workflow orchestration gives STAMM a reproducible, event-driven, and extensible execution layer. Soft-sensor models stay synchronized with data availability, version-controlled through the Model Registry, and seamlessly integrated into the unified InfluxDB schema.