STAMM modules

Modules · Architecture overview

A modular architecture, built to extend

STAMM is built on a modular and extensible architecture designed to support scalable, reliable, real-time soft-sensor deployment in industrial environments. Each module is specialized for a single concern and communicates through clean APIs — so any piece can be updated, replaced, or integrated independently, without rewiring the whole system.

Independent modules

REST

Clean APIs between modules

Py · R

Languages supported

Open

Apache 2.0 licensed

System architecture

The diagram below illustrates how STAMM's modules connect — from the time-series database all the way to the operator's dashboard, with drift detectors monitoring the soft sensors in parallel.

STAMM architecture diagram — PostgreSQL time-series DB, Apache Airflow orchestrator, Model Registry, Drift Detectors, Plotly Dashboards

FIG · STAMM modular architecture and data flow

How the data flows

A single soft-sensor reading is the result of every module doing its job in order. Here's what happens between the physical process and the operator's screen.

01

🛢️Time-series DB holds reference & live data

Reference data (from historical batches) and live measurements from physical sensors and actuators land in STAMM's time-series database — InfluxDB today, with a PostgreSQL adapter being added so other backends can plug in under the same schema.

02

⚙️Workflow Orchestrator queries the DB

The Apache Airflow orchestrator continuously queries the time-series DB for fresh measurements and assembles snapshots of the current process state. Event-driven by design — soft sensors stay synchronized with whatever the process is doing right now.

03

🗂️Model Registry loads & serves the model

The Orchestrator calls the deployed soft sensor in the Model Registry over REST — no model code runs on the orchestration platform. The Registry loads the right model version and returns the prediction, fully traceable by model ID and version.

04

🛰️Drift Detectors evaluate the inputs

In parallel, the Drift Detectors compare the current input distribution against the reference window the model was trained on. They produce interpretable drift scores — univariate and multivariate — that tell whether the model is still working in familiar territory.

05

📊Dashboards surface everything to humans

The Plotly-based dashboards receive model predictions, reference and live data, and drift scores — and put them on one screen. Operators see trust signals, modelers see trends, and managers see operational risk in context.

The five modules in detail

Each module is owned by a different concern. Click through to the deep documentation for configuration, APIs, and examples.

🛢️

Time-series Database

The backbone for reference & live data.

A pluggable time-series store for raw measurements, model predictions, and variable metadata. InfluxDB is the production backend today; a PostgreSQL adapter is in progress so STAMM can plug into other databases under the same schema.

Reference & live dataHistorical batches alongside the current process state.
Pluggable backendsInfluxDB now, PostgreSQL adapter in progress.
Unified taggingDevice, project, and batch tags make traceability automatic.

InfluxDBPostgreSQLTime-series

Deep dive →

⚙️

Workflow Orchestrator

The conductor between modules.

Watches the time-series DB for fresh measurements, calls the Model Registry to score them, and writes predictions back — reproducibly, event-driven, end-to-end.

Event-driven DAGReacts to new data instead of polling on a fixed schedule.
Remote model callsTalks to the Model Registry over REST — nothing installed locally.
Timestamp-aligned outputsPredictions stay linked to the snapshot that produced them.

Apache AirflowPython

Deep dive →

🗂️

Model Registry

The home for every soft sensor.

Stores, versions, and serves every soft sensor in STAMM — together with its configuration, artifacts, validation results, and rich metadata describing architecture, training, and input/output variables.

Versioned lineageEvery retrain becomes a tracked version with full metadata.
Python & R supportSoft sensors in either ecosystem are first-class citizens.
REST inferenceModels served over HTTP — no model code on other modules.

PythonRRESTYAML

Deep dive →

🛰️

Drift Detectors

Tell when the model is guessing.

A standalone Python package — drift_detectors_pack — that compares current input distributions against the reference window the soft sensor was trained on. Produces interpretable drift scores so the dashboard, and the team, can spot when conditions move away from what the model knows.

Univariate & multivariateDetects shifts in single variables and in joint distributions.
Interpretable scoresDrift signals come with metadata, not just numbers.
Pluggable into pipelinesInstallable as a Python package — usable inside or outside STAMM.

Python10 detectorsOpen source

Deep dive →

📊

Dashboard

The operational window.

An interactive Plotly-based web app that brings together physical measurements, soft-sensor outputs, drift scores, and historical context — built for operators, modelers, and project leaders alike.

Live data + predictionsMeasured signals and soft-sensor outputs side by side.
Drift monitoringDrift scores from the detectors surfaced with context.
Simulation assessmentReplay batches and label outcomes for traceability.

PlotlyPython

Deep dive →

How the modules communicate

STAMM keeps the boundaries between modules deliberately thin and explicit. There are three communication patterns:

🔌

REST APIs

The Orchestrator calls the Model Registry over HTTP. Inference happens where the model lives, not on the orchestration platform.

🗃️

Shared database schema

Raw data, predictions, and metadata live in the same time-series DB with consistent tagging, so any module can read what it needs without coupling.

📜

Declarative metadata

Each model carries a YAML descriptor — architecture, inputs, outputs, training parameters — that the Registry, Orchestrator, Drift Detectors, and Dashboard all read from.

The result is a system where you can swap the database, replace the orchestrator, or rebuild the dashboard independently — provided each piece keeps its end of the contract.

🧩

Want to extend STAMM?Every module — including the Drift Detectors package — lives in its own repository on the STAMM GitHub organization. Apache 2.0 licensed, contributions welcome.

View on GitHub →

A modular architecture, built to extend

System architecture​

How the data flows​

🛢️Time-series DB holds reference & live data

⚙️Workflow Orchestrator queries the DB

🗂️Model Registry loads & serves the model

🛰️Drift Detectors evaluate the inputs

📊Dashboards surface everything to humans

The five modules in detail​

Time-series Database

Workflow Orchestrator

Model Registry

Drift Detectors

Dashboard

How the modules communicate​

System architecture

How the data flows

The five modules in detail

How the modules communicate