🛢️ Time-series database


STAMM’s dedicated time-series data management system, implemented with InfluxDB 2.7, organizes and stores process information, sensor readings, actuator data, and soft sensor predictions. All data are structured in dedicated buckets that separate raw process signals, model outputs, and metadata while keeping a consistent schema for integration and analysis.

🧱 Data Organization

Raw Data (stamm_raw): collects real-time measurements from sensors, actuators, and control systems. These are direct observations of the physical process, such as temperature, pH, agitation speed, or flow rate. Each entry includes contextual tags such as device ID, project name, and batch ID, so that every value can be traced across experiments and production runs.
Model Predictions (stamm_predictions): stores outputs from machine-learning-based soft sensors, that is, estimated variables and inferred values that are not directly measurable. Each prediction is linked to its model identifier and version, which allows model performance to be tracked and compared over time. The structure mirrors that of the raw data, making it easy to align predictions with real measurements.
Metadata (stamm_metadata): contains descriptive information for all observed properties, such as engineering units, display names, and preferred precision. It acts as a central registry so that every variable is represented consistently across dashboards and analytical tools.

🔄 Unified Data Model

All three buckets share a common measurement name (bioreactor_obs) and consistent tagging. This unified structure lets queries compare real data, model predictions, and metadata directly, supporting cross-bucket analytics for monitoring, diagnostics, and soft-sensor validation.
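
Since only the bucket name changes between data sources, one query template can serve both raw data and predictions. The sketch below builds such a Flux query as a string. It is an illustration under assumptions: the batch_id tag key and the build_flux_query helper are hypothetical, and values are interpolated without escaping, so it should only be used with trusted input.

```python
MEASUREMENT = "bioreactor_obs"  # measurement name shared by all buckets


def build_flux_query(bucket: str, batch_id: str, start: str = "-1h") -> str:
    """Return a Flux query selecting one batch from the given bucket.

    The `batch_id` tag key is an assumption made for this example.
    """
    return (
        f'from(bucket: "{bucket}")\n'
        f"  |> range(start: {start})\n"
        f'  |> filter(fn: (r) => r._measurement == "{MEASUREMENT}")\n'
        f'  |> filter(fn: (r) => r.batch_id == "{batch_id}")'
    )


# The same template targets either bucket; only the bucket name differs.
raw_query = build_flux_query("stamm_raw", "B001")
pred_query = build_flux_query("stamm_predictions", "B001")
print(raw_query)
```

The two resulting queries return series with the same measurement and tags, which is what makes it straightforward to line up predictions against measurements for validation.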

Explore the source code and contribute on GitLab: