Machine Learning · Case Study

Real-Time Material Flow Estimation

A custom neural network running on embedded hardware, counting discrete units with sub-1% count error at the optimal flow rate.

< 1% count error at optimal flow rate

3.2 units/s theoretical RMSE after temporal smoothing

100% edge inference, no cloud dependency

When a machine dispenses thousands of small items per minute, knowing exactly how many have passed through is critical for dosing accuracy, quality control, and process traceability. At realistic operating speeds, however, items overlap and cluster, so a simple direct count becomes unreliable.

We built a system that continuously analyses the machine's sensor signals and estimates flow rate in real time, accumulating an accurate total count without ever interrupting the process. It runs directly on the machine's embedded hardware, with no cloud connection and no network round-trip latency, and achieves count errors below 5% across the primary operating range.

The Challenge

Due to a non-disclosure agreement, we are unable to share further details about the specific application or customer.

The hardware includes multiple analog sensors along a dispensing channel. As discrete units pass by, each sensor produces a continuous signal - but at higher flow rates, units overlap and interfere with one another, making direct pulse-counting strategies unreliable.

The core problem: infer instantaneous flow rate (units per second) from multi-channel signals, then integrate that estimate over time into a total count. The system must run in real time on an embedded device with no cloud connectivity.

Why Standard Regression Fails

Data exploration revealed a critical non-linearity: at low flow rates, units pass individually (discrete regime) and signals scale predictably. At higher flow rates, units overlap and saturate the sensors (chaotic regime). Because of this transition between physical regimes, a standard linear regression model cannot fit the whole operating range.

Dimensionality reduction confirmed the non-linear structure: PCA captured only ~30% of variance, while t-SNE revealed two clearly separated clusters corresponding to the two physical regimes. A non-linear model was needed.

Multi-channel sensor signal waveform showing discrete and chaotic flow regimes
Raw output from the analog sensors, as seen by the system. Superposition effects are clearly visible: multiple units in transit overlap in the signal, making naive peak-based detection unreliable for accurate counting.

Our Approach

Data Pipeline

Raw multi-channel signals were acquired at ~20 kHz. After downsampling and efficient format conversion, we achieved a 93% reduction in data volume (40 GB → 2.9 GB) with no loss of task-relevant information. A sliding window strategy (250 ms windows, 50 ms hop, 80% overlap) generates a smooth, real-time flow rate estimate suitable for edge deployment.
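The windowing scheme can be sketched as below. This is a minimal illustration, not the production code: the sample rate after downsampling is not stated in this case study, so the 2 kHz value is purely an assumption, and the function name `sliding_windows` is hypothetical.

```python
from typing import Iterator, Tuple


def sliding_windows(
    n_samples: int,
    sample_rate_hz: float,
    window_s: float = 0.250,
    hop_s: float = 0.050,
) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) sample indices of every full window."""
    window = round(window_s * sample_rate_hz)
    hop = round(hop_s * sample_rate_hz)
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must each span at least one sample")
    start = 0
    while start + window <= n_samples:
        yield start, start + window
        start += hop


# Assumed 2 kHz rate after downsampling (hypothetical, not given in the text).
bounds = list(sliding_windows(n_samples=2000, sample_rate_hz=2000.0))

# Fraction of samples shared by two consecutive windows: 1 - hop / window.
overlap = 1.0 - 0.050 / 0.250
```

With a 50 ms hop, a new estimate becomes available 20 times per second, while each estimate still sees 250 ms of signal.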

Feature Engineering

We extract features from three complementary domains:

Time-Domain

  • Normalized pulse count and duty cycle
  • Pulse duration and inter-pulse interval statistics
  • Signal energy: RMS and total integrated area
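A minimal sketch of a few of these time-domain features for one window of one channel. The detection threshold, the function name, and the choice to normalise the pulse count by window duration are all assumptions made for illustration; the case study does not specify them.

```python
import math
from typing import Dict, Sequence


def time_domain_features(
    window: Sequence[float], sample_rate_hz: float, threshold: float
) -> Dict[str, float]:
    """Compute a few time-domain features for one single-channel window."""
    n = len(window)
    if n == 0:
        raise ValueError("window must not be empty")
    dt = 1.0 / sample_rate_hz
    above = [x > threshold for x in window]
    # A pulse starts wherever the signal rises above the threshold.
    pulses = sum(
        1 for i, a in enumerate(above) if a and (i == 0 or not above[i - 1])
    )
    return {
        # Pulse count divided by window duration (assumed normalisation).
        "pulse_rate": pulses / (n * dt),
        # Fraction of the window spent above the threshold.
        "duty_cycle": sum(above) / n,
        "rms": math.sqrt(sum(x * x for x in window) / n),
        # Rectangle-rule integral of the absolute signal.
        "integrated_area": sum(abs(x) for x in window) * dt,
    }


signal = [0, 0, 1, 1, 0, 0, 1, 0, 0, 0]
features = time_domain_features(signal, sample_rate_hz=10.0, threshold=0.5)
```

Note that the pulse count stops growing once pulses merge, whereas RMS and integrated area keep responding to the amount of material in view.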

Frequency-Domain

  • Spectral centroid and bandwidth
  • Power in defined frequency bands
  • Mel-frequency cepstral coefficients (MFCCs)
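The first two frequency-domain features can be illustrated with a naive discrete Fourier transform. This is a sketch for clarity only: an embedded implementation would normally use an FFT, MFCCs are left out, and the function names and band edges are hypothetical.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def power_spectrum(
    window: Sequence[float], sample_rate_hz: float
) -> Tuple[List[float], List[float]]:
    """Naive O(n^2) DFT; returns (frequencies, power) for bins 0..n//2."""
    n = len(window)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(window)
        )
        freqs.append(k * sample_rate_hz / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def spectral_centroid_bandwidth(
    freqs: Sequence[float], power: Sequence[float]
) -> Tuple[float, float]:
    """Power-weighted mean frequency and its power-weighted spread."""
    total = sum(power)
    if total == 0.0:
        return 0.0, 0.0
    centroid = sum(f * p for f, p in zip(freqs, power)) / total
    variance = sum((f - centroid) ** 2 * p for f, p in zip(freqs, power)) / total
    return centroid, math.sqrt(variance)


def band_power(
    freqs: Sequence[float], power: Sequence[float], low_hz: float, high_hz: float
) -> float:
    """Total power in the half-open band [low_hz, high_hz)."""
    return sum(p for f, p in zip(freqs, power) if low_hz <= f < high_hz)


# Synthetic check: a pure 50 Hz tone sampled at 400 Hz (illustrative values).
fs = 400.0
tone = [math.sin(2 * math.pi * 50.0 * i / fs) for i in range(64)]
freqs, power = power_spectrum(tone, fs)
centroid, bandwidth = spectral_centroid_bandwidth(freqs, power)
```

For the pure tone, the centroid lands on 50 Hz and the bandwidth is essentially zero.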

Cross-Channel

  • Maximum cross-correlation between channel pairs
  • Lag at peak correlation and spectral coherence
  • Simultaneity count across channels
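The first two cross-channel features, peak cross-correlation and the lag at which it occurs, can be sketched as follows. The normalisation, the lag search range, and the function name are assumptions; spectral coherence and the simultaneity count are not covered here.

```python
import math
from typing import Sequence, Tuple


def max_cross_correlation(
    a: Sequence[float], b: Sequence[float], max_lag: int
) -> Tuple[float, int]:
    """Return (peak normalised cross-correlation, lag in samples).

    A positive lag means that b looks like a delayed copy of a.
    """
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("channels must be non-empty and of equal length")
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    norm = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if norm == 0.0:
        return 0.0, 0  # at least one channel is constant
    best_corr, best_lag = -math.inf, 0
    for lag in range(-max_lag, max_lag + 1):
        # Pair a[i] with b[i + lag] wherever both indices are valid.
        total = sum(
            da[i] * db[i + lag] for i in range(max(0, -lag), min(n, n - lag))
        )
        corr = total / norm
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return best_corr, best_lag


# Two synthetic channels: the same pulse, seen 3 samples later downstream.
upstream = [0.0] * 20
downstream = [0.0] * 20
upstream[5] = upstream[6] = 1.0
downstream[8] = downstream[9] = 1.0
corr, lag = max_cross_correlation(upstream, downstream, max_lag=5)
```

For sensors placed along a channel, the lag at peak correlation relates to the transit time between them, which is one reason such features can carry flow information.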

Permutation importance analysis confirmed that energy-based features - Total Integrated Area and RMS - are the most predictive of flow rate across both operating regimes.
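Permutation importance measures how much a model's error grows when one feature column is shuffled. The sketch below is a generic, model-agnostic version in plain Python; the case study does not say which implementation was used, and the toy model and data are invented for illustration.

```python
import random
from typing import Callable, List, Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(
    predict: Callable[[List[float]], float],
    X: List[List[float]],
    y: Sequence[float],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Mean increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(y, [predict(r) for r in X_perm]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy example: the "model" uses feature 0 only, so feature 1 should score 0.
X = [[float(i), float((i * 7) % 5)] for i in range(20)]
y = [2.0 * row[0] for row in X]
scores = permutation_importance(lambda row: 2.0 * row[0], X, y)
```

A feature the model relies on produces a large error increase when shuffled; a feature the model ignores produces none.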

Model prediction compared against the physical limits of the sensor. When the curve stays within the dotted bands, the algorithm is operating optimally - tracking the true flow rate as closely as the physics of the measurement allows.

Model Architecture

Given the multi-regime physics, a fully connected artificial neural network (ANN) was chosen over classical regression. Architecture and hyperparameters were optimised using KerasTuner with the Hyperband strategy, an efficient search method that explores a wide design space while stopping under-performing configurations early. Training of the final model converged smoothly, with no sign of overfitting.
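Hyperband builds on successive halving: many configurations are trained on a small budget, only the best fraction survives, and the survivors get a larger budget. The sketch below shows a single successive-halving bracket in plain Python. It is not the KerasTuner API, and the loss function and candidate learning rates are invented stand-ins.

```python
import math
from typing import Callable, Dict, List

Config = Dict[str, float]


def successive_halving(
    configs: List[Config],
    evaluate: Callable[[Config, int], float],
    min_budget: int = 1,
    eta: int = 3,
) -> Config:
    """Keep the best 1/eta of candidates per round, multiplying the budget by eta."""
    if not configs:
        raise ValueError("need at least one configuration")
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Lower loss is better; weak candidates are dropped early.
        scored = sorted(survivors, key=lambda c: evaluate(c, budget))
        survivors = scored[: max(1, len(scored) // eta)]
        budget *= eta
    return survivors[0]


def toy_loss(config: Config, budget: int) -> float:
    """Hypothetical stand-in for validation loss after `budget` epochs."""
    return abs(math.log10(config["learning_rate"]) + 3.0) + 1.0 / budget


candidates = [{"learning_rate": 10.0 ** -k} for k in range(1, 7)]
best = successive_halving(candidates, toy_loss)
```

Most of the compute goes to the few configurations that still look promising after the cheap early rounds, which is what makes this family of methods efficient.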

From Rate to Count

Instantaneous flow rate estimates are integrated using a Riemann sum, with careful handling of the hop size to avoid double-counting from overlapping windows. A 2-second moving-average filter reduces the theoretical RMSE from ~20 units/s (raw ANN output) to ~3.2 units/s, roughly a 6× improvement.
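The rate-to-count step can be sketched as below, assuming one rate estimate per 50 ms hop. The function names are hypothetical and the causal form of the moving average is an assumption. The key point is that each estimate is multiplied by the hop (50 ms), not by the window length (250 ms), which would count every unit about five times.

```python
import math
from typing import List, Sequence


def moving_average(values: Sequence[float], n_taps: int) -> List[float]:
    """Causal moving average; early outputs average the samples seen so far."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= n_taps:
            running -= values[i - n_taps]  # drop the sample leaving the window
        out.append(running / min(i + 1, n_taps))
    return out


def integrate_count(rates: Sequence[float], hop_s: float) -> float:
    """Riemann sum: each estimate accounts for one hop of elapsed time."""
    return sum(r * hop_s for r in rates)


HOP_S = 0.050
N_TAPS = round(2.0 / HOP_S)  # a 2 s filter spans 40 estimates

# 10 s of a constant 200 units/s flow (synthetic input).
rates = [200.0] * 200
total = integrate_count(moving_average(rates, N_TAPS), HOP_S)

# If the 40 errors were independent, averaging would divide the RMSE by sqrt(40).
theoretical_rmse = 20.0 / math.sqrt(N_TAPS)
```

Under that independence assumption, 20 / sqrt(40) is about 3.2 units/s, which matches the quoted figure. Consecutive windows share 80% of their samples, so real errors are correlated, which is presumably why the figure is described as theoretical.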

Results

The model was validated across the full operating range of the machine:

Flow Rate                    Count Error
Low (~80 units/s)            +10.2%
Medium (~200 units/s)        +4.6%
High (~300 units/s)          −0.8%
Very high (~400 units/s)     −12.1%

Predicted flow rate versus the actual ground-truth flow rate across the operating range. The close agreement between the two curves confirms the model's accuracy in real working conditions.

The model performs optimally in the medium-to-high range (~200–350 units/s), which is the primary operating range of the machine. At very high flow rates, systematic underestimation occurs due to signal saturation - a fundamental physical constraint.
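For reference, a signed count error of this kind can be computed as below. The sign convention (negative means undercounting) is inferred from the description of underestimation at very high flow rates, and the example counts are made up for illustration.

```python
def count_error_percent(predicted_count: float, true_count: float) -> float:
    """Signed relative count error in percent; negative means undercounting."""
    if true_count <= 0:
        raise ValueError("true_count must be positive")
    return 100.0 * (predicted_count - true_count) / true_count


# Hypothetical run: 992 units counted where 1000 actually passed.
error = count_error_percent(992, 1000)
```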

At the optimal operating point, the system reaches errors below 1% - compared to approximately 5% for best-in-class alternatives.

The instantaneous RMSE of ~20 units/s is close to the Poisson noise floor of ~28 units/s - the irreducible variance inherent in any random discrete flow process. This confirms the model is extracting nearly all available information from the signal.
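The case study does not state how the noise floor was derived. One common way to obtain a figure of this size, shown here as an assumption, is to treat unit arrivals as a Poisson process and ask how precisely a rate can be estimated from a single 250 ms window.

```python
import math


def poisson_rate_std(rate_per_s: float, window_s: float) -> float:
    """Standard deviation of a rate estimated by counting arrivals in one window."""
    # The count N in the window is Poisson with mean and variance rate * window.
    # The rate estimate N / window therefore has variance rate / window.
    return math.sqrt(rate_per_s / window_s)


# Assumed operating point: 200 units/s observed through a 250 ms window.
floor = poisson_rate_std(200.0, 0.250)
```

Under these assumptions the floor comes out at about 28 units/s. It scales with the square root of the flow rate, so the value depends on the operating point chosen.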

Key Takeaways

Edge Deployment

The trained model runs entirely on embedded hardware in real time, with no cloud dependency. This enables use in field conditions and removes any latency from remote inference.

Physics-Informed Design

Understanding the physical phase transition between flow regimes directly informed our model choice and feature engineering strategy - avoiding the pitfalls of fitting a linear model to a fundamentally non-linear system.

Near-Optimal Accuracy

The system operates close to the fundamental Poisson noise floor of the physical process, confirming that the feature set and architecture are well-matched to the problem.

Generalizable Methodology

The combination of multi-channel feature extraction, ANN regression, and temporal integration carries over to other discrete-unit flow estimation problems, such as granules, pellets, components, or tablets, across industrial and agricultural contexts.

Got a similar challenge?

We specialise in edge machine learning for demanding industrial applications. Get in touch to discuss your project.
