ManuFast

Portfolio Project · Built from scratch

ManuFast: Small Language Model for Predictive Maintenance AI

Paste a structured industrial sensor log and get fault detection, a remaining-useful-life estimate, quality anomaly flags, and a plain-English diagnostic, all in a single forward pass.

~30M params (lightweight)
3 machine types (turbofan · bearing · CNC)
4 simultaneous outputs (multi-task learning)

The Core Idea

Industrial machines (jet engines, factory bearings, CNC cutting tools) degrade gradually over time. Mechanics and operators already document this degradation in maintenance logs that describe sensor readings, anomalies, and operating conditions. ManuFast treats those logs as natural language and runs them through a transformer to extract actionable insights.

Unlike traditional statistical models that handle one task at a time, ManuFast performs four predictions simultaneously: whether a fault is present, how many operational cycles remain before failure, whether quality is within tolerance, and a plain-English explanation of what it found. One model, one pass, four outputs.

Sensor readings → Maintenance log → ManuFast model → Diagnosis
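
The page does not show the exact log layout the model expects. As a minimal sketch, assuming a hypothetical pipe-separated `key=value` layout and invented field names (`torque_nm`, `rpm`), the first step of the pipeline above could render sensor readings as text like this:

```python
def format_sensor_log(machine: str, cycle: int, readings: dict[str, float]) -> str:
    """Render one snapshot of sensor readings as a single structured log line.

    The layout (pipe-separated key=value pairs) is an assumption for
    illustration, not ManuFast's actual input format.
    """
    fields = [f"machine={machine}", f"cycle={cycle}"]
    # Sort keys so the same readings always produce the same text.
    fields += [f"{name}={value:.2f}" for name, value in sorted(readings.items())]
    return " | ".join(fields)


log_line = format_sensor_log("cnc", 112, {"torque_nm": 42.8, "rpm": 1408.0})
print(log_line)
```

A deterministic field order matters more than the specific separator: a tokenizer trained on such logs sees the same token sequence for the same readings.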

The Three Machines

ManuFast was trained on three public industrial datasets, each representing a different machine type with its own sensor signature and failure modes.

Turbofan Engine
CMAPSS Dataset

Jet engines used in commercial aircraft. 15 sensors monitor temperatures, pressures, and fan/core speeds across hundreds of flight cycles as the engine degrades.

Detects: HPC (compressor) degradation, fan degradation

Predicts: Remaining flight cycles until failure

Rolling Element Bearing
IMS Dataset (omitted) & CMAPSS Dataset

Bearings are the rings that allow rotating shafts to spin smoothly. When cracks develop on the inner or outer race, vibration patterns change in predictable ways.

Detects: Outer race, inner race, and ball element faults

Signals: RMS vibration, kurtosis, crest factor
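
For readers unfamiliar with these three signals, here is a minimal sketch of how they are commonly computed from one window of vibration samples. The function name and the windowing are assumptions for illustration; the page does not describe ManuFast's own feature extraction.

```python
import math


def vibration_features(signal: list[float]) -> dict[str, float]:
    """Compute RMS, kurtosis, and crest factor for one vibration window."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    # Kurtosis as the fourth central moment over the squared variance
    # (a Gaussian signal gives about 3; impulsive signals give more).
    m2 = sum((x - mean) ** 2 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    kurtosis = m4 / (m2 * m2)
    # Crest factor: peak amplitude relative to RMS.
    crest = max(abs(x) for x in signal) / rms
    return {"rms": rms, "kurtosis": kurtosis, "crest_factor": crest}


# A pure sine wave is a smooth baseline: crest factor sqrt(2), kurtosis 1.5.
sine = [math.sin(2 * math.pi * k / 64) for k in range(64)]
features = vibration_features(sine)
```

Sharp impacts from a cracked race add short, high peaks to the signal, which tends to raise kurtosis and crest factor well before RMS changes much.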

CNC Machine
AI4I Dataset

Computer-controlled cutting tools used in precision manufacturing. Tool wear, excessive heat, and mechanical overstrain cause part defects and machine downtime.

Detects: Tool wear, heat dissipation, power, and overstrain failures

Signals: Temperature, rotational speed, torque

How the Model Works

A hybrid encoder-decoder transformer trained end-to-end with a multi-task loss.

// Encoder path

Raw text log → BPE Tokenizer (7K vocab) → 8-layer Transformer Encoder → [CLS] vector

// Classification heads (simultaneous)

├── Fault Head (binary classifier, 30% loss weight)

├── RUL Head (regression, 30% loss weight)

└── Quality Head (binary classifier, 20% loss weight)

// Decoder path

enc_out → 4-layer Autoregressive Decoder → Plain-English diagnostic (20% loss weight)

Multi-task learning

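One forward pass produces four outputs. The three heads and the decoder all read the same encoder output, so gradients from every task shape one shared representation, which is intended to benefit all four tasks.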
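
The loss weights listed for the heads and the decoder (30% / 30% / 20% / 20%) can be combined as a weighted sum. The sketch below works on a single example in plain Python. The choice of binary cross-entropy for the two classifiers, squared error for RUL, and mean token negative log-likelihood for the summary is an assumption, since the page gives only the weights, and the function names are hypothetical.

```python
import math

# Loss weights as stated in the architecture summary.
LOSS_WEIGHTS = {"fault": 0.30, "rul": 0.30, "quality": 0.20, "summary": 0.20}


def binary_cross_entropy(prob: float, target: int) -> float:
    """BCE for one example; prob is the predicted probability of class 1."""
    eps = 1e-7
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))


def multitask_loss(fault_prob, fault_label, rul_pred, rul_true,
                   quality_prob, quality_label, token_nll):
    """Weighted sum of the four per-task losses for a single example."""
    losses = {
        "fault": binary_cross_entropy(fault_prob, fault_label),
        "rul": (rul_pred - rul_true) ** 2,  # squared error (assumed loss)
        "quality": binary_cross_entropy(quality_prob, quality_label),
        "summary": token_nll,  # mean negative log-likelihood of decoder tokens
    }
    return sum(LOSS_WEIGHTS[name] * value for name, value in losses.items())


# Toy values; RUL is assumed to be scaled to a small range before the loss.
total = multitask_loss(0.9, 1, 0.5, 0.4, 0.2, 0, 1.2)
```

In a PyTorch training loop the same sum would be taken over batched tensor losses, so a single backward pass updates the shared encoder with signal from all four tasks.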

Weight tying

The encoder and decoder share a single embedding matrix, which saves ~2.7M parameters with no accuracy cost.
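
The ~2.7M figure can be sanity-checked with simple arithmetic. The page does not state the embedding width, so the value `D_MODEL = 384` below is an assumption chosen because it reproduces the quoted saving with the stated 7,000-token vocabulary.

```python
VOCAB_SIZE = 7_000   # stated in the source
D_MODEL = 384        # assumption: not stated in the source

# One embedding matrix has one D_MODEL-wide row per vocabulary entry.
params_per_embedding = VOCAB_SIZE * D_MODEL

# Sharing one matrix between encoder and decoder removes one such matrix.
saved_by_tying = params_per_embedding
print(f"{saved_by_tying:,} parameters saved")
```

Under that assumption the saving is 2,688,000 parameters, roughly 9% of a ~30M-parameter model.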

Custom BPE tokenizer

7,000-token vocabulary trained specifically on industrial sensor logs. No pretrained tokenizer used.
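
As an illustration of what training a BPE tokenizer involves, here is a minimal sketch of the merge-learning loop on a toy corpus. It is not ManuFast's tokenizer: the whitespace pre-splitting, the tie-breaking rule, and the example log strings are all assumptions, and a real run would keep merging until the vocabulary reaches the target size (7,000 here).

```python
from collections import Counter


def train_bpe(corpus: list[str], num_merges: int) -> list[tuple[str, str]]:
    """Learn up to num_merges BPE merge rules from whitespace-split words."""
    # Represent each distinct word as a tuple of symbols, starting from characters.
    words = Counter(tuple(word) for line in corpus for word in line.split())
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties go to the lexicographically smallest pair.
        best = min(pairs, key=lambda p: (-pairs[p], p))
        merges.append(best)
        # Rewrite every word with the chosen pair fused into one symbol.
        merged_words = Counter()
        for symbols, freq in words.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            merged_words[tuple(out)] += freq
        words = merged_words
    return merges


merges = train_bpe(["temp=301 temp=305 rpm=1400", "temp=299 rpm=1380"], num_merges=3)
```

On sensor logs, frequent field names such as `temp=` quickly collapse into single tokens, which is the practical benefit of training the vocabulary on domain text.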

Training Results

Evaluated on held-out test sets from each dataset.

Fault F1: 0.91 (target: ≥ 0.80) · CMAPSS & IMS
Quality F1: 1.000 (target: ≥ 0.85) · AI4I dataset
RUL RMSE: 6.5 cycles (target: ≤ 20 cycles) · CMAPSS
Summary BLEU: 0.551 (target: ≥ 0.50) · diagnostic text
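
For reference, the F1 and RMSE metrics reported above follow standard definitions. The sketch below uses toy labels and predictions, not the project's test data, and the helper names are hypothetical.

```python
import math


def f1_score(y_true: list[int], y_pred: list[int]) -> float:
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def rmse(y_true: list[float], y_pred: list[float]) -> float:
    """Root mean squared error, in the same units as the target (cycles for RUL)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Toy inputs for illustration only.
fault_f1 = f1_score([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
rul_rmse = rmse([100.0, 50.0], [103.0, 46.0])
```

Because RMSE keeps the target's units, an RUL RMSE of 6.5 means predictions are typically off by several cycles, with larger errors weighted more heavily than small ones.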

Training setup

Precision: BF16
Epochs: 20
Batch size: 64
Warmup steps: 1,000
GPU: L4 (24 GB)
RAM: 53 GB
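
The setup lists 1,000 warmup steps but not the peak learning rate or what happens after warmup. As a minimal sketch, assuming linear warmup to a hypothetical peak of `3e-4` followed by a constant rate, the schedule could look like this:

```python
WARMUP_STEPS = 1_000  # stated in the source
PEAK_LR = 3e-4        # assumption: the source does not give a learning rate


def learning_rate(step: int) -> float:
    """Linear warmup toward PEAK_LR, then hold constant (decay is not specified)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    return PEAK_LR
```

Warmup keeps early updates small while the optimizer statistics and the randomly initialized embeddings settle, which is a common choice when training a transformer from scratch.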

Tech Stack

PyTorch: model training and inference
Custom BPE Tokenizer: 7K vocab, trained from scratch
CMAPSS Dataset: NASA jet engine simulation data
NASA IMS Bearing Dataset: rolling element bearing data
AI4I 2020 Dataset: CNC machine failure data

Built from scratch: no Hugging Face models used

Custom tokenizer, custom architecture, custom training loop.

A huge shoutout to our dataset providers and all the contributors who made this project possible. Dataset links are listed in the footer below.