Project facts & technologies
A snapshot of the WDD project — entities, technologies, scale, and outcomes — formatted for quick scanning by readers, search engines, and AI answer engines.
- Project name: Wagon Damage Detection (WDD)
- Operator / Client: Indian Railways Transport Company
- Solution provider: AiSPRY
- Industry: Railways · Freight · Industrial computer vision
- Use case: Automated wagon inspection for damage and cargo residue
- Core technology: YOLOv11 (detection · segmentation · OBB), PaddleOCR, CLAHE preprocessing
- Damage classes detected: Closed Door, Open Door, Missing Door, Dents (side); Bulges, Hole, Crack, Gravel (top)
- Camera configuration: 6 streams (top, left, and right at the entry and exit gantries)
- Data infrastructure: PostgreSQL (reporting) + MongoDB (raw metadata) · Celery · Redis · FastAPI
- Pipeline stages: 17-stage automated pipeline (Detection Foundation 1–8 · Advanced Recovery 9–17)
- Processing capacity: ~12 trains per day on GPU hardware
- Outcomes: 5–8 hr → 2 hr per train · 80–90% manual review reduction · 99.2% frame detection · 98% OCR
Why does wagon inspection matter in Indian Railways?
Indian Railways operates one of the world's largest freight networks. Every day, thousands of goods carriage wagons are loaded and unloaded across yards and sidings — handled by trucks, forklifts, grabs, magnets, and tipplers. Each handling cycle exposes wagons to mechanical damage: dents on side panels, cracks and holes in roofs, doors that get torn off or left open, and leftover cargo material that compromises the next consignment.
Two operational realities sit behind this case. First, undetected damage compounds: small cracks become structural failures, missing doors lead to lost cargo, and wagons return to maintenance yards far more often than they should. Second, when a dispute arises about who damaged a wagon (the loader, the unloader, or the rail operator), the only evidence that matters is what the wagon looked like before and after a handling event. Manual inspection at every gantry takes 5–8 hours per train, varies with inspector fatigue, and produces inconsistent records. WDD was commissioned to make this inspection automatic, repeatable, and economically viable at scale.
What was wrong with the earlier inspection approach?
The earlier process — partly manual, partly AI-assisted — surfaced six limitations that the WDD revision was specifically engineered to eliminate:
Key challenges
- 5–8 hours per train — dominated by manual frame selection from raw multi-camera footage.
- Manual intervention at every stage — frame triage, duplicate removal, cross-view matching, and damage labelling required human effort throughout.
- Duplicate wagon frames across views — the same wagon appeared repeatedly across views, requiring human verification to consolidate.
- Inconsistent detection across views — results differed across the six camera views, especially under varied lighting and motion conditions.
- Missed wagons in some streams — when the train ran fast, halted, or paused mid-gantry, wagons were missed in individual streams.
- No reliable entry-vs-exit comparison — there was no automated mechanism to compare a wagon's condition at entry against its condition at exit.
How does the WDD platform work?
WDD is a fully automated, multi-stage computer-vision system. It ingests six concurrent video streams from the entry and exit gantries (top, left, and right view at each), runs a 17-stage detection-and-recovery pipeline, and writes structured damage records into a database that powers the TrainVision dashboard.
Automated wagon detection and recovery
- Multi-stage YOLOv11 — filter model followed by a centring model identifies wagon frames across all six camera views simultaneously
- 99% validation accuracy — side-view and top-view models reliably separate wagon vs. engine, last car, or background
- Automatic duplicate elimination — cooldown windows remove duplicate frames without manual triage
- Missed-wagon recovery — rolling-median gap analysis with 5× threshold uses PaddleOCR-read timestamps to recover missed wagons from adjacent views
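The cooldown-window de-duplication listed above can be pictured with a minimal sketch. It assumes each detection carries a camera view, a timestamp, and a confidence score. The `Detection` type, the view names, and the 1.5-second cooldown are hypothetical and are not taken from the WDD codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    view: str         # camera view, e.g. "entry_left" (hypothetical naming)
    timestamp: float  # seconds since the start of the stream
    score: float      # detector confidence


def dedupe_with_cooldown(detections, cooldown_s=1.5):
    """Keep one detection per wagon pass per view.

    Once a detection is kept, later detections in the same view are treated
    as duplicates until cooldown_s seconds have passed since the kept one.
    The cooldown value is an illustrative assumption.
    """
    kept = []
    last_kept = {}  # view -> timestamp of the last kept detection
    for det in sorted(detections, key=lambda d: (d.view, d.timestamp)):
        prev = last_kept.get(det.view)
        if prev is None or det.timestamp - prev >= cooldown_s:
            kept.append(det)
            last_kept[det.view] = det.timestamp
    return kept


detections = [
    Detection("entry_left", 0.0, 0.90),
    Detection("entry_left", 0.4, 0.95),  # same wagon, inside the cooldown
    Detection("entry_left", 1.0, 0.80),  # same wagon, inside the cooldown
    Detection("entry_left", 2.1, 0.90),  # next wagon
    Detection("entry_top", 0.2, 0.90),   # other views are tracked separately
]
unique = dedupe_with_cooldown(detections)
```

A production system would likely derive the cooldown from train speed rather than use a fixed constant; the fixed value here only keeps the sketch short.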
Wagon number reading (OCR)
- PaddleOCR with CLAHE pre-processing — contrast enhancement, blur, thresholding, and 2.5× upscale before OCR
- 98% accuracy in the dashboard — wagon-number reads reconciled across left, right, and top views
- Cross-view validation — readings reconciled before being committed to the database
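One way to picture cross-view validation is a simple majority vote over the per-view OCR reads. The sketch below is an assumption about how such reconciliation might work, not a description of the WDD implementation; the function name, the `min_votes` parameter, and the sample numbers are hypothetical.

```python
from collections import Counter


def reconcile_wagon_number(reads, min_votes=2):
    """Return the wagon number agreed on by at least min_votes views.

    reads maps a view name to the OCR string for that view, or None when the
    view produced no read. Returns None when no value reaches min_votes.
    """
    cleaned = [r.strip().replace(" ", "") for r in reads.values() if r]
    if not cleaned:
        return None
    number, votes = Counter(cleaned).most_common(1)[0]
    return number if votes >= min_votes else None


agreed = reconcile_wagon_number(
    {"left": "12345678901", "right": "1234 5678901", "top": "12345678907"}
)
# Two of three views agree after whitespace is stripped, so the value is accepted.
```

Returning None instead of guessing leaves ambiguous wagons available for manual review, which fits the goal of committing only reconciled readings to the database.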
Damage detection and entry-vs-exit comparison
- Side-view damage — YOLO-Seg-V11L trained on closed-door, open-door, missing-door, and dent classes
- Top-view damage — YOLO-Seg-V11m trained on bulges, cracks, and holes
- Gravel residue — YOLO-OBB-L segmentation produces a percentage estimate of leftover material
- Entry-vs-exit ledger — every damage labelled Old (present at both), New (only at exit), or Resolved (present at entry, no longer at exit)
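The Old / New / Resolved labelling can be expressed as set logic over the damages found at each gantry. This is a minimal sketch that assumes each damage reduces to a hashable key such as (view, damage class). That key is a hypothetical simplification; a real system would also need to match damages spatially.

```python
def label_damages(entry_damages, exit_damages):
    """Label every damage key for one wagon as Old, New, or Resolved.

    Old:      present at both entry and exit
    New:      present only at exit
    Resolved: present at entry, no longer at exit
    """
    ledger = {}
    for key in entry_damages | exit_damages:
        if key in entry_damages and key in exit_damages:
            ledger[key] = "Old"
        elif key in exit_damages:
            ledger[key] = "New"
        else:
            ledger[key] = "Resolved"
    return ledger


entry = {("left", "dent"), ("top", "hole")}
exit_ = {("left", "dent"), ("top", "crack")}
ledger = label_damages(entry, exit_)
# ("left", "dent") -> "Old", ("top", "crack") -> "New", ("top", "hole") -> "Resolved"
```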
See WDD in action
A walkthrough of the TrainVision dashboard — six-stream upload, 17-stage pipeline execution, entry-vs-exit damage comparison, and the per-wagon drill-down used to settle accountability disputes.
Wagon Damage Detection — multi-camera AI inspection in action
YOLOv11 detection · PaddleOCR wagon-number reading · entry-vs-exit comparison
- Six-stream ingestion — entry and exit gantries' top, left, and right views uploaded per train
- Automated frame triage — YOLOv11 filter and centring models eliminate background, engine, and duplicate frames
- OCR-driven matching — PaddleOCR reads wagon numbers to align entry and exit records
- Entry-vs-exit ledger — each damage labelled Old, New, or Resolved with replayable evidence frames
What is the architecture of the WDD pipeline?
The WDD pipeline is structured as five visible stages, with seventeen internal sub-stages handling detection, recovery, and reconciliation:
- Multi-camera capture: streams six video feeds (entry top/left/right + exit top/left/right) into S3-compatible storage.
- Frame extraction: filters background and selects centred wagon frames.
- AI damage detection: runs three model families in parallel (side-view, top-view, gravel segmentation).
- Entry-vs-exit comparison engine: aligns records via PaddleOCR-read wagon numbers and labels each damage Old, New, or Resolved.
- TrainVision dashboard: reads from a dual database (PostgreSQL for structured reporting, MongoDB for raw metadata), with Celery for parallel processing and Redis for snapshot caching.

How is WDD engineered for real-world video conditions?
Three constraints shaped the engineering choices: operational cost minimisation, transparency and accountability, and the unforgiving reality of railway video conditions.
Operational cost minimisation
- Self-hosted models rather than cloud LLMs to keep per-train compute bounded
- Parallel Celery workers for damage processing rather than serial execution
- Redis snapshot cache so re-opening an inspection costs effectively nothing
- 2-hour processing target on GPU hardware, sized to handle ~12 trains per day per machine
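The capacity figure follows from the processing target if one assumes a single machine processes trains one after another around the clock. That serial, 24-hour assumption is ours; it is not stated in the project facts.

```latex
\frac{24\ \text{h/day}}{2\ \text{h/train}} = 12\ \text{trains/day per machine}
```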
Transparency and accountability
- Every damage annotation traceable to a specific frame, camera view, and timestamp
- Entry-vs-exit ledger auditable end-to-end — reviewers can replay the exact frames the AI used
- Dispute resolution backed by structured evidence rather than inspector recollection
Real-world video conditions
- Rolling-median gap detection for trains that halt or pause mid-gantry
- 5× median threshold to recover wagons missed in a stream when the train ran fast
- CLAHE preprocessing and 2.5× upscale for low-contrast OCR at dawn and dusk
- Regex-based artifact removal for noisy timestamp reads on dirty or tarpaulin-covered panels
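The rolling-median gap check with a 5× threshold might look like the sketch below. It assumes detection timestamps are already available as seconds per stream. The window size, the margin, the function names, and the sample data are illustrative assumptions rather than WDD's actual parameters.

```python
from statistics import median


def find_gaps(timestamps, window=5, factor=5.0):
    """Return (start, end) pairs whose spacing exceeds factor x the rolling median.

    timestamps: sorted detection times (seconds) for one camera stream.
    window:     number of neighbouring intervals used for the rolling median.
    """
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    half = window // 2
    gaps = []
    for i, interval in enumerate(intervals):
        neighbours = intervals[max(0, i - half): i + half + 1]
        baseline = median(neighbours)
        if baseline > 0 and interval > factor * baseline:
            gaps.append((timestamps[i], timestamps[i + 1]))
    return gaps


def candidates_in_gap(gap, other_view_timestamps, margin=1.0):
    """Timestamps from an adjacent view that fall strictly inside the gap."""
    start, end = gap
    return [t for t in other_view_timestamps if start + margin < t < end - margin]


left = [0.0, 2.0, 4.0, 6.0, 18.0, 20.0, 22.0]  # wagons missed between 6 s and 18 s
top = [0.1, 2.1, 4.0, 6.1, 8.0, 10.1, 12.0, 14.1, 16.0, 18.1, 20.0, 22.1]
gaps = find_gaps(left)
recovered = candidates_in_gap(gaps[0], top)
```

The median is used because a single long interval barely moves it, so the baseline stays representative of normal wagon spacing. A gap with no candidates in any adjacent view would more plausibly indicate a halt than missed wagons.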
What measurable impact has WDD delivered?
The current production pipeline has shipped twelve major engineering tasks across UI, backend, and data infrastructure in the last delivery window. The numbers below are measured against the earlier semi-manual baseline.
Operational outcomes
- Processing time per train: 2 hours, down from 5–8 hours previously
- Manual intervention during processing: zero; the pipeline runs end-to-end without an operator in the loop
- Manual review time reduced by 80–90%
- Duplicate wagon handling: 100% automatic elimination
- Cross-view consistency: 100% — every wagon reconciled across left, right, and top views
Model and detection performance
- Frame detection rate: 99.2%
- Side-view and top-view wagon presence detection: 99% validation accuracy (YOLOv11-L)
- Wagon number OCR in dashboard: 98% accuracy
- Closed-door classification on side view: 93% validation accuracy
- Frame extraction accuracy improvement: +35% over the prior approach
- Processing speed: approximately 2× faster than the prior pipeline
Dashboard and infrastructure
- Dual-database architecture (PostgreSQL + MongoDB) live in production
- Celery task manager with multi-worker parallel damage processing operational
- Redis server-level snapshot cache enables instant inspection re-open
- Manual → Automatic toggle: first run executes the pipeline, subsequent switches load from Redis
- Page-memory state survives navigation; comparison entries persist correctly to the database
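The "first run executes, later switches load from cache" behaviour is a cache-aside pattern. The sketch below uses a plain dict as a stand-in for Redis so that it runs without a server. The class name, key format, and JSON serialisation are assumptions, not the production design.

```python
import json


class SnapshotCache:
    """Cache-aside wrapper: run the pipeline once, then serve the stored snapshot."""

    def __init__(self, store=None):
        # A dict stands in for Redis here; with redis-py the equivalent calls
        # would be client.get(key) and client.set(key, value).
        self._store = {} if store is None else store

    def get_or_run(self, inspection_id, run_pipeline):
        """Return (result, from_cache) for the given inspection."""
        key = f"inspection:{inspection_id}:snapshot"  # hypothetical key format
        cached = self._store.get(key)
        if cached is not None:
            return json.loads(cached), True
        result = run_pipeline(inspection_id)
        self._store[key] = json.dumps(result)
        return result, False


calls = []


def fake_pipeline(inspection_id):
    # Placeholder for the real 17-stage pipeline.
    calls.append(inspection_id)
    return {"inspection": inspection_id, "wagons": 58}


cache = SnapshotCache()
first, first_cached = cache.get_or_run("T1", fake_pipeline)    # runs the pipeline
second, second_cached = cache.get_or_run("T1", fake_pipeline)  # served from the cache
```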
Wagon Damage Detection — frequently asked questions
Plain-English answers to the questions teams ask most often when evaluating the WDD platform — designed to be quoted directly by AI search engines and human readers alike.