Data Quality Improvements
Date: October 15, 2025
Phase: Post-Import Enhancement
Following the successful import of 116,668+ bridge restrictions, three major data quality improvements were implemented to extract hidden data, identify gaps, and establish validation processes.
1. NHVR Property Extraction
Objective
Extract structured dimensional data (height, width, weight) from the nhvr_properties JSONB field where it was stored but not properly parsed during initial import.
Implementation
Discovery: Analyzed 100 sample NHVR records to identify extractable fields:
- MinHeightClearance - Height data in meters
- StateTerritory - State/territory codes
- RoadName - Road names
- RestrictionType - Restriction categories
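The discovery step can be approximated by counting how often each key appears with a usable value across a sample of records. This is a minimal sketch, not the project's script: the `NhvrSample` type and `countPropertyKeys` helper are hypothetical names, and only the `nhvr_properties` field and the property keys come from this document.

```typescript
// Hypothetical record shape: only nhvr_properties is taken from this document.
type NhvrSample = { nhvr_properties: { [key: string]: unknown } | null };

// Count how many sampled records carry a non-empty value for each property key.
function countPropertyKeys(sample: NhvrSample[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const record of sample) {
    const props = record.nhvr_properties ?? {};
    for (const key of Object.keys(props)) {
      const value = props[key];
      // Skip keys that are present but hold nothing extractable.
      if (value === null || value === undefined || value === "") continue;
      counts[key] = (counts[key] ?? 0) + 1;
    }
  }
  return counts;
}

// Illustrative sample values (made up for the example).
const sampleCounts = countPropertyKeys([
  { nhvr_properties: { MinHeightClearance: "3.2", RoadName: "Example Rd" } },
  { nhvr_properties: { MinHeightClearance: "", StateTerritory: "VIC" } },
  { nhvr_properties: null },
]);
console.log(sampleCounts);
```

Keys that appear often with non-empty values are the natural candidates for extraction into structured columns.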
Extraction Logic:
// Extract height clearance
if (!record.max_height_meters && props.MinHeightClearance) {
  const height = parseFloat(props.MinHeightClearance);
  if (!isNaN(height) && height > 0 && height < 50) {
    update.max_height_meters = height;
    // Auto-flag caravan hazards
    if (height < 3.5) update.affects_caravans = true;
    // Set severity levels
    if (height < 3.0) update.severity = 'danger';
    else if (height < 3.5) update.severity = 'caution';
  }
}
Results
Processing Status:
- Total NHVR records: 101,333
- Records updated: 5,600+ (processing complete)
- Update rate: ~600-1,000 records per batch with extractable data
Impact:
- Significant increase in populated max_height_meters fields
- Improved affects_caravans and severity flagging
- Better state/territory categorization
- Enhanced road name coverage
Script: scripts/extract-nhvr-properties.ts
Command:
npm run db:extract-nhvr-properties
2. Victoria Missing Heights Analysis
Objective
Investigate why 4,626 Victorian bridges (27.8%) lack height clearance data and identify strategies to fill gaps.
Findings
Overall Statistics:
- Total VIC bridges: 16,620
- With height data: 11,994 (72.2%)
- Missing height data: 4,626 (27.8%)
Missing Heights by Structure Type:
- Unknown: Majority of missing data
- Culverts: Many lack clearance (underground structures)
- Utility structures: Often don't require clearance measurements
- Road over road: Some highway overpasses missing data
- Rail over road: ⚠️ 45 critical overpasses without clearance data
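A breakdown like the one above can be produced by grouping the records that lack a height by structure type. The sketch below assumes a hypothetical `structure_type` column and helper names; `max_height_meters` is the only field taken from this document. The `percentOf` helper reproduces the coverage percentages from the reported totals.

```typescript
// Hypothetical row shape; structure_type is an assumed column name.
type VicBridge = { structure_type: string | null; max_height_meters: number | null };

// Count bridges without a height, grouped by structure type.
function missingHeightsByType(bridges: VicBridge[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const bridge of bridges) {
    if (bridge.max_height_meters !== null) continue;
    const structureType = bridge.structure_type ?? "Unknown";
    counts[structureType] = (counts[structureType] ?? 0) + 1;
  }
  return counts;
}

// Percentage rounded to one decimal place.
function percentOf(part: number, total: number): number {
  return Math.round((part / total) * 1000) / 10;
}

// Totals reported in this section: 11,994 with heights and 4,626 without, of 16,620.
console.log(percentOf(11994, 16620), percentOf(4626, 16620));

// Illustrative rows (made up for the example).
const gapCounts = missingHeightsByType([
  { structure_type: "RAIL OVER ROAD", max_height_meters: null },
  { structure_type: "CULVERT", max_height_meters: null },
  { structure_type: null, max_height_meters: null },
  { structure_type: "RAIL OVER ROAD", max_height_meters: 4.2 },
]);
console.log(gapCounts);
```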
Critical Data Gaps
High Priority:
- 45 rail overpasses missing height clearances
- Critical for caravan routing safety
- Rail overpasses typically have lower clearances (3.0-4.5m range)
Low Priority:
- Culverts and utility structures (underground, non-blocking)
- Pedestrian bridges (not vehicle restrictions)
- Historic structures (may not be on active routes)
Recommendations
Data Source Investigation:
- VicRoads Bridge Management System
- Transport for Victoria structure inspections
- OpenStreetMap maxheight tags (cross-reference available)
- Manual survey for 45 critical rail locations
Priority Actions:
- Filter for "RAIL OVER ROAD" bridges without heights
- Cross-reference with OSM data
- Mark low-priority structures as non-critical
- Schedule field verification for critical gaps
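The first priority action, filtering for "RAIL OVER ROAD" bridges without heights, might look like the following. This is a sketch under assumptions: the `BridgeRow` shape, the `structure_type` column, and the `criticalRailGaps` helper are hypothetical, and the real data may label structure types differently.

```typescript
// Hypothetical row shape; id and structure_type are assumed column names.
type BridgeRow = {
  id: number;
  structure_type: string | null;
  max_height_meters: number | null;
};

// Select rail-over-road structures that have no recorded clearance.
function criticalRailGaps(bridges: BridgeRow[]): BridgeRow[] {
  return bridges.filter(
    (b) =>
      b.max_height_meters === null &&
      (b.structure_type ?? "").trim().toUpperCase() === "RAIL OVER ROAD",
  );
}

// Illustrative rows (made up for the example).
const railGaps = criticalRailGaps([
  { id: 1, structure_type: "RAIL OVER ROAD", max_height_meters: null },
  { id: 2, structure_type: "Rail over road", max_height_meters: 4.1 },
  { id: 3, structure_type: "CULVERT", max_height_meters: null },
  { id: 4, structure_type: " rail over road ", max_height_meters: null },
]);
console.log(railGaps.map((b) => b.id));
```

Normalizing case and whitespace before comparing guards against inconsistent labels in the source data.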
Script: scripts/analyze-vic-missing-heights.ts
Command:
npm run db:analyze-vic-gaps
See VIC Rail Overpasses for detailed action plan.
3. OpenStreetMap Validation
Objective
Cross-reference bridge height data with OpenStreetMap's community-maintained maxheight tags to validate accuracy and identify supplementary data sources.
Implementation
Query Design: Uses Overpass API to query OSM data within geographic bounds.
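A bounding-box query in Overpass QL could be assembled as shown below. This is an illustrative sketch rather than the query used by the validation script: the `buildMaxheightQuery` helper, the choice of `node` and `way` elements, and the example coordinates are assumptions. Overpass QL expects bounding boxes in (south, west, north, east) order.

```typescript
type BoundingBox = { south: number; west: number; north: number; east: number };

// Build an Overpass QL query for every node and way tagged with maxheight
// inside the bounding box.
function buildMaxheightQuery(bbox: BoundingBox, timeoutSeconds: number = 60): string {
  const area = `${bbox.south},${bbox.west},${bbox.north},${bbox.east}`;
  return [
    `[out:json][timeout:${timeoutSeconds}];`,
    `(`,
    `  node["maxheight"](${area});`,
    `  way["maxheight"](${area});`,
    `);`,
    // "out center" adds a representative coordinate for each way.
    `out center;`,
  ].join("\n");
}

// Rough, illustrative bounds around Melbourne (assumed, not from the project).
const melbourneQuery = buildMaxheightQuery({
  south: -38.5,
  west: 144.5,
  north: -37.5,
  east: 145.5,
});
console.log(melbourneQuery);
```

The resulting string is sent to an Overpass API endpoint as the `data` parameter of the request.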
Height Parsing: Handles multiple OSM maxheight formats:
- Metric: "3.8" → 3.8m
- Feet: "12'6"" → 3.81m
- Metric with unit suffix: "4.5m" → 4.5m
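The three formats listed above can be handled with two regular expressions. The following is a minimal sketch, assuming a hypothetical `parseMaxheight` helper; it covers only the formats named here and returns `null` for anything else, such as non-numeric tag values.

```typescript
// Parse an OSM maxheight tag value into meters, or null if unrecognized.
function parseMaxheight(raw: string): number | null {
  const value = raw.trim();

  // Feet and optional inches, e.g. 12'6" or 12'
  const imperial = /^(\d+)'\s*(?:(\d+(?:\.\d+)?)")?$/.exec(value);
  if (imperial) {
    const feet = Number(imperial[1]);
    const inches = imperial[2] !== undefined ? Number(imperial[2]) : 0;
    // 1 inch = 0.0254 m; round to the nearest centimeter.
    return Math.round((feet * 12 + inches) * 0.0254 * 100) / 100;
  }

  // Plain metric number with an optional "m" suffix, e.g. 3.8, 4.5m, 4.5 m
  const metric = /^(\d+(?:\.\d+)?)\s*m?$/.exec(value);
  if (metric) return Number(metric[1]);

  return null;
}

console.log(parseMaxheight("3.8"), parseMaxheight("12'6\""), parseMaxheight("4.5m"));
```

Returning `null` instead of guessing keeps unparseable tags out of the comparison step.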
Matching Logic:
- Geographic proximity: ±0.001° bounding box (about 111 m north-south; east-west the same span is roughly 88 m at Melbourne's latitude)
- Discrepancy threshold: >0.2m difference flagged
- Returns closest match within radius
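The matching rules above can be sketched as a bounding-box filter followed by a nearest-candidate search. The type and function names below are hypothetical; the ±0.001° box and the 0.2 m threshold come from this section. Scaling longitude by the cosine of the latitude is an added assumption so that east-west and north-south offsets compare fairly.

```typescript
type Point = { lat: number; lon: number };
type OsmHeight = Point & { maxheight: number };

const BOX_HALF_WIDTH_DEG = 0.001; // ±0.001° bounding box
const DISCREPANCY_THRESHOLD_M = 0.2; // differences above this are flagged

// Return the closest OSM candidate inside the bounding box, or null if none.
function closestOsmMatch(bridge: Point, candidates: OsmHeight[]): OsmHeight | null {
  const lonScale = Math.cos((bridge.lat * Math.PI) / 180);
  let best: OsmHeight | null = null;
  let bestDistSq = Infinity;
  for (const candidate of candidates) {
    const dLat = Math.abs(candidate.lat - bridge.lat);
    const dLon = Math.abs(candidate.lon - bridge.lon);
    if (dLat > BOX_HALF_WIDTH_DEG || dLon > BOX_HALF_WIDTH_DEG) continue;
    // Squared distance in latitude-equivalent degrees; enough for ranking.
    const distSq = dLat * dLat + dLon * lonScale * (dLon * lonScale);
    if (distSq < bestDistSq) {
      best = candidate;
      bestDistSq = distSq;
    }
  }
  return best;
}

// True when the two heights differ by more than the flagging threshold.
function isDiscrepancy(officialMeters: number, osmMeters: number): boolean {
  return Math.abs(officialMeters - osmMeters) > DISCREPANCY_THRESHOLD_M;
}

// Illustrative coordinates (made up for the example).
const matched = closestOsmMatch({ lat: -37.8, lon: 144.9 }, [
  { lat: -37.8005, lon: 144.9003, maxheight: 3.8 },
  { lat: -37.8001, lon: 144.9001, maxheight: 4.0 },
  { lat: -37.81, lon: 144.9, maxheight: 2.9 }, // outside the box
]);
console.log(matched, matched !== null && isDiscrepancy(4.3, matched.maxheight));
```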
Test Results
Test Region: Melbourne Metro Area
- 617 bridges with maxheight tags found in Melbourne
- Strong community coverage in urban areas
- Potential supplementary data source
Recommendations
OSM as Supplementary Source:
Advantages:
- Extensive community coverage (617+ bridges in Melbourne alone)
- Regular updates by local contributors
- Covers local roads not in state datasets
Limitations:
- Community-maintained (varying accuracy)
- May be outdated (no inspection dates)
- Should be flagged as lower confidence than official sources
Use Cases:
- Fill clearance gaps for VIC's 45 rail overpasses
- Supplement local road coverage
- Validate suspicious official data
Discrepancy Handling:
- Prefer official government sources for state highways
- Investigate differences >0.5m
- Flag uncertain data with data_source: 'osm_unverified'
- Cross-reference multiple sources when available
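One way to encode these rules is a small resolver that prefers the official value and falls back to OSM only when no official height exists. This is a simplified sketch: `resolveHeight`, the `needs_review` flag, and the `'official'` and `'none'` source labels are hypothetical, and it prefers official data for every road rather than state highways only. The `'osm_unverified'` label and the 0.5 m threshold come from this section.

```typescript
type HeightDecision = {
  height: number | null;
  data_source: string;
  needs_review: boolean;
};

const INVESTIGATE_THRESHOLD_M = 0.5; // differences above this warrant investigation

// Choose a height from the available sources and record where it came from.
function resolveHeight(official: number | null, osm: number | null): HeightDecision {
  if (official !== null) {
    // Official data wins; a large disagreement with OSM is queued for review.
    const needsReview =
      osm !== null && Math.abs(official - osm) > INVESTIGATE_THRESHOLD_M;
    return { height: official, data_source: "official", needs_review: needsReview };
  }
  if (osm !== null) {
    // OSM-only values are kept but marked as lower confidence.
    return { height: osm, data_source: "osm_unverified", needs_review: false };
  }
  return { height: null, data_source: "none", needs_review: false };
}

console.log(resolveHeight(4.0, 3.2), resolveHeight(null, 3.8));
```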
Script: scripts/validate-osm-crossref.ts
Command:
npm run db:validate-osm
See OSM Integration for implementation guide.
Summary Statistics
Before Data Quality Improvements
- Total restrictions: 116,668
- NHVR records with structured height: ~0%
- VIC rail overpasses missing clearance data: Not yet identified
- OSM validation: Not performed
After Data Quality Improvements
- Total restrictions: 116,668 (unchanged)
- NHVR records with extracted height: 5,600+
- VIC critical gaps identified: 45 rail overpasses
- OSM bridges available: 617+ (Melbourne alone)
Estimated Coverage Improvement
Current:
- Additional height clearances extracted: 5,000-10,000
- Improved caravan safety flagging: 2,000-5,000 records
- Identified priority data collection targets: 45 structures
With OSM Integration (Future):
- Potential additional bridges: 10,000+ nationwide
- Enhanced local road coverage
- Validation dataset for quality assurance
Next Steps
Immediate Actions
- ✅ Complete NHVR property extraction
- ⚠️ Focus data collection on 45 VIC rail overpasses
- 🔄 Consider OSM import pilot for VIC gaps
Short-term Enhancements (1-2 weeks)
- Add data_confidence field to schema
- Implement OSM import script for missing VIC rail bridges
- Create validation reports comparing OSM vs official data
- Add data source tracking for all records
Long-term Strategy (1-3 months)
- Schedule regular OSM sync (monthly)
- Implement user-reported data corrections
- Partner with state authorities for official data updates
- Create data quality dashboard showing coverage by region
Related Documents
- Overview - Complete implementation status
- State Data Summary - Data availability by state
- VIC Rail Overpasses - Action plan for missing heights
- OSM Integration - OpenStreetMap cross-reference guide