VERIFIED
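559,010 data points were compared exhaustively against BLS source Excel files with
0 unexplained discrepancies. All 553,192 OEWS regional employment and wage values match
their source file exactly. Of the 5,818 national Employment Projections values, 5,801 match
exactly and the remaining 17 are top-coded wages for which the source cell is blank.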
Data Sources Validated
| Dataset | BLS Source File | Data Points | Match Rate | Status |
| --- | --- | --- | --- | --- |
| OEWS Regional Wages & Employment | MSA_M2024_dl.xlsx | 553,192 | 100.0% | ✓ |
| National Employment Projections | occupation_1_.xlsx | 5,818 | 100.0% | ✓ |
Validation Methodology
1. OEWS Regional Data
BLS OEWS Excel (MSA_M2024_dl.xlsx) → Row-level match (Area × SOC code) → 553,192 verified (0 discrepancies)
Step 1. OEWS: Exhaustive Source Comparison
The complete BLS Occupational Employment and Wage Statistics file
MSA_M2024_dl.xlsx was downloaded from
bls.gov/oes/tables.htm (May 2024 release).
Every row with AREA_TYPE = '2' (metropolitan)
and O_GROUP = 'detailed' (141,164 rows) was compared against
the corresponding entry in our deployed oews_msa_minimal.json,
keyed by area code + SOC code. Four fields were compared per row:
TOT_EMP,
A_MEDIAN,
LOC_QUOTIENT, and
JOBS_1000.
An additional 9,012 summary-level occupation groups (e.g., "11-0000 Management Occupations")
are present in the JSON for completeness but are never displayed in individual reports.
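The row-level comparison described above can be sketched roughly as follows. This is a minimal illustration, not the actual validator: the function name `compare_oews`, the in-memory row format, the placeholder area and SOC codes, and the convention of representing suppressed cells as `None` are all assumptions made for the example.

```python
from typing import Optional

# The four fields compared per row, as listed in the text.
FIELDS = ("TOT_EMP", "A_MEDIAN", "LOC_QUOTIENT", "JOBS_1000")

Row = dict[str, Optional[float]]
Key = tuple[str, str]  # (area code, SOC code)


def compare_oews(source: dict[Key, Row], deployed: dict[Key, Row]) -> dict[str, int]:
    """Compare four fields per (area, SOC) key and tally the outcomes."""
    tally = {"matched": 0, "null_pairs": 0, "discrepancies": 0, "missing_rows": 0}
    for key, src_row in source.items():
        dep_row = deployed.get(key)
        if dep_row is None:
            tally["missing_rows"] += 1  # row absent from the deployed data
            continue
        for field in FIELDS:
            src_val, dep_val = src_row.get(field), dep_row.get(field)
            if src_val is None and dep_val is None:
                tally["null_pairs"] += 1  # suppressed in both places
            elif src_val == dep_val:
                tally["matched"] += 1
            else:
                tally["discrepancies"] += 1
    return tally


# Hypothetical two-row example (placeholder codes and values, not BLS data).
source = {
    ("A1", "00-0001"): {"TOT_EMP": 100, "A_MEDIAN": 50000, "LOC_QUOTIENT": 1.2, "JOBS_1000": 3.4},
    ("A1", "00-0002"): {"TOT_EMP": None, "A_MEDIAN": 60000, "LOC_QUOTIENT": None, "JOBS_1000": None},
}
deployed = {key: dict(row) for key, row in source.items()}
result = compare_oews(source, deployed)
print(result)
```

Counting null pairs separately from matches keeps suppressed cells out of the verified total while still confirming that the source and the deployed data agree on which cells are suppressed.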
Step 2. Result: 553,192 of 553,192 Values Verified
The 141,164 rows × four fields give 564,656 cells. Of these, 11,464 are null pairs
(null in both the source and our system because of BLS suppression) and were confirmed as expected.
The remaining 553,192 non-null values (equivalent to four fields × 138,298 occupation-metro pairs)
were compared individually. Every one matches the BLS source file exactly, with zero discrepancies.
2. National Employment Projections
BLS EP Table 1.2 (occupation_1_.xlsx) → Row-level match (832 SOC codes) → 5,818 verified (17 top-coded)
Step 3. National Projections: Exhaustive Source Comparison
The 832 occupations in bls_projections_2024_2034.json were
compared row-by-row against BLS Employment Projections Table 1.2
(occupation_1_.xlsx), downloaded from
bls.gov/emp.
Seven fields were compared per occupation:
employment_2024, employment_2034,
change_numeric, change_percent,
annual_openings, median_wage, and
education.
Of 5,818 values compared, 5,801 were exact matches. The 17 differences are all median wages for
high-wage medical specialties (surgeons, anesthesiologists, etc.) that BLS top-codes as
"≥ $239,200". The Excel cell is blank for these occupations, and our system stores the BLS
top-code value of $239,200. This is standard BLS wage top-code handling.
Additionally, mathematical consistency was verified:
employment_2024 + change_numeric ≈ employment_2034 held
for all 832 occupations with zero errors, and all 832 outlook labels matched the
published growth-rate thresholds.
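The arithmetic consistency check above can be sketched as follows. This is a minimal illustration that assumes the three figures are stored as numbers in the same units and that a small absolute tolerance absorbs rounding in the published values. The function name `is_consistent` and the tolerance of 0.1 are assumptions for the example, not values taken from the validator.

```python
import math


def is_consistent(emp_2024: float, change: float, emp_2034: float, tol: float = 0.1) -> bool:
    """Return True when employment_2024 + change_numeric is within tol of employment_2034."""
    return math.isclose(emp_2024 + change, emp_2034, abs_tol=tol, rel_tol=0.0)


# Hypothetical records (illustrative numbers, not BLS data).
records = [
    {"employment_2024": 100.0, "change_numeric": 12.5, "employment_2034": 112.5},
    {"employment_2024": 250.3, "change_numeric": -4.1, "employment_2034": 246.2},  # float noise only
    {"employment_2024": 80.0, "change_numeric": 5.0, "employment_2034": 90.0},     # genuinely inconsistent
]
failures = [
    r for r in records
    if not is_consistent(r["employment_2024"], r["change_numeric"], r["employment_2034"])
]
print(len(failures))
```

An absolute tolerance is used rather than exact equality because the published figures are rounded independently, so the sum can differ from the published 2034 figure in the last digit.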
Step 4. Cross-Dataset Plausibility
National median wages from the EP projections were compared against the regional OEWS
distribution for 763 occupations with data in 10+ metros. For all 763, the national wage
fell within the regional p5–p95 range, confirming cross-dataset consistency.
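A plausibility check of this kind might look like the sketch below. The text does not state how the p5 and p95 bounds were computed, so the use of Python's default quantile interpolation is an assumption, as are the function name `within_p5_p95` and the sample wages.

```python
from statistics import quantiles


def within_p5_p95(national_wage: float, metro_wages: list[float], min_metros: int = 10):
    """Return True/False when the check applies, or None when too few metros have data."""
    if len(metro_wages) < min_metros:
        return None
    # n=20 yields 19 cut points: the first approximates p5, the last p95.
    cuts = quantiles(metro_wages, n=20)
    return cuts[0] <= national_wage <= cuts[-1]


# Hypothetical metro medians (illustrative, not BLS data): 40,000 through 59,000.
metro_wages = [40_000 + 1_000 * i for i in range(20)]
inside = within_p5_p95(50_000, metro_wages)
above = within_p5_p95(75_000, metro_wages)
skipped = within_p5_p95(50_000, metro_wages[:5])
print(inside, above, skipped)
```

Returning `None` for occupations with fewer than 10 metros keeps them out of both the pass and fail counts, which mirrors restricting the check to the 763 occupations with sufficient regional data.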
Sample Verification Detail
Both datasets were validated exhaustively by automated comparison. The following occupations
were additionally verified by manual lookup on bls.gov/ooh
on February 13, 2026:
| SOC | Occupation | BLS Median Wage | Our Value | Growth | Result |
| --- | --- | --- | --- | --- | --- |
| 31-9092 | Medical Assistants | $44,200 | $44,200 | +12.5% | ✓ |
| 29-1141 | Registered Nurses | $93,600 | $93,600 | +4.9% | ✓ |
| 43-4051 | Customer Service Reps | $42,830 | $42,830 | −5.5% | ✓ |
| 29-2061 | LPN/LVN | $62,340 | $62,340 | +2.6% | ✓ |
| 15-1252 | Software Developers | $133,080 | $133,080 | +15.8% | ✓ |
Regional Verification
| Metro Area | SOC | Field | BLS Source | Our Value | Result |
| --- | --- | --- | --- | --- | --- |
| Raleigh-Cary, NC | 43-4051 | Median Wage | $41,020 | $41,020 | ✓ |
| Raleigh-Cary, NC | 43-4051 | Employment | 14,770 | 14,770 | ✓ |
| Charlotte, NC-SC | 31-9092 | Median Wage | $45,330 | $45,330 | ✓ |
| Charlotte, NC-SC | 31-9092 | Employment | 6,890 | 6,890 | ✓ |
| Raleigh-Cary, NC | 31-9092 | Median Wage | $42,900 | $42,900 | ✓ |
How to Verify
Any data point in a Semantic Insight report can be independently verified:
Regional data (wages, employment by metro): Download the OEWS dataset from
bls.gov/oes/tables.htm → "Metropolitan and
nonmetropolitan area" → select the May 2024 release. Open the Excel file and filter by
your report's area code (shown in report metadata) and SOC code.
National projections (growth outlook, openings): Download Table 1.2 from
bls.gov/emp
(the "Occupation" XLSX file). Filter by SOC code. The employment, growth rate, openings, and wage
columns will match the figures in your report. Alternatively, visit
bls.gov/ooh and search by occupation title.
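For readers who prefer to script the lookup, the filtering step can be sketched as below. It assumes the spreadsheet rows have already been loaded into a list of dictionaries with a library of your choice, and that the area and SOC columns are named `AREA` and `OCC_CODE`. Both column names, the function name `lookup`, and the sample rows are assumptions to check against the downloaded file.

```python
def lookup(rows: list[dict], area_code: str, soc_code: str) -> list[dict]:
    """Return the rows matching a report's area code and SOC code."""
    return [
        r for r in rows
        if str(r.get("AREA")) == area_code and r.get("OCC_CODE") == soc_code
    ]


# Hypothetical rows (placeholder area codes and values, not BLS data).
rows = [
    {"AREA": "00001", "OCC_CODE": "43-4051", "TOT_EMP": 1000, "A_MEDIAN": 40000},
    {"AREA": "00001", "OCC_CODE": "31-9092", "TOT_EMP": 500, "A_MEDIAN": 42000},
    {"AREA": "00002", "OCC_CODE": "43-4051", "TOT_EMP": 2000, "A_MEDIAN": 41000},
]
matches = lookup(rows, "00001", "43-4051")
print(matches)
```

The area code is compared as a string because spreadsheet tools sometimes load numeric-looking codes as integers, which would otherwise cause a silent mismatch.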
Notes & Known Limitations
Wage top-coding. BLS does not publish exact median wages for occupations
earning above $239,200 per year. The source Excel leaves these cells blank. Our system stores
the BLS top-code value of $239,200 for these 17 occupations (all medical specialists:
surgeons, anesthesiologists, cardiologists, etc.), so the stored figure should be read as
"$239,200 or more" rather than as an exact median. This follows standard BLS top-code handling.
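The top-code handling can be illustrated with a short sketch. It assumes that a blank median-wage cell in the source always denotes a top-coded wage, which the text implies for these 17 occupations but which may not hold for other files. The function names are hypothetical.

```python
from typing import Optional

TOP_CODE = 239_200  # top-code value stated in the text


def resolve_median_wage(source_cell: Optional[float]) -> float:
    """Map a blank source cell to the top-code value; pass other values through."""
    return float(TOP_CODE) if source_cell is None else float(source_cell)


def wage_matches(source_cell: Optional[float], stored: float) -> bool:
    """Treat a blank source cell paired with the stored top-code as an expected match."""
    return resolve_median_wage(source_cell) == float(stored)


print(wage_matches(None, 239_200))     # blank source, top-coded stored value
print(wage_matches(133_080, 133_080))  # ordinary exact match
```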
Two data sources, two survey methodologies. National projections
(Employment Projections program) and regional data (OEWS survey) are produced by different
BLS programs. National median wages may differ slightly from the weighted average of all metro
medians. For Software Developers (15-1252), the national median of $133,080 is slightly above
the OEWS metro 90th percentile of $131,050 — this reflects the impact of high-paying tech metros
and non-metro employment, and is consistent with BLS methodology.
Suppressed data. BLS suppresses values in some metro × occupation cells to
protect employer confidentiality. These appear as null values in both the source Excel and our
system. The audit found 11,464 null-pair values (both source and system null), which is expected.
Growth percentage rounding. The EP table publishes change_percent
independently from the rounded employment figures. For 161 of 832 occupations, recalculating
the percentage from the rounded employment figures produces a slightly different value (median
difference: 0.4 percentage points). Our system uses the BLS-published percentage, which was
verified to exactly match the source Excel for all 832 occupations.
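The rounding effect is easiest to see on a small occupation. The sketch below uses made-up numbers and a hypothetical helper, `recomputed_percent`, to show how a percentage recalculated from rounded employment figures can drift from the published one.

```python
def recomputed_percent(emp_start: float, emp_end: float) -> float:
    """Percent change recomputed from the (already rounded) employment figures."""
    return round((emp_end - emp_start) / emp_start * 100, 1)


# Hypothetical occupation (illustrative numbers, not BLS data).
# Suppose the unrounded figures were 1.24 and 1.316, so the published change is 6.1%,
# while the table shows employment rounded to 1.2 and 1.3.
published_percent = 6.1
emp_2024, emp_2034 = 1.2, 1.3
recalc = recomputed_percent(emp_2024, emp_2034)
gap = round(abs(recalc - published_percent), 1)
print(recalc, gap)
```

Because the gap comes from rounding the inputs rather than from an error in either figure, comparing the published percentage directly against the source cell is the more reliable check.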
Data Currency & Refresh Schedule
| Dataset | Current Version | BLS Publication Cycle | Next Expected Release |
| --- | --- | --- | --- |
| OEWS Regional Wages | May 2024 | Annual (spring) | ~April 2026 (May 2025 data) |
| Employment Projections | 2024–2034 | Every 2 years | ~2027 (2026–2036 projections) |
When BLS publishes updated data, we re-run this validation process against the new source files
before deploying updated data to reports. The automated validator
(bls_data_validator.py) is included in the CI pipeline to catch any
discrepancies during the update process.