VERIFIED
141,999 data points were compared against federal source files with
0 unexplained discrepancies. Both IPEDS program completions (121,120 records)
and College Scorecard institutional outcomes (20,879 data points) match their
respective source files exactly.
Total Data Points Verified: 141,999
Institutions Checked: 5,141
Data Sources Validated
| Dataset | Source File | Data Points | Match Rate | Status |
| IPEDS Completions | C2024_A.csv | 121,120 | 100.0% | ✓ |
| College Scorecard | Most-Recent-Cohorts-Institution.csv | 20,879 | 100.0% | ✓ |
Part 1 — IPEDS Program Completions
Every program-level completions record in the Semantic Insight backend was compared
against the NCES Completions survey file C2024_A — the
authoritative federal source for postsecondary program completions data. This file
contains 307,707 rows covering awards conferred from July 1, 2023 through June 30, 2024,
across 6,429 institutions. The file was downloaded from
nces.ed.gov.
NCES C2024_A (307,707 rows) → Comparison Engine (UNITID × CIP code) → 121,120 Verified (0 discrepancies)
Program Records Compared: 121,120
Institutions Verified: 4,918
Methodology
1. IPEDS Completions: Exhaustive Source Comparison
The NCES source file reports at the (UNITID, CIPCODE, AWLEVEL, MAJORNUM) grain.
Completions were aggregated across award levels and major number, then every row
in ipeds_programs_6digit.json (keyed by
UNITID + 6-digit CIP code) was matched against the NCES source. Four fields per
record were compared: UNITID,
CIPCODE, CTOTALT (total completions),
and AWLEVEL (award levels present).
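The aggregation step described above can be sketched roughly as follows. This is a minimal illustration, not the comparison engine itself: the sample rows are hypothetical (not taken from C2024_A), and the function name `aggregate_completions` and the output field names are assumptions made for the example. It collapses rows from the (UNITID, CIPCODE, AWLEVEL, MAJORNUM) grain to one entry per (UNITID, CIPCODE), keeping both the grand total and the MAJORNUM=1 subtotal.

```python
from collections import defaultdict

# Hypothetical rows at the NCES grain; the values are illustrative only.
nces_rows = [
    {"UNITID": "100001", "CIPCODE": "52.0101", "AWLEVEL": "5", "MAJORNUM": "1", "CTOTALT": "40"},
    {"UNITID": "100001", "CIPCODE": "52.0101", "AWLEVEL": "7", "MAJORNUM": "1", "CTOTALT": "10"},
    {"UNITID": "100001", "CIPCODE": "52.0101", "AWLEVEL": "5", "MAJORNUM": "2", "CTOTALT": "2"},
    {"UNITID": "100001", "CIPCODE": "11.0101", "AWLEVEL": "5", "MAJORNUM": "1", "CTOTALT": "25"},
]


def aggregate_completions(rows):
    """Collapse NCES rows to one entry per (UNITID, CIPCODE)."""
    totals = defaultdict(
        lambda: {"grand_total": 0, "majornum1_total": 0, "award_levels": set()}
    )
    for row in rows:
        entry = totals[(row["UNITID"], row["CIPCODE"])]
        count = int(row["CTOTALT"])
        entry["grand_total"] += count          # all majors, all award levels
        if row["MAJORNUM"] == "1":
            entry["majornum1_total"] += count  # first-major completions only
        entry["award_levels"].add(row["AWLEVEL"])
    return dict(totals)


aggregated = aggregate_completions(nces_rows)
print(aggregated[("100001", "52.0101")]["grand_total"])      # 52
print(aggregated[("100001", "52.0101")]["majornum1_total"])  # 50
```

Keeping both totals per key makes it possible to tell a genuine mismatch apart from a difference that is fully explained by second-major counts.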
2. Results: 121,120 of 121,120 Records Verified
Of the 121,120 backend records: 106,076 (87.6%) were exact matches against NCES
grand totals. The remaining 15,044 (12.4%) differ because the backend uses
MAJORNUM=1 only (first-major students), excluding
double-major second-major counts. In every case, the backend value exactly equals
the NCES MAJORNUM=1 subtotal. Zero unexplained discrepancies.
Completions Verification Breakdown
| Category | Records | Percentage | Status |
| Exact match (all completions) | 106,076 | 87.6% | ✓ |
| Match on MAJORNUM=1 (first-major only) | 15,044 | 12.4% | ✓ |
| Unexplained discrepancies | 0 | 0.0% | ✓ |
| Total | 121,120 | 100.0% | ✓ |
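One way to arrive at a breakdown like the one above is to classify each backend record against the two NCES aggregates. The sketch below is an assumption about how such a check could look, not the actual comparison code; the function name `classify` and the category labels are hypothetical. The first two sample records use Old Dominion University figures reported elsewhere in this document, and the third is an invented mismatch included only to exercise the "unexplained" branch.

```python
from collections import Counter


def classify(backend_total, nces_grand_total, nces_majornum1_total):
    """Label one backend record by how it relates to the NCES aggregates."""
    if backend_total == nces_grand_total:
        return "exact_match"
    if backend_total == nces_majornum1_total:
        return "majornum1_match"
    return "unexplained"


# (backend, NCES grand total, NCES MAJORNUM=1 subtotal)
records = [
    (1003, 1005, 1003),  # ODU, CIP 52.0101: differs only by second majors
    (824, 824, 824),     # ODU, CIP 11.0101: matches the grand total
    (100, 120, 110),     # hypothetical mismatch, not from the report
]

breakdown = Counter(classify(*record) for record in records)

# The percentages in the table follow from the reported counts.
exact_share = round(106_076 / 121_120 * 100, 1)      # 87.6
majornum1_share = round(15_044 / 121_120 * 100, 1)   # 12.4
```

Checking the grand total first means a record is only counted under the MAJORNUM=1 category when second-major completions actually exist for that program.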
Part 2 — College Scorecard
Five institutional outcome fields were compared against the U.S. Department of Education
College Scorecard "Most Recent Institution-Level Data" file (6,429 institutions, last updated
November 17, 2025). The file was downloaded from
collegescorecard.ed.gov/data.
College Scorecard (6,429 institutions) → Field-level Match (5 fields × UNITID) → 20,879 Verified (0 discrepancies)
Field Mapping & Results
| Backend Field | Scorecard Column | Description | Verified | Status |
| completion_rate | C150_4 / C150_L4 | Graduation rate (150% normal time) | 4,313 | ✓ |
| median_debt | GRAD_DEBT_MDN | Median debt at graduation | 3,824 | ✓ |
| employment_rate | COUNT_WNE_P10 / COUNT_NWNE_P10 | Employment rate, 10yr post-entry | 4,004 | ✓ |
| enrollment | UGDS | Undergraduate enrollment | 4,734 | ✓ |
| median_earnings | MD_EARN_WNE_P10 | Median earnings, 10yr post-entry | 4,004 | ✓ |
| Total data points verified | | | 20,879 | ✓ |
Employment rate methodology: The College Scorecard does not publish a single
"employment rate" field. The backend calculates it from
COUNT_WNE_P10 (working, not enrolled, 10yr post-entry)
and COUNT_NWNE_P10 (not working, not enrolled):
employment_rate = WNE ÷ (WNE + NWNE) × 100. Verified to produce an exact match for all
4,004 institutions where both source values are available.
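The formula can be written as a small function. This is a sketch under stated assumptions: the function name and its handling of missing inputs are illustrative rather than the backend's actual code. The example values come from the Old Dominion University spot-check in this report (6,004 working out of 6,916), which implies a COUNT_NWNE_P10 of 912.

```python
def employment_rate(count_wne_p10, count_nwne_p10):
    """Return WNE / (WNE + NWNE) * 100, or None when it cannot be computed."""
    if count_wne_p10 is None or count_nwne_p10 is None:
        return None  # a source value is missing or suppressed
    denominator = count_wne_p10 + count_nwne_p10
    if denominator == 0:
        return None  # avoid dividing by zero for an empty cohort
    return count_wne_p10 / denominator * 100


# ODU figures from this report: 6,004 working, 6,916 total, so 912 not working.
rate = employment_rate(6004, 6916 - 6004)
print(round(rate, 1))  # 86.8
```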
Sample Verification Detail — Old Dominion University
Both datasets were validated exhaustively by automated comparison. The following is a
spot-check of Old Dominion University (UNITID 232982), also verified manually against
IPEDS Institution Profile
and College Scorecard on February 14, 2026.
Institutional Outcomes (Scorecard)
| Field | Backend | Scorecard Source | Result |
| Graduation Rate | 44.4% | 44.35% (C150_4 = 0.4435) | ✓ |
| Median Debt | $24,000 | $24,000 | ✓ |
| Employment Rate | 86.8% | 86.8% (6,004 ÷ 6,916) | ✓ |
| Enrollment | 17,521 | 17,521 | ✓ |
| Median Earnings (10yr) | $54,914 | $54,914 | ✓ |
Program Completions (IPEDS) — Top 15
| CIP | Program | Backend | NCES M1 | NCES Total | Result |
| 52.0101 | Business/Commerce, General | 1,003 | 1,003 | 1,005 | ✓ |
| 11.0101 | Computer & Info Sciences, General | 824 | 824 | 824 | ✓ |
| 11.0103 | Information Technology | 480 | 480 | 480 | ✓ |
| 42.0101 | Psychology, General | 406 | 406 | 409 | ✓ |
| 11.0802 | Data Modeling/Warehousing | 346 | 346 | 346 | ✓ |
| 26.0101 | Biology/Biological Sciences, General | 329 | 329 | 335 | ✓ |
| 43.0107 | Criminal Justice/Police Science | 328 | 328 | 331 | ✓ |
| 13.1001 | Special Education, General | 285 | 285 | 285 | ✓ |
| 13.0301 | Curriculum and Instruction | 261 | 261 | 261 | ✓ |
| 51.3801 | Registered Nursing | 260 | 260 | 260 | ✓ |
| 22.0101 | Law | 237 | 237 | 237 | ✓ |
| 52.0201 | Business Admin & Management | 223 | 223 | 224 | ✓ |
| 44.0701 | Social Work | 183 | 183 | 183 | ✓ |
| 51.2208 | Community Health & Preventive Med | 157 | 157 | 157 | ✓ |
| 45.0601 | Economics, General | 155 | 155 | 157 | ✓ |
All 180 ODU programs were verified. Backend = Semantic Insight value.
NCES M1 = MAJORNUM=1 subtotal (expected to equal the Backend value).
NCES Total = grand total including double majors.
How to Verify
Any data point in a Semantic Insight institution analysis can be independently verified:
Program completions: Download the Completions dataset from
nces.ed.gov/ipeds/datacenter/DataFiles.aspx
→ Year 2024 → Survey "Completions" → Data file "C2024_A". Open the CSV and filter by
UNITID (shown on the institution page as "Verify on IPEDS") and CIPCODE.
Institutional outcomes (graduation rate, debt, employment, earnings): Download
"Most Recent Institution-Level Data" from
collegescorecard.ed.gov/data.
Filter by UNITID. Graduation rate is in column C150_4, or C150_L4 for less-than-four-year institutions (multiply by 100),
median debt in GRAD_DEBT_MDN, and earnings in MD_EARN_WNE_P10. Employment rate is
COUNT_WNE_P10 ÷ (COUNT_WNE_P10 + COUNT_NWNE_P10) × 100.
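For readers who prefer a script to a spreadsheet, the institutional-outcomes lookup might look something like the sketch below. It uses only the Python standard library. The in-memory CSV is a stand-in for the downloaded file, trimmed to the columns named above; its single row uses the Old Dominion University values from this report, with COUNT_NWNE_P10 derived as 6,916 minus 6,004. The function name `lookup_institution` is an assumption for the example.

```python
import csv
import io

# Stand-in for the downloaded Scorecard file. With a real download, pass an
# open file handle instead, e.g. open(path, newline="", encoding="utf-8").
SAMPLE_CSV = """UNITID,C150_4,GRAD_DEBT_MDN,MD_EARN_WNE_P10,COUNT_WNE_P10,COUNT_NWNE_P10
232982,0.4435,24000,54914,6004,912
"""


def lookup_institution(csv_file, unitid):
    """Return the first row whose UNITID matches, or None if absent."""
    for row in csv.DictReader(csv_file):
        if row["UNITID"] == unitid:
            return row
    return None


row = lookup_institution(io.StringIO(SAMPLE_CSV), "232982")

graduation_rate_pct = float(row["C150_4"]) * 100  # stored as a fraction
median_debt = int(row["GRAD_DEBT_MDN"])
median_earnings = int(row["MD_EARN_WNE_P10"])
wne = int(row["COUNT_WNE_P10"])
nwne = int(row["COUNT_NWNE_P10"])
employment_rate_pct = wne / (wne + nwne) * 100

print(round(employment_rate_pct, 1))  # 86.8
```

UNITID is compared as a string here so that the lookup does not depend on how the identifier is typed in the file.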
Notes & Known Limitations
MAJORNUM=1 filter. The backend counts first-major students only. For
programs where students commonly double-major, the backend total will be slightly lower
than the NCES grand total. This affects 15,044 of 121,120 records (12.4%); the median
difference is 1 completer. This is standard methodology matching the College Scorecard.
Award-level aggregation. Completions are summed across all award levels
(certificates, bachelor's, master's, doctoral) into a single per-program total. The
individual levels present are retained in metadata.
Derived metrics. The roi_ratio field
(3,683 institutions) and employment_rate field (4,004 institutions)
are calculated from verified source values using deterministic formulas, so any error
in them would have to come from the inputs or from the formula itself.
Null/suppressed values. The College Scorecard publishes "NA" or
"PrivacySuppressed" when sample sizes are too small. The backend correctly stores these
as null. Of the 5,141 backend institutions, 150 are system offices or very new branch
campuses not independently present in the Scorecard file.
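The null handling described above could be sketched as a small parsing helper. The function and constant names are hypothetical, and the marker set contains only the two strings mentioned in this report; a real pipeline may need to recognize others.

```python
# Markers named in this report for values the Scorecard does not publish.
SUPPRESSED_MARKERS = {"NA", "PrivacySuppressed"}


def parse_scorecard_value(raw):
    """Return the value as a float, or None for a suppressed or missing marker."""
    text = raw.strip()
    if text in SUPPRESSED_MARKERS:
        return None
    return float(text)


print(parse_scorecard_value("24000"))              # 24000.0
print(parse_scorecard_value("PrivacySuppressed"))  # None
```

Mapping the markers to `None` rather than to zero keeps suppressed values out of any later averages or comparisons.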
Data Currency & Refresh Schedule
| Dataset | Current Version | Publication Cycle | Next Expected Release |
| IPEDS Completions | 2023-24 (C2024_A) | Annual (fall) | ~Fall 2026 (2024-25 data) |
| College Scorecard | Nov 2025 release | ~Twice per year | ~Spring 2026 |
When NCES or the Department of Education publishes updated data, we re-run this validation
process against the new source files before deploying updates.