Ch 4 — AI for Recruiting & Talent

The technical mechanics of recruiting AI — parsing, matching, bias encoding, auditing, and compliance
Regulatory content current as of March 2026 — verify before acting on legal guidance
Under the Hood
[Pipeline diagram: Data → Features → Model → Score → Audit → Comply]
How Resume Parsing Actually Works
NLP tokenization, named entity recognition, and why formatting breaks everything
The Parsing Pipeline
When a resume hits your ATS, the parser runs through a multi-step NLP pipeline:

1. Document conversion: PDF/DOCX is converted to raw text. Tables, columns, headers, and images are often lost or mangled at this step.
2. Tokenization: The text is broken into individual words and phrases (“tokens”). “Senior Software Engineer” might become three tokens or one, depending on the parser.
3. Named Entity Recognition (NER): The model identifies which tokens are names, companies, job titles, dates, skills, degrees, and institutions.
4. Relationship extraction: Which skills belong to which job? Which dates correspond to which employer?
5. Normalization: “Sr.” becomes “Senior.” “MIT” becomes “Massachusetts Institute of Technology.”
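A minimal sketch of the normalization step in Python, assuming a hand-built alias table. Real parsers use far larger dictionaries plus fuzzy matching; the tables and helper names below are illustrative only.

```python
# Sketch of parser normalization: map common aliases to canonical forms.
# The alias tables here are illustrative, not a real parser's dictionaries.
TITLE_ALIASES = {"sr.": "senior", "sr": "senior", "jr.": "junior", "eng": "engineer"}
SCHOOL_ALIASES = {"mit": "Massachusetts Institute of Technology"}

def normalize_title(raw: str) -> str:
    """Lowercase each token and replace known abbreviations."""
    tokens = raw.lower().replace(",", " ").split()
    return " ".join(TITLE_ALIASES.get(t, t) for t in tokens).title()

def normalize_school(raw: str) -> str:
    """Expand known institution abbreviations."""
    return SCHOOL_ALIASES.get(raw.strip().lower(), raw.strip())

print(normalize_title("Sr. Software Eng"))  # -> "Senior Software Engineer"
print(normalize_school("MIT"))              # -> "Massachusetts Institute of Technology"
```

The point of the sketch: normalization is a lookup problem, and any alias missing from the table silently passes through unchanged, which is one source of the field-extraction errors discussed below.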
Why Formatting Matters
Parses well:
• Simple, single-column layout
• Standard section headers (Experience, Education)
• Consistent date formatting (Jan 2022 – Present)
• Plain text or simple PDF

Parses poorly:
• Multi-column layouts (left sidebar + main body)
• Tables used for formatting
• Graphics, icons, or progress bars for skills
• Non-standard section headers ("My Journey")
• Scanned PDFs (image, not text)
• International date formats (27/03/2026)
Non-Standard Resumes
Career changers: NER models expect linear career paths. A teacher-to-developer transition confuses parsers that weight “years in role” heavily.
Employment gaps: Most parsers flag gaps but can’t understand context (caregiving, illness, travel).
International formats: CV styles from India, Germany, or Japan differ dramatically from US norms. Parsers trained on US resumes struggle with these.
Technical reality: Even the best parsers have a 10–15% error rate on field extraction. That means for every 100 resumes, 10–15 have at least one incorrectly parsed field. If that field is used for screening, qualified candidates get eliminated for formatting, not qualifications.
Matching Algorithms Explained
TF-IDF, embeddings, LLMs, and why “managed a team” should match “led direct reports”
Three Generations of Matching
Generation 1: TF-IDF (Term Frequency–Inverse Document Frequency)
Counts how often words appear in a resume vs. how common they are overall. If “Kubernetes” appears in your resume but is rare across all resumes, it gets a high score. Problem: it’s still just word counting. “Managed a team” and “led direct reports” score completely differently.

Generation 2: Embeddings
Words and phrases are converted into numerical vectors (lists of numbers) that capture meaning. “Managed a team” and “led direct reports” produce similar vectors because they mean similar things. This is the breakthrough that made semantic matching possible.

Generation 3: LLM-Based Matching
Large language models understand context, nuance, and inference. They can reason that “reduced customer churn by 15%” is relevant to a “customer success manager” role even if the exact title never appears.
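The Generation 1 limitation is easy to see with a toy bag-of-words overlap, a stand-in for keyword matching (not a full TF-IDF implementation):

```python
def bag_of_words_overlap(a: str, b: str) -> float:
    """Jaccard overlap of word sets -- a stand-in for keyword matching."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

# Same meaning, zero shared words: keyword matching calls them unrelated.
print(bag_of_words_overlap("managed a team", "led direct reports"))  # 0.0

# Shared surface words score high even though the meaning differs.
print(bag_of_words_overlap("managed a team", "managed a budget"))    # 0.5
```

Embeddings and LLM-based matching exist precisely to close this gap: they compare meaning, not surface tokens.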
Cosine Similarity in Plain Language
The core idea: every piece of text is converted into a list of numbers (a "vector"), coordinates in a vast space of meaning. Cosine similarity asks how much two vectors point in the same direction; scores range from 0 (unrelated) to 1 (identical meaning).

Example:
• "managed a team of 12" vs. "led direct reports": cosine similarity ~0.87 (very similar)
• "managed a team of 12" vs. "proficient in Excel": cosine similarity ~0.14 (unrelated)

This is why semantic matching finds candidates that keyword matching misses: the math captures meaning, not just words.
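Cosine similarity itself is a few lines of arithmetic. The three-dimensional vectors below are made up for illustration; real embedding models produce hundreds of dimensions, and the ~0.87/~0.14 figures above are likewise illustrative.

```python
import math

def cosine_similarity(u: list[float], v: list[float]) -> float:
    """dot(u, v) / (|u| * |v|): 1.0 = same direction, 0.0 = orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy "embeddings": similar phrases point in nearly the same direction.
managed_team = [0.9, 0.8, 0.1]
led_reports  = [0.8, 0.9, 0.2]
excel_skill  = [0.1, 0.0, 0.9]

print(round(cosine_similarity(managed_team, led_reports), 2))  # 0.99 (very similar)
print(round(cosine_similarity(managed_team, excel_skill), 2))  # 0.16 (unrelated)
```
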
Vendor question: “Does your matching use keyword matching, embeddings, or LLM-based reasoning?” The answer tells you how sophisticated (and expensive) their AI actually is. Many vendors claim “AI matching” but still use TF-IDF under the hood.
How Bias Gets Encoded
From training data to proxy variables — a concrete walkthrough of bias in the pipeline
The Bias Pipeline
Bias doesn’t enter at one point — it compounds through the entire pipeline. Here’s a concrete example of how it works in a resume screening model:

Step 1 — Training data: You feed the model 5 years of hiring decisions. Your company historically hired 70% men for engineering roles. The model learns: “male-associated features predict hire.”

Step 2 — Feature selection: You remove gender from the data. But the model finds proxy variables: university names (all-women’s colleges), extracurriculars (“women’s rugby”), even name patterns.

Step 3 — Threshold effects: You set the screening cutoff at 75%. Due to biased training, women score 6 points lower on average. The cutoff eliminates 40% of female applicants but only 25% of male applicants.
The Bias Pipeline in Code
Stage 1: Training data. Historical hires: 70% male, 30% female. The model learns that male patterns = "good hire."

Stage 2: Feature selection. Gender, race, and age are removed, but remain present as proxies: university name (gender, class), zip code (race, income), name patterns (ethnicity), employment gaps (gender), extracurriculars (gender, culture).

Stage 3: Model output. Average score for male applicants: 78. Average score for female applicants: 72. Screening threshold: 75.

Stage 4: Result. Male pass rate: 62%. Female pass rate: 38%. Four-fifths test: 38/62 = 61%: FAIL. Disparate impact, even though gender was never an explicit input.
The core problem: Removing protected attributes from the input does not remove bias from the output. Proxy variables carry the same information through different channels. This is why bias audits test outcomes, not inputs.
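The threshold effect from Step 3 can be reproduced with nothing more than a normal-distribution assumption. The means (78 and 72) and cutoff (75) come from the walkthrough above; the standard deviation of 8 is an assumption chosen for illustration.

```python
import math

def pass_rate(mean: float, threshold: float, std: float = 8.0) -> float:
    """P(score >= threshold) for a normally distributed score."""
    z = (threshold - mean) / std
    return 0.5 * math.erfc(z / math.sqrt(2))

male = pass_rate(78, 75)    # ~0.65
female = pass_rate(72, 75)  # ~0.35
print(f"male {male:.2f}, female {female:.2f}, ratio {female / male:.2f}")
# A 6-point average score gap turns into a pass-rate ratio well below 0.80.
```

Note that nothing in this calculation references gender directly; the disparity is produced entirely by the score distributions the model learned.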
Adverse Impact Analysis
The four-fifths rule with real numbers, selection rates, and what triggers an EEOC audit
Calculating Selection Rates
Adverse impact analysis compares selection rates across demographic groups at each stage of the hiring funnel. A selection rate is simply: (number selected) / (number who applied) for each group.

The four-fifths (80%) rule: If any group’s selection rate is less than 80% of the highest group’s rate, there is prima facie evidence of adverse impact. This isn’t automatic proof of discrimination — but it shifts the burden to the employer to demonstrate the selection criterion is job-related and consistent with business necessity.

What triggers an audit: A candidate complaint, a pattern in EEOC charges, an OFCCP compliance review (for federal contractors), or a state-level investigation. NYC Local Law 144 now requires annual bias audits for all automated employment decision tools.
Worked Example
AI screening, 500 applicants:

Group      Applied  Passed  Rate
White      200      120     60%
Black      120      48      40%
Hispanic   90       45      50%
Asian      90       54      60%

Highest rate: 60% (White and Asian). 80% threshold: 48% (60% × 0.80).

Four-fifths test:
• White: 60/60 = 100% (PASS)
• Black: 40/60 = 67% (FAIL)
• Hispanic: 50/60 = 83% (PASS)
• Asian: 60/60 = 100% (PASS)

Adverse impact detected for Black applicants. This analysis must be run at every stage where AI filters candidates: screening, assessment, interview selection, and offer.
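The four-fifths computation in the worked example takes only a few lines; the counts below are the ones from the table.

```python
def four_fifths_test(groups: dict[str, tuple[int, int]]) -> dict[str, tuple[float, bool]]:
    """groups maps name -> (applied, passed).
    Returns each group's impact ratio vs. the highest rate, and pass/fail at 0.8."""
    rates = {g: passed / applied for g, (applied, passed) in groups.items()}
    top = max(rates.values())
    return {g: (rate / top, rate / top >= 0.8) for g, rate in rates.items()}

applicants = {
    "White": (200, 120), "Black": (120, 48),
    "Hispanic": (90, 45), "Asian": (90, 54),
}
for group, (ratio, ok) in four_fifths_test(applicants).items():
    print(f"{group:9s} {ratio:.0%} {'PASS' if ok else 'FAIL'}")
# Black applicants: 40% / 60% = 67% -- FAIL (adverse impact)
```
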
EEOC enforcement: The iTutorGroup case ($365K, 2023) targeted age-based algorithmic rejection. Though the EEOC’s 2023 AI guidance was pulled from its website, enforcement continues under Title VII and ADEA — the underlying statutes have not changed. The Eightfold AI class action (Jan 2026) adds a new front: alleged secret applicant scoring dossiers may violate FCRA. The law doesn’t care whether a human or an algorithm made the discriminatory decision.
Explainability Requirements
GDPR Article 22, NYC Local 144, and what “explainable AI” actually means in practice
Can the AI Explain Why?
When an AI rejects a candidate, can it explain the reasoning? This isn’t just an ethical question — it’s increasingly a legal requirement.

GDPR Article 22: EU residents have the right not to be subject to decisions based solely on automated processing that produce legal or similarly significant effects; employment decisions qualify. When such decisions are made, individuals have the right to obtain meaningful information about the logic involved and to contest the decision.

NYC Local Law 144: Employers using automated employment decision tools (AEDTs) in New York City must conduct annual bias audits by an independent auditor, publish audit results on their website, and notify candidates that an AEDT is being used.

In practice: “Explainable AI” means the system can identify which factors contributed to a score — not just that the candidate scored 72, but why: missing skill X, insufficient years in role Y, low semantic match on competency Z.
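As a sketch, a factor-level explanation can be represented as an itemized set of score contributions. The factor names, weights, and the baseline of 70 are hypothetical, chosen so the contributions sum to the 42-point example score used in this section.

```python
# Hypothetical factor-level explanation for one candidate score.
# A defensible tool exposes each contribution, not just the total.
BASELINE = 70  # assumed starting score before factor adjustments

explanation = {
    "missing_required_skills": -22,  # 3 of 5 required skills absent
    "experience_shortfall":    -14,  # 2 years vs. 5 required
    "education_match":          +8,  # strong degree/field match
}

score = BASELINE + sum(explanation.values())
print(f"Score: {score}")  # Score: 42
for factor, delta in sorted(explanation.items(), key=lambda kv: kv[1]):
    print(f"  {factor}: {delta:+d}")
```

Because every factor carries an explicit, signed contribution, an auditor or a rejected candidate can see exactly which gaps drove the score, which is what makes the decision reviewable.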
Explainability Spectrum
Black box (not explainable): "The candidate scored 42 out of 100." No insight into why. Cannot be audited; cannot be defended in a legal challenge.

Partial explainability: "Top factors: skills match (low), experience years (medium), education match (high)." Better: it shows which categories matter, but still not how they interact.

Full explainability: "Score: 42. Key factors: missing 3 of 5 required skills (−22); 2 years of experience vs. 5 required (−14); strong education match (+8); semantic match on leadership: 0.83. Confidence: 78%. Similar candidates who were hired: 12% at this score." This is what you should demand: auditable, defensible, actionable.
Operational standard: If your AI screening tool cannot produce a factor-level explanation for every score, it cannot be audited for bias, defended in litigation, or meaningfully reviewed by a human. Full explainability is not a nice-to-have — it is a compliance requirement in a growing number of jurisdictions.
De-biasing Techniques
What works, what doesn’t, and what’s just marketing
Techniques That Have Evidence
Removing protected attributes: Necessary but not sufficient. As we showed in Step 3, proxy variables carry the same signal. It’s a starting point, not a solution.

Adversarial de-biasing: Training a second model whose job is to detect whether the primary model’s outputs correlate with protected attributes. If the “adversary” can guess the candidate’s demographic group from the score, the primary model is biased and gets penalized. This has real research backing but requires significant technical investment.

Calibration across groups: Adjusting score thresholds so that each demographic group has equivalent selection rates. Effective for outcome fairness but controversial — some argue it constitutes differential treatment.

Blind resume review: Removing names, addresses, universities, and dates before AI screening. Reduces some proxy bias but doesn’t eliminate it (writing style, vocabulary, and career patterns still carry signal).
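Blind review can be partially automated with pattern matching, though names and universities require NER. Below is a sketch that only redacts emails, phone numbers, and four-digit years; the patterns and replacement tokens are illustrative, not production-grade.

```python
import re

# Sketch of pre-screening redaction. Names and universities need NER;
# this only strips fields that a regex can find reliably.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\(?\+?\d[\d\s().-]{7,}\d"), "[PHONE]"),
    (re.compile(r"\b(19|20)\d{2}\b"), "[YEAR]"),
]

def redact(text: str) -> str:
    """Replace each matched pattern with its placeholder token."""
    for pattern, token in PATTERNS:
        text = pattern.sub(token, text)
    return text

print(redact("jane.doe@example.com, (555) 123-4567, graduated 2014"))
# -> "[EMAIL], [PHONE], graduated [YEAR]"
```

As the section notes, even a perfect redactor leaves signal behind: writing style, vocabulary, and career patterns still correlate with demographics.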
What Works vs. Marketing
Mostly Marketing
• “Our AI is bias-free.” No AI is bias-free; this is a red flag, not a selling point.
• “We removed race and gender from the data.” Necessary but not sufficient.
• “Our diverse training data ensures fairness.” Diverse data helps but doesn’t guarantee fair outcomes.
Demand audit results, not claims.
Backed by Evidence
• “We conduct quarterly adverse impact analyses.”
• “We use adversarial de-biasing and can share the methodology.”
• “We publish our bias audit results and model card.”
• “We support blind screening and configurable feature exclusion.”
These are verifiable, auditable claims.
The honest truth: No de-biasing technique eliminates all bias. The goal is monitoring, measurement, and continuous improvement — the same approach you use for any operational risk. Vendors who promise zero bias are either naive or dishonest.
Audit Frameworks
How to structure a bias audit: methodology, independence, cadence, and intersectionality
Audit Structure
1. Data analysis: Collect demographic data on all candidates processed by the AI at each funnel stage. Calculate selection rates by group.

2. Disparate impact testing: Apply the four-fifths rule at each stage. Identify which groups are disproportionately affected and at which pipeline stages.

3. Intersectional analysis: Test not just single categories (race, gender) but intersections (Black women, older Hispanic men). Bias often hides at intersections that single-axis analysis misses.

4. Feature importance review: Examine which input features most influence the model’s scores. Flag any feature that correlates strongly with a protected attribute.

5. Ongoing monitoring: Set up dashboards that track selection rates in real time, with alerts when any group’s rate drops below the four-fifths threshold.
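Steps 2, 3, and 5 combine naturally: run the four-fifths check over intersectional groups and alert on any failure. The group labels and counts below are made up, chosen so that one intersectional cell fails while every single-axis comparison passes.

```python
def intersectional_alerts(counts: dict[tuple[str, str], tuple[int, int]]) -> list[tuple[str, str]]:
    """counts maps (race, gender) -> (applied, passed).
    Returns every intersectional group below 80% of the top selection rate."""
    rates = {g: passed / applied for g, (applied, passed) in counts.items()}
    top = max(rates.values())
    return [g for g, rate in rates.items() if rate / top < 0.8]

monitoring = {
    ("White", "Men"):   (100, 60),  # 60% selection rate
    ("White", "Women"): (100, 55),  # 55%
    ("Black", "Men"):   (80, 44),   # 55%
    ("Black", "Women"): (80, 32),   # 40% -- hidden at the single-axis level
}
print(intersectional_alerts(monitoring))  # -> [('Black', 'Women')]
```

With these numbers, Black applicants overall (76/160 = 47.5%) and women overall (87/180 ≈ 48.3%) both clear the 80% ratio against their comparison groups; only the intersection fails, which is exactly the pattern single-axis analysis misses.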
Who, When, and How
Who should audit: an independent third party, not the vendor and not your internal team. Look for I/O psychologists, AI fairness consultants, employment law firms with AI expertise, or firms compliant with NYC LL144 standards.

Cadence:
• Annual: full comprehensive audit
• Quarterly: selection rate monitoring
• Continuous: real-time dashboard alerts
• Trigger-based: after model updates, new features, or complaints

Documentation requirements:
[ ] Audit methodology described
[ ] Data sources identified
[ ] Statistical methods documented
[ ] Results by group reported
[ ] Intersectional analysis included
[ ] Remediation actions listed
[ ] Auditor independence attested
NYC LL144 standard: The law requires an “independent auditor” but does not define qualifications. Best practice: use a firm with I/O psychology expertise and demonstrated experience in EEOC-style adverse impact analyses. The audit must be published on your website — it will be scrutinized.
The Compliance Stack
Federal, state, local, and international laws governing AI in recruiting
Federal Framework
EEOC (Title VII, ADA, ADEA): Existing anti-discrimination laws apply to AI decisions. The EEOC’s 2023 guidance on AI in hiring was removed from its website following the rescission of the Biden-era AI executive order — but this does not reduce employer liability. Title VII, ADA, and ADEA still apply in full. If AI produces disparate impact, the employer is liable regardless of whether federal guidance documents exist. The four-fifths rule applies regardless of whether a human or algorithm makes the selection.

OFCCP: Federal contractors face additional scrutiny. OFCCP requires affirmative action plans and can audit your selection procedures, including AI tools, at any time. If your AI screening creates adverse impact, you must demonstrate job-relatedness and business necessity.

Key principle: The employer is responsible for the AI’s decisions, even if the vendor built the tool. “The vendor told us it was fair” is not a legal defense. The Eightfold AI class action (January 2026) alleges the platform generated secret applicant scoring dossiers without disclosure, potentially violating the Fair Credit Reporting Act and California consumer reporting laws — a landmark case for AI hiring liability.
Compliance Checklist by Jurisdiction
Federal (all US employers):
[ ] Title VII / ADA / ADEA compliance (the EEOC's 2023 AI guidance was pulled, but the underlying law still applies)
[ ] OFCCP compliance (if a federal contractor)
[ ] FCRA compliance for AI scoring tools (see the Eightfold AI class action, Jan 2026)

State laws:
[ ] California CRD regulations (Oct 2025): bias testing, human oversight, 4-year record retention
[ ] Colorado AI Act / SB 24-205 (June 2026): annual impact assessments, transparency
[ ] Texas TRAIGA / HB 149 (Jan 2026): no AI use with intent to discriminate (rejects disparate impact as a standalone standard)
[ ] Illinois AI Video Interview Act: consent for AI analysis of video interviews
[ ] Maryland: facial recognition ban

Local laws:
[ ] NYC Local Law 144: annual bias audit, published results, and candidate notice

International:
[ ] EU AI Act: employment AI classified as "high-risk"; requires conformity assessment, transparency, and human oversight
[ ] GDPR Art. 22: right not to be subject to solely automated decisions

Coming soon:
[ ] Federal AI legislation expected late 2026 / early 2027 to harmonize the state patchwork
The landscape is moving fast. The rescission of the Biden-era AI executive order and removal of EEOC guidance has not reduced liability — Title VII, ADA, and ADEA enforcement continues unchanged. Meanwhile, states like California (CRD regulations, Oct 2025), Colorado (AI Act, June 2026), and Texas (HB 149, Jan 2026) are filling the federal gap. Federal AI legislation is expected by late 2026 or early 2027 to harmonize the state patchwork. Build your AI recruiting governance to the highest standard you face.
Sources & Further Reading
• EEOC, “Select Issues: Assessing Adverse Impact in Software, Algorithms, and AI” (guidance removed 2025; underlying Title VII framework unchanged)
• NYC Local Law 144 (2023) — Automated Employment Decision Tools
• California Civil Rights Department, Regulations on Automated Decision Systems (effective Oct 2025)
• Colorado SB 24-205, AI Consumer Protections Act (effective June 2026)
• Texas HB 149, Texas Responsible AI Governance Act (effective Jan 2026)
• Illinois Artificial Intelligence Video Interview Act (2020)
• Eightfold AI class action filing (Jan 2026) — FCRA and California consumer reporting claims
• EU Artificial Intelligence Act, Regulation 2024/1689
