Ch 2 — AI in the HR Tech Stack

How AI systems actually work inside your HR platforms — data flows, model types, and integration patterns
Under the Hood

Data Flow → Model Types → Integration → Vendor Lock-in → Data Privacy → Architecture
How Your Data Flows Between HR Systems
Where AI actually sits in the data pipeline from HRIS to ATS to payroll to benefits
The System of Record Problem
Your HR tech stack isn’t one system — it’s a constellation. Your HRIS (Workday, UKG, BambooHR) holds the employee master record. Your ATS (Greenhouse, Lever, iCIMS) holds candidate data. Your payroll system holds compensation history. Your benefits platform holds enrollment and claims data. Your LMS holds training records.

Data flows between these systems through integrations — file feeds, APIs, middleware like Workato or MuleSoft. The critical question: when an AI feature processes data, which system is the source of truth, and where does the AI output go? If your ATS AI scores a candidate, does that score feed back into the HRIS when they’re hired? Does the HRIS AI use that score for future decisions? These connections create invisible dependencies.
Data Flow Architecture
HRIS (Source of Truth)
// Employee master data: name, role, dept,
// manager, hire date, status, comp history
  ↓——feeds——→ Payroll System       // Comp, tax, deductions
  ↓——feeds——→ Benefits Platform    // Eligibility, enrollment
  ↓——feeds——→ LMS / Performance    // Training, reviews, goals

ATS (Candidate Pipeline)
// Resumes, scores, interview notes
  ↓——on hire—→ HRIS                // Candidate becomes employee record

AI TOUCHPOINTS:
  ATS:      resume screening, candidate ranking
  HRIS:     attrition prediction, org analytics
  Benefits: recommendation engines
  LMS:      personalized learning paths
  Payroll:  anomaly detection
Ops insight: Map every AI touchpoint in your stack. If three different vendors each have AI features that use employee data, you effectively have three separate AI systems making inferences about your people — each with its own data quality issues, bias risks, and privacy implications. Nobody is coordinating between them unless you are.
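The invisible-dependency problem above can be made concrete. This is a minimal sketch, not a real integration map: the feed relationships and feature names are illustrative, and it simply walks the feed graph to show which downstream AI features can end up consuming data that originated in one system.

```python
# Hypothetical sketch: modeling AI touchpoints as a graph to surface
# cross-system dependencies. System and feature names are illustrative.

FEEDS = {               # source system -> systems it feeds
    "ATS": ["HRIS"],    # candidate record becomes employee record on hire
    "HRIS": ["Payroll", "Benefits", "LMS"],
}

AI_TOUCHPOINTS = {      # system -> AI features active in it
    "ATS": ["resume screening", "candidate ranking"],
    "HRIS": ["attrition prediction"],
    "Benefits": ["plan recommendations"],
}

def downstream_ai(system, feeds=FEEDS, touchpoints=AI_TOUCHPOINTS):
    """Return AI features that may consume data originating in `system`."""
    seen, stack, features = set(), [system], []
    while stack:
        current = stack.pop()
        for target in feeds.get(current, []):
            if target not in seen:
                seen.add(target)
                features.extend(touchpoints.get(target, []))
                stack.append(target)
    return features

# An ATS AI score can propagate into HRIS and Benefits AI features:
print(downstream_ai("ATS"))
```

Even with three systems, a candidate's ATS score can influence two other vendors' AI inferences — which is exactly why the map needs a single owner.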
What “Model Types” Your Vendors Use
The four model categories that power HR AI features and where each one shows up
Classification Models (Yes/No Decisions)
A classification model sorts things into categories. In HR, that means: qualified vs. not qualified (resume screening), high risk vs. low risk (flight risk flags), compliant vs. non-compliant (policy violation detection). Think of it as a very fast, very consistent rules-based sorter — except it learned the rules from data rather than having them written by a human. The danger: it may have learned rules you wouldn’t endorse, such as “candidates from certain schools are always qualified.”
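The "learned rules you wouldn't endorse" danger can be shown with a toy weighted classifier. The weights and threshold below are entirely made up, not any vendor's model; the point is that a proxy feature like school tier can qualify a candidate all by itself.

```python
# Toy sketch of a learned classification rule (illustrative weights, not
# a real vendor model): one feature the model "learned" is the candidate's
# school -- a proxy rule a human reviewer would likely never endorse.

LEARNED_WEIGHTS = {           # hypothetical weights from training data
    "years_experience": 0.3,
    "skills_match": 0.5,
    "school_tier": 0.8,       # learned proxy: can dominate the decision
}
THRESHOLD = 1.0

def classify(candidate):
    """Return 'qualified' / 'not qualified' from a weighted feature sum."""
    score = sum(LEARNED_WEIGHTS[f] * candidate.get(f, 0) for f in LEARNED_WEIGHTS)
    return "qualified" if score >= THRESHOLD else "not qualified"

strong_candidate = {"years_experience": 2, "skills_match": 1, "school_tier": 0}
weak_but_right_school = {"years_experience": 0, "skills_match": 0, "school_tier": 2}

print(classify(strong_candidate))        # qualified on merit (1.1)
print(classify(weak_but_right_school))   # qualified on school alone (1.6)
```

Both candidates pass the threshold, but for very different reasons — and only an audit of the weights, not the outputs, would reveal the difference.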
Regression Models (Predicting Scores)
Regression models output a number on a scale rather than a yes/no. In HR: attrition risk score (0–100%), engagement score prediction, time-to-fill estimates, compensation benchmarking. These are the models behind dashboards that show you risk heat maps and predicted outcomes. The key question: what’s the confidence interval? A model that says “72% attrition risk” but is only accurate plus or minus 30% is essentially guessing.
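The confidence-interval point is easy to demonstrate with arithmetic. A sketch, using the numbers from the paragraph above:

```python
# Why the interval matters more than the point estimate: a "72% attrition
# risk" with a ±30-point margin spans nearly the whole scale.

def risk_interval(point_estimate, margin):
    """Clamp a prediction interval to the 0-100% scale."""
    low = max(0.0, point_estimate - margin)
    high = min(100.0, point_estimate + margin)
    return low, high

low, high = risk_interval(72.0, 30.0)
print(f"Predicted attrition risk: 72% (range {low:.0f}%-{high:.0f}%)")
# A 42%-100% range covers most of the workforce -- close to guessing.
```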
NLP Models (Text Analysis)
Natural Language Processing models analyze text. In HR: survey sentiment analysis (is this comment positive or negative?), resume parsing (extracting skills and experience from unstructured text), chatbot understanding (what is the employee asking?), job description optimization (flagging biased language). These range from simple keyword matchers to full LLMs. Ask your vendor: is this a keyword model or a language model? The difference in capability and cost is enormous.
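The keyword-model end of that range fits in a few lines, which is itself instructive. The word lists below are illustrative; the last example shows the class of error (negation) that separates a keyword matcher from a language model.

```python
# Minimal keyword-based sentiment matcher -- the simplest kind of "NLP"
# a vendor might ship. Note what it misses: negation, sarcasm, context.

POSITIVE = {"great", "supportive", "love", "excellent"}
NEGATIVE = {"toxic", "burnout", "unfair", "overworked"}

def keyword_sentiment(comment):
    words = set(comment.lower().replace(".", "").split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(keyword_sentiment("My manager is supportive and great"))   # positive
print(keyword_sentiment("Burnout is constant and unfair"))       # negative
print(keyword_sentiment("I do not love this team"))              # misread as positive
```

If a vendor's "sentiment analysis" behaves like this on negated comments, you have your answer to the keyword-vs-language-model question.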
Recommendation Engines
Benefits recommendation: “Based on your profile, consider the HSA plan.” Learning paths: “Employees in similar roles took these courses.” Job matching: “Internal openings that match your skills.” These work like Netflix suggestions — they find patterns in what similar people chose and predict what you’d choose. The risk: if similar people historically made suboptimal choices (e.g., under-enrolling in benefits), the AI perpetuates that pattern.
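The perpetuation risk is visible even in the simplest form of this technique. A sketch with made-up enrollment history — a "most common choice among similar employees" recommender, which is a deliberately stripped-down stand-in for real collaborative filtering:

```python
# Similarity-based recommendation in the spirit of "employees in similar
# roles chose X". Data is invented. The risk is in plain view: if a group
# historically under-enrolled, the engine recommends under-enrollment.

HISTORY = {   # role -> benefit plans employees in that role chose
    "analyst":  ["PPO", "HSA", "HSA"],
    "engineer": ["HSA", "HSA", "PPO"],
    "support":  ["basic", "basic", "basic"],  # historically under-enrolled
}

def recommend(role):
    """Recommend the most common past choice for the role."""
    choices = HISTORY.get(role, [])
    if not choices:
        return None
    return max(set(choices), key=choices.count)

print(recommend("engineer"))  # HSA -- the popular choice among peers
print(recommend("support"))   # basic -- perpetuates the historical pattern
```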
Vendor question: “What type of model powers this feature?” Many vendors say “AI” when they mean “rules engine with a few statistical models.” Knowing the model type tells you what questions to ask about accuracy, bias, and data requirements. A classification model needs different oversight than a recommendation engine.
API vs. Embedded AI
Why the architecture of how AI is connected matters for privacy, cost, and control
Two Architectures
When your vendor says their product “uses AI,” the AI can live in two very different places. Embedded AI means the AI model is built into the vendor’s own platform — they trained it, they host it, it runs inside their infrastructure. API-based AI means the vendor calls out to an external AI service (like OpenAI, Anthropic, or Google) every time the feature is used. Your data leaves the vendor’s environment, travels to the AI provider, gets processed, and the result comes back.

Think of it like catering: embedded AI is an in-house kitchen. API-based AI is ordering from an external restaurant. Both can serve food, but the supply chain, quality control, and liability are completely different.
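The two architectures can be contrasted as call patterns. The class and provider names below are hypothetical — no real vendor API is implied; the only point is where the employee record travels.

```python
# Sketch contrasting embedded vs. API-based AI as call patterns.
# Names are illustrative, not a real vendor interface.

class EmbeddedAI:
    """Model runs inside the vendor's own environment."""
    def score(self, employee_record):
        # data never leaves the vendor platform
        return {"score": 0.7, "data_left_vendor": False}

class APIBasedAI:
    """Vendor forwards data to an external AI provider per request."""
    def __init__(self, provider):
        self.provider = provider  # e.g., a third-party LLM service
    def score(self, employee_record):
        # employee_record is transmitted to a sub-processor here
        return {"score": 0.7, "data_left_vendor": True,
                "sub_processor": self.provider}

embedded = EmbeddedAI().score({"name": "A. Employee"})
api_based = APIBasedAI("ExternalLLMCo").score({"name": "A. Employee"})
print(embedded["data_left_vendor"], api_based["data_left_vendor"])
```

Same score, same dashboard — but in the second pattern a sub-processor now exists, and your DPA either covers it or it doesn't.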
Comparison
Embedded AI
Data stays within vendor’s environment. You have one DPA to manage. Vendor controls the model, training data, and updates. Performance is consistent. But: you can’t inspect or swap the model. If their AI is biased, your only option is to lobby for a fix or leave.
API-Based AI
Data travels to a third party. You now have a sub-processor chain. The external AI provider may change their model without notice. Your data may be used for training unless explicitly opted out. But: the vendor can swap AI providers, and the external models are often more capable.
Critical question for every vendor: “Does your AI feature send employee data to any third-party AI service? If so, which one, under what DPA, and is our data used for model training?” Many vendors quietly started piping data to OpenAI in 2023–2024. If your DPA doesn’t cover this sub-processor, you may have an undisclosed data sharing arrangement involving your employees’ PII.
Where Your Employee Data Goes
Tracing the path of a single employee record through AI processing
The Journey of a Data Point
When an AI feature processes an employee’s performance review, here’s what can happen. The text of the review leaves your HRIS. It goes to the vendor’s AI processing layer. If the vendor uses API-based AI, it travels to a third-party AI provider. The AI provider may store it temporarily for processing, or may retain it for model improvement unless you’ve opted out. The AI output (a sentiment score, a summary, a flag) returns to the vendor and appears in your dashboard.

At each hop, ask: Who can see this data? Where is it stored? How long is it retained? Is it used beyond this specific request? The answers often surprise even seasoned IT teams.
Training vs. Inference
Training means your data is used to improve the AI model itself — it becomes part of the model’s learned patterns. This is a much bigger deal than inference. Inference means the AI processes your data to produce a result but doesn’t learn from it permanently. Most enterprise AI vendors promise inference-only, but verify this in writing. If your 10,000 employees’ performance reviews train a vendor’s model, that model now carries patterns from your workforce into every other customer’s predictions.
Data Audit Checklist
FOR EACH AI FEATURE, DOCUMENT:

1. What data enters the AI?
   // Names? Salaries? Reviews? Demographics?
2. Where is it processed?
   // On-premise? Vendor cloud? Third-party?
   // Which region / data center?
3. Who are the sub-processors?
   // AWS? Azure? OpenAI? Google?
   // Do they have their own sub-processors?
4. Is data used for model training?
   // Get this in writing. "No" should be in
   // the DPA, not just a sales deck
5. How long is data retained?
   // Processing-only? 30 days? Indefinitely?
6. Can an employee request deletion?
   // GDPR Article 17 / CCPA right to delete
   // Can the vendor actually purge from the AI?
7. What happens to data if you churn?
   // Deleted? Retained? Used for training
   // other customers' models?
Non-negotiable: Run this checklist for every vendor in your stack that has AI features. If any vendor cannot answer question 4 with a clear “no, we do not use customer data for model training” backed by contractual language, escalate immediately to legal and procurement. This is the single biggest undisclosed risk in HR tech right now.
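Teams running this checklist at scale often keep it as structured data rather than a document. A minimal sketch, with illustrative field names and values — the one rule it automates is the escalation on the model-training question:

```python
# The audit checklist as a record with an automatic escalation check.
# Field names loosely mirror the checklist; values are invented.

REQUIRED_NO = "training_use"   # question 4: must be a contractual "no"

def audit_findings(feature):
    """Return the list of findings for one AI feature's audit record."""
    findings = []
    for field, value in feature.items():
        if value in (None, "", "unknown"):
            findings.append(f"unanswered: {field}")
    if feature.get(REQUIRED_NO) != "no (in DPA)":
        findings.append("ESCALATE: no contractual guarantee against model training")
    return findings

resume_screener = {
    "input_data": "resumes, names",
    "processing_location": "vendor cloud (us-east)",
    "sub_processors": "unknown",           # finding
    "training_use": "sales deck says no",  # not contractual -> escalate
    "retention": "30 days",
}
print(audit_findings(resume_screener))
```

A "no" that lives only in a sales deck fails the check by design — exactly the standard the callout above sets.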
The Vendor Lock-in Problem
How AI features increase switching costs and what to do about it
Why AI Makes Lock-in Worse
Traditional vendor lock-in is about data and workflows: your employee records live there, your processes depend on their UI. AI lock-in adds new layers:

Model dependency: The AI learned patterns from your data. If your vendor’s attrition model was tuned on 5 years of your workforce data, that tuning doesn’t transfer to a new vendor. You start from scratch.

Workflow embedding: Your approval chains, escalation paths, and exception processes now depend on AI predictions. “Flag employees above 80% attrition risk for a stay interview” — that workflow is meaningless without the model behind it.

Historical baseline: Your trend data, benchmark comparisons, and improvement metrics are all tied to that vendor’s specific model. A new vendor’s model will produce different scores for the same employees, breaking your historical continuity.
Mitigation Strategies
1. Own your data extracts: Negotiate contractual rights to export not just raw data but AI-generated scores, labels, and predictions in a standard format. If the AI tagged employees as “high flight risk,” you need those tags exportable.

2. Document the logic, not just the output: Require vendors to explain the key factors in their models. If their attrition model weighs “time since last promotion” heavily, you can rebuild that logic elsewhere.

3. Avoid proprietary metrics: If a vendor invents a “Talent Score” that only exists in their ecosystem, you’re locked in. Push for models that output industry-standard measures.

4. Build parallel processes: For critical decisions, maintain a human-driven backup process that doesn’t depend on any vendor’s AI. If the AI disappeared tomorrow, could you still run your operation?
Contract leverage: The time to negotiate data portability and AI transparency is before you sign, not after you’re embedded. Add explicit clauses about AI model documentation, score exportability, and data return upon termination. Your procurement team may not think to ask — this is where HR Operations adds value at the contracting stage.
Build vs. Buy vs. Configure
A decision framework for how to add AI capabilities to your HR stack
Three Paths to AI in HR
Buy (Vendor Built-in): Use the AI features already in your existing platforms. Workday’s skills intelligence, Greenhouse’s resume scoring, UKG’s predictive analytics. Lowest effort, least control. The AI is a black box — you use it or you don’t.

Buy (Specialized Add-on): Purchase a dedicated AI product that integrates with your stack. Tools like Eightfold for talent intelligence, Textio for job posting optimization, or Paradox for conversational AI. More capability, more vendors to manage, more integration points.

Configure (DIY): Set up an AI tool yourself for specific HR use cases. A GPT-based assistant for HR policy Q&A, a workflow automation that uses AI for ticket routing, or a custom analytics dashboard with predictive elements. Most control, most effort, requires some technical capability.
Decision Framework
USE VENDOR BUILT-IN WHEN:
• The feature is low-risk (scheduling, FAQ)
• You lack internal technical resources
• The vendor’s model is audited and documented
• You’re OK with limited customization

BUY A SPECIALIZED ADD-ON WHEN:
• The use case is critical (hiring, retention)
• Your main vendor’s AI is weak in this area
• The add-on vendor specializes in HR AI
• You need audit trails and bias reporting

CONFIGURE YOUR OWN WHEN:
• The use case is internal and low-stakes
• You need domain-specific knowledge
• You want full control over data handling
• You have someone who can maintain it

NEVER DO ANY OF THESE WITHOUT:
// A data privacy review
// A clear owner for ongoing governance
// A rollback plan if it goes wrong
The hidden option: “Wait.” Not every process needs AI right now. If your data quality is poor, your processes aren’t standardized, or you don’t have governance in place, deploying AI will amplify those problems, not fix them. Sometimes the right answer is to fix the foundation first.
Data Privacy Architecture
GDPR, CCPA, and state privacy laws applied specifically to AI in HR operations
The Legal Landscape for HR AI
Privacy regulations weren’t written for AI, but they apply directly. GDPR Article 22 gives individuals the right not to be subject to purely automated decision-making, including profiling, that significantly affects them. Employment decisions absolutely qualify. CCPA/CPRA gives California employees rights over their personal information, including the right to know what’s collected and how it’s used. State-level AI laws are accelerating — Illinois BIPA covers biometric data in video interviews, NYC Local Law 144 requires bias audits for automated employment decisions, and Colorado’s AI Act imposes obligations on “high-risk” AI systems, which include hiring tools.

The practical reality: if you have employees or candidates in multiple states or countries, you need a privacy architecture that meets the strictest applicable standard, not just your headquarters’ local law.
What Your DPA Should Say About AI
DATA PROCESSING AGREEMENT — AI CLAUSES

1. Purpose Limitation
   // Data processed by AI features shall be
   // used ONLY for the stated service purpose.
   // NOT for model training, benchmarking, or
   // improvement of services to other customers.
2. Sub-processor Disclosure
   // All AI sub-processors (OpenAI, Anthropic,
   // AWS Bedrock, etc.) must be listed by name.
   // 30-day notice before adding new ones.
3. Data Residency
   // AI processing shall occur within [region].
   // No cross-border transfers without consent.
4. Automated Decision Transparency
   // Vendor shall provide, upon request, the
   // logic, key factors, and confidence levels
   // for any AI-driven output used in employment
   // decisions.
5. Right to Opt Out
   // Customer may disable AI features for
   // specific employee populations without
   // degrading core service functionality.
6. Audit Rights
   // Customer or third-party auditor may
   // inspect AI bias testing results, data
   // flows, and retention practices annually.
Governance reality: Most existing DPAs were signed before vendors added AI features. If your Workday or ATS contract pre-dates 2023, it almost certainly does not address AI sub-processors, model training data use, or automated decision transparency. Schedule a contract review with legal and procurement — this is urgent, not a nice-to-have.
Building Your System Map
A practical exercise: documenting every AI touchpoint in your HR tech stack
Why You Need This Map
Right now, AI features are being activated across your HR tech stack — sometimes by vendors pushing updates, sometimes by well-meaning team members, sometimes by IT responding to executive requests. Without a comprehensive system map, you have no single view of where AI touches employee data, no way to assess cumulative risk, and no way to respond to an employee who asks “how is my data being used?”

This exercise creates that map. It should be owned by HR Operations and updated quarterly. It becomes your foundation for every AI governance decision, every vendor review, and every compliance audit. Think of it as your AI asset inventory — you can’t govern what you can’t see.
How to Complete This
Step 1: List every system in your HR tech stack (HRIS, ATS, payroll, benefits, LMS, engagement, analytics).
Step 2: For each system, identify every feature marketed as AI, ML, “intelligent,” “smart,” or “predictive.”
Step 3: Fill in the template for each feature.
Step 4: Draw the data flow arrows between systems.
Step 5: Flag any feature where you can’t complete the template — those are your immediate action items.
System Map Template
AI TOUCHPOINT INVENTORY
================================
System:         [e.g., Workday]
Vendor:         [e.g., Workday Inc.]
AI Feature:     [e.g., Skills Cloud Intelligence]
Model Type:     [Classification / Regression / NLP / Rec]
Architecture:   [Embedded / API-based / Unknown]
Sub-processors: [List all third-party AI providers]

DATA FLOW
Input data:     [What employee data enters the AI?]
Data source:    [Which system is the source of truth?]
Output:         [What does the AI produce?]
Output used by: [Which systems/processes consume it?]

PRIVACY & GOVERNANCE
DPA covers AI?    [Yes / No / Needs review]
Training opt-out: [Confirmed in writing? Y/N]
Data residency:   [Where is AI processing hosted?]
Bias audit:       [Done? Date? By whom?]
Human review:     [Who reviews AI output before action?]

RISK ASSESSMENT
Decision impact:  [Low / Medium / High]
Lock-in risk:     [Low / Medium / High]
Data sensitivity: [PII / Sensitive / Regulated]
Rollback plan:    [Can you disable without service loss?]
Owner:            [Name and role of accountable person]

// Complete one of these for EVERY AI feature
// in EVERY system in your stack.
// If you can’t fill a field, that’s a finding.
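If your team keeps the inventory in a spreadsheet or script rather than a document, the "can't fill a field, that's a finding" rule can be automated. A sketch with an abridged, illustrative field set:

```python
# The inventory template as a structured record. Every empty field is a
# finding, per the template's own rule. Field set is abridged.

from dataclasses import dataclass, fields

@dataclass
class AITouchpoint:
    system: str = ""
    ai_feature: str = ""
    model_type: str = ""
    architecture: str = ""
    sub_processors: str = ""
    dpa_covers_ai: str = ""
    owner: str = ""

    def unfilled(self):
        """List every field still needing an answer -- each is a finding."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

entry = AITouchpoint(system="Workday", ai_feature="Skills Cloud Intelligence",
                     model_type="NLP")
print(entry.unfilled())  # the immediate action items for this feature
```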
Next up: Chapter 3 dives into the specific HR processes where AI is deployed today — recruiting, onboarding, performance management, compensation, and offboarding — with concrete examples of what’s working, what’s failing, and what questions to ask for each one.