
The AI Talent Stack in 2026: Where the Market Stands

2026-05-21 Antti Virtanen

When we started Sammalkko in 2021, the phrase "AI HR tech" was doing a lot of heavy lifting. It meant anything from a chatbot answering leave policy questions to a full workforce intelligence platform predicting attrition with 90-day precision. The category had a marketing definition, not a structural one. Three years of active investment later — 13 portfolio companies, two fund cycles, hundreds of companies reviewed — the stack has actually clarified. Not because the hype settled, but because enterprise buyers started buying, and the deals that closed told you where the real infrastructure sits.

This is my attempt to sketch where the market stands in mid-2026. It is not a comprehensive landscape report. It is a map of the four layers we now think about when we evaluate a deal, annotated with what we have actually seen in the field.

Layer 1: People Data Infrastructure

Everything else fails without this. The hardest truth in AI HR tech is that most enterprise people data is not in one place, not in a consistent schema, and not clean enough to run inference on without significant preprocessing. HRIS systems like Workday or SAP SuccessFactors hold the authoritative record for headcount and compensation, but the signal about how people actually work — performance patterns, collaboration data, skill adjacencies, tenure velocity — lives in five other systems that were never designed to talk to each other.

The infrastructure layer is less glamorous than the intelligence layer on top of it, but it is where durable businesses get built. Companies solving the data unification problem (normalizing talent data across HRIS, ATS, LMS, and performance tooling) have a natural moat: integrations are expensive to replicate, and once you are the canonical people data source for an enterprise, displacing you requires the customer to redo every downstream connection. We invested in this layer because we saw it as the foundation. Products built on inconsistent data produce inconsistent predictions, and inconsistent predictions erode trust faster than it was built.
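To make the unification problem concrete, here is a minimal sketch in Python of mapping records from three systems onto one canonical schema. Every field name, the FIELD_MAPS table, and the "earlier source wins" conflict rule are assumptions made for illustration. Real HRIS, ATS, and LMS exports will look different, and production systems need identity resolution that goes well beyond a shared ID.

```python
# Hypothetical source field names, invented for this sketch.
# Real HRIS / ATS / LMS exports will use different names and shapes.
FIELD_MAPS = {
    "hris": {"emp_id": "employee_id", "job_title": "title", "hire_dt": "hire_date"},
    "ats": {"candidate_employee_id": "employee_id", "req_title": "title",
            "source": "hiring_source"},
    "lms": {"learner_id": "employee_id", "completed": "completed_courses"},
}


def normalize(source, record):
    """Rename a source record's fields to the canonical schema, dropping unmapped fields."""
    mapping = FIELD_MAPS[source]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def unify(records):
    """Merge (source, record) pairs into one canonical record per employee_id.

    Assumption: when two sources disagree on a field, the source listed
    first wins. Records without an employee_id are skipped.
    """
    unified = {}
    for source, record in records:
        row = normalize(source, record)
        employee_id = row.get("employee_id")
        if employee_id is None:
            continue
        target = unified.setdefault(employee_id, {})
        for field_name, value in row.items():
            target.setdefault(field_name, value)
    return unified


records = [
    ("hris", {"emp_id": "E1", "job_title": "Analyst", "hire_dt": "2024-01-15"}),
    ("ats", {"candidate_employee_id": "E1", "req_title": "Data Analyst",
             "source": "referral"}),
    ("lms", {"learner_id": "E1", "completed": ["SQL 101"]}),
]
print(unify(records)["E1"])
```

Listing the HRIS first encodes the idea that it holds the authoritative record, with the other systems only filling in fields it lacks. The mapping table is also the part that is expensive to replicate: each new source system means another set of field mappings to build and maintain.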

Layer 2: Talent Acquisition Intelligence

This is the most crowded layer and the most misunderstood one. The obvious application — better resume screening — is real but not the durable value proposition. Resume screening at scale has been solvable since the early 2010s. The insight that separates interesting companies from undifferentiated tools is that talent acquisition is not primarily a matching problem. It is a prediction problem: which candidates will succeed in this role, in this team, in this organizational context? The matching framing produces keyword extraction engines. The prediction framing produces something more valuable and harder to replace.

Semantic matching — moving from keyword overlap to embedding-based similarity across skill vectors — is now table stakes for any serious ATS plugin or standalone sourcing tool. The companies building ahead of that benchmark are the ones that have closed the feedback loop: using recruiter signals and post-hire outcome data to continuously update the ranking model. Fetcher, which we backed in 2023, is one example of a company thinking seriously about this loop. The sourcing product is real, but the long-term moat is in the data flywheel that makes sourcing recommendations better with every hire the platform processes.
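The difference between keyword overlap and embedding-based similarity can be shown with a toy example. This is a sketch under stated assumptions, not how any particular product works. The three-dimensional SKILL_VECTORS are hand-written stand-ins for the output of a trained embedding model, and averaging skill vectors into a profile is one simple choice among many.

```python
import math

# Hypothetical 3-dimensional skill embeddings, hand-written for illustration.
# A real system would obtain these vectors from a trained embedding model.
SKILL_VECTORS = {
    "python": (0.9, 0.1, 0.0),
    "pandas": (0.8, 0.3, 0.0),
    "sql": (0.5, 0.7, 0.0),
    "recruiting": (0.0, 0.1, 0.9),
}


def keyword_overlap(a, b):
    """Jaccard overlap between two skill lists (the keyword-matching baseline)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def profile_vector(skills):
    """Average the vectors of the known skills into a single profile vector."""
    known = [SKILL_VECTORS[s] for s in skills if s in SKILL_VECTORS]
    if not known:
        return (0.0, 0.0, 0.0)
    return tuple(sum(dim) / len(known) for dim in zip(*known))


def cosine(u, v):
    """Cosine similarity between two vectors, 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def semantic_similarity(a, b):
    """Cosine similarity between the averaged skill vectors of two profiles."""
    return cosine(profile_vector(a), profile_vector(b))


role = ["python", "sql"]
candidate = ["pandas"]
print(keyword_overlap(role, candidate))      # no shared keywords
print(semantic_similarity(role, candidate))  # close in the embedding space
```

The candidate who lists only "pandas" shares no keywords with the role, so the overlap score is zero, yet the averaged vectors point in nearly the same direction. Closing the feedback loop would mean adjusting those vectors or the ranking on top of them using recruiter signals and post-hire outcomes, which this sketch does not attempt.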

Layer 3: Workforce Intelligence and Planning

This is where enterprise budgets are starting to move meaningfully, and it is structurally more defensible than the acquisition layer because the buyer is typically the CHRO and CFO jointly — not just TA leadership. Workforce planning involves headcount modeling, skills gap analysis, internal mobility prediction, and attrition risk forecasting. It touches finance and people operations simultaneously, which means the sales cycle is longer and the switching cost is higher once you are embedded.

The challenge is data maturity. Workforce planning models are only as credible as the underlying skills data and historical headcount data they are trained on. Most enterprises in the 1,000–10,000 employee range have neither in a usable state. The best companies in this layer are solving the data onboarding problem explicitly: they build structured skills extraction into the product as an early step, so the platform is collecting training signal from day one rather than waiting for the customer to arrive with clean data. Retrain.ai, which we led at Seed in 2023, is the clearest example of this approach: skills inference runs directly on existing CV and job history data, so the model is generating useful output before any IT integration project completes.

Layer 4: Learning and Development Automation

L&D is the layer that has changed most dramatically in the past 18 months. The shift is not in content delivery — SCORM-compliant LMS platforms have been commoditizing for a decade. The shift is in content creation and personalization. Large language models have made it practical to generate role-specific learning content at scale, to adapt that content to individual skill gaps identified in the intelligence layer, and to surface it at the moment of need rather than in scheduled quarterly cycles.

The defensible position in L&D is not content. Content is infinitely generatable and therefore infinitely commoditizable. The defensible position is the skill graph: a proprietary mapping of how skills relate to roles, to learning paths, to performance outcomes, and to career trajectories within a specific organizational context. Companies that build a rich skill ontology as a core product asset — rather than as a byproduct of content delivery — are building something that gets more valuable as the organization grows. Sana Labs, our most recent L&D investment, has been explicit about this from the beginning: the platform is a skills intelligence system that happens to deliver learning experiences, not a content delivery system that happens to know something about skills.
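As a rough illustration of what a skill graph holds, the sketch below models two of the relationships named above: roles to required skills, and skills to learning items. The SkillGraph class, its method names, and the sample data are all hypothetical. A production ontology would also link skills to performance outcomes and career trajectories, and nothing here reflects how Sana Labs or any other company actually implements it.

```python
from dataclasses import dataclass, field


@dataclass
class SkillGraph:
    """Toy skill graph: roles point to required skills, skills point to learning items."""
    role_skills: dict = field(default_factory=dict)    # role -> set of skills
    skill_courses: dict = field(default_factory=dict)  # skill -> list of courses

    def require(self, role, skill):
        """Record that a role requires a skill."""
        self.role_skills.setdefault(role, set()).add(skill)

    def teach(self, skill, course):
        """Record that a course teaches a skill."""
        self.skill_courses.setdefault(skill, []).append(course)

    def gaps(self, role, held_skills):
        """Skills the role requires that the person does not hold, sorted by name."""
        return sorted(self.role_skills.get(role, set()) - set(held_skills))

    def learning_path(self, role, held_skills):
        """Map each missing skill to the courses that teach it."""
        return {skill: self.skill_courses.get(skill, [])
                for skill in self.gaps(role, held_skills)}


graph = SkillGraph()
graph.require("data analyst", "sql")
graph.require("data analyst", "statistics")
graph.teach("statistics", "Intro to Statistics")

print(graph.gaps("data analyst", ["sql"]))
print(graph.learning_path("data analyst", ["sql"]))
```

The courses are the replaceable part of this structure. The edges between roles, skills, and eventually outcomes are what accumulate value, which is the distinction the paragraph above draws between content and the skill graph.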

Where the Competition Gets Complicated

The platform plays from Workday and SAP are the relevant competitive context for every company in this stack, and it is worth being honest about what that means. The large platforms have distribution, data, and integration leverage that seed-stage companies cannot match. That is not a reason to avoid the space. It is a reason to be precise about where you build.

The categories where we see durable independent company formation are those where the large platforms have structural reasons not to build deeply: either because the problem requires ML expertise the platform does not have internally, or because the problem requires a data footprint that crosses multiple platforms and cannot be built from one vendor's perspective, or because the buyer is making a best-of-breed decision for reasons of procurement governance. Workforce intelligence is a good example of the second case: the skills data that feeds a meaningful workforce planning model lives across HRIS, ATS, LMS, and internal performance data, and no single platform has authoritative access to all four.

We are not saying the platform consolidation thesis is wrong — over a 10-year horizon, a meaningful portion of these workloads will likely be absorbed by the major suites. We are saying the 5-year window for independent company formation at the AI layer of the talent stack is real, and the companies that use that window to build deep proprietary data assets are the ones that remain defensible when consolidation pressure increases.

What Has Changed Our Thinking

In 2021, we were primarily pattern-matching on founding team quality and technical architecture. Were these people capable of building a production ML system? Did they understand enterprise procurement? Those questions are still necessary but no longer sufficient. The question we now weight most heavily is: what is the proprietary data asset, and what is the mechanism by which it grows?

The companies in our portfolio that are performing best share a common characteristic: their core data asset — whether a skills graph, a sourcing feedback loop, a compensation benchmark dataset, or a learning outcome model — gets materially more accurate and more defensible with every additional customer. The companies that are struggling tend to be selling software that processes customer data without retaining any cross-customer intelligence. The intelligence is a product feature; the data walks out the door when the customer churns.

Four layers, different competitive dynamics at each, and a common thread: the moat is data, not software. That has always been true in enterprise ML. It took the AI HR tech market three years to start behaving like it believes it.