WHERE AI MEETS THE ROAD · EST. 2026

A research and development laboratory studying applied AI quality.

We build AI systems for the real world. Not demos. Not proofs of concept. Production infrastructure that runs when nobody is watching.

90% of businesses use AI.
1% capture meaningful value from it.
Source · Ferrox Labs analysis across twelve primary sources · Put Your Agents to Work, May 2026
Read the research
§ 01

Research areas

I.

Applied AI quality

The empirical study of why AI products fail in deployment when they pass evaluation. Active lines of inquiry include confidence calibration at the interface layer; latency as a perceived-quality signal; the relationship between recovery affordance and user trust; and methodology for adversarial testing of composite model-and-interface systems.

II.

Multi-model adversarial review

Methodology for verification of AI-generated artefacts through structured review by multiple independent systems. Our framework, the AI Quality Trident, is being formalised for publication.

III.

Foundation model adaptation

The technical study of where pretrained models diverge from useful behaviour in specific deployment conditions. We work on open-weights and frontier models, with attention to the gap between leaderboard performance and behaviour in front of real users.

IV.

Agentic systems and infrastructure

Long-horizon autonomous AI and the cross-platform infrastructure required to operationalise it. Our open framework, IJFW, underpins this work and is in active development across eight AI coding agents.

V.

Cognitive systems

Long-horizon research into the architecture of intelligence and the emergent properties that arise from it. Several lines of work in this area are powerful enough that we have not yet decided how, or whether, they should be released. Findings will appear in peer-reviewed venues when they are ready.

§ 02

Methodology

Ferrox Labs operates under a research-and-engineering discipline developed across twenty-six years of independent practice. Every artefact released by the laboratory, from a modified model checkpoint to a deployed tool, passes through the same verification process: pre-implementation acceptance criteria, multi-model adversarial review at every phase transition, and adversarial human testing under conditions designed to make the artefact fail.

We do not publish work we cannot defend in review.

We do not release tools we would not use ourselves.

We do not put our name on capability we have not stress-tested under adversarial conditions.

The discipline is documented in full in a forthcoming book on AI craft.

§ 03

In the forge

Original research, recurring benchmark series, and notes from the laboratory. We publish what we find. Two reports are available now.

§ 04

The laboratory

Ferrox Labs is led by Sean Donahoe, originator of the empathy gap construct for AI quality, the Donahoe Loop development methodology, the AI Quality Trident multi-model verification system, and the Agent Maturity Ladder for diagnosing organisational readiness for AI deployment.

His twenty-six-year career spans Silicon Valley and the Texas tech sector, including consulting for Fortune 500 companies on systems they could not stabilise internally. The laboratory sits at the intersection of data science, agent orchestration, and operational methodology. Ferrox Labs is part of Foundry AI.

The laboratory is hiring. Quietly.