WHERE AI MEETS THE ROAD · EST. 2026

A research and development laboratory studying applied AI quality.

We build AI systems for the real world. Not demos. Not proofs of concept. Production infrastructure that runs when nobody is watching.

90% of businesses use AI.
1% capture meaningful value from it.
Source · Ferrox Labs analysis across twelve primary sources · Put Your Agents to Work, May 2026
Read the research
§ 01

Research areas

I.

Applied AI quality

The empirical study of why AI products fail in deployment when they pass evaluation. Active lines of inquiry include confidence calibration at the interface layer; latency as a perceived-quality signal; the relationship between recovery affordance and user trust; and methodology for adversarial testing of composite model-and-interface systems.

II.

Multi-model adversarial review

Methodology for verification of AI-generated artefacts through structured review by multiple independent systems. Our framework, the AI Quality Trident, is being formalised for publication.

III.

Foundation model adaptation

The technical study of where pretrained models diverge from useful behaviour in specific deployment conditions. We work on open-weights and frontier models, with attention to the gap between leaderboard performance and behaviour in front of real users.

IV.

Agentic systems and infrastructure

Long-horizon autonomous AI and the cross-platform infrastructure required to operationalise it. Our open framework, IJFW, underpins this work and is in active development across eight AI coding agents.

V.

Cognitive systems

Long-horizon research into the architecture of intelligence and the emergent properties that arise from it. Several lines of work in this area are powerful enough that we have not yet decided how, or whether, they should be released. Findings will appear in peer-reviewed venues when they are ready.

§ 02

Methodology

Ferrox Labs operates under a research-and-engineering discipline developed across twenty-six years of independent practice. Every artefact released by the laboratory, from a modified model checkpoint to a deployed tool, passes through the same verification process: pre-implementation acceptance criteria, multi-model adversarial review at every phase transition, and adversarial human testing under conditions designed to make the artefact fail.

We do not publish work we cannot defend in review.

We do not release tools we would not use ourselves.

We do not put our name on capability we have not stress-tested under adversarial conditions.

The discipline is documented in full in a forthcoming book on AI craft.

§ 03

In the forge

Original research, recurring benchmark series, and notes from the laboratory. We publish what we find. Two reports are available now.

§ 04

The laboratory

Ferrox Labs is led by Sean Donahoe, originator of the empathy gap construct for AI quality, the Donahoe Loop development methodology, the AI Quality Trident multi-model verification system, and the Agent Maturity Ladder for diagnosing organisational readiness for AI deployment.

His twenty-six-year career spans Silicon Valley and the Texas tech sector, including consulting for Fortune 500 companies on systems they could not stabilise internally. The laboratory sits at the intersection of data science, agent orchestration, and operational methodology. Ferrox Labs is part of Foundry AI.

The laboratory is hiring. Quietly.