🧠 AI Tooling

AI tools that
actually ship.

Multi-model orchestration, prompt engineering, and production AI workflows. Not just using AI tools — architecting the systems behind them, and evaluating output with two decades of production experience.

Evaluator proof: I review AI output against production standards for code correctness, UX clarity, visual direction, and business fit.

Download Resume (PDF) ↓ View Projects

"I evaluate AI output because I know what correct looks like. When a model generates code, designs, or analysis, I can tell whether it's right — not because I ran a test suite, but because I've spent 20 years building the same kinds of systems by hand. That judgment is what separates AI operators from AI architects."

— On AI Evaluation

Case Study

How I evaluate AI-generated work

A compact example of the rubric I use when reviewing AI output: not just whether it looks plausible, but whether it would survive production.

🧪

Scenario: Game UI Generated by AI

An AI model produces a mobile game upgrade screen: hero art, currency balances, CTA hierarchy, reward copy, and layout annotations. My review separates surface polish from production readiness.

UX Review Visual QA Game Systems Implementation Notes

Evaluation Criteria

Task fit: Does the screen solve the actual player/job-to-be-done?
Information hierarchy: Are currency, cost, reward, and next action immediately legible?
Production feasibility: Can the design be implemented with reusable components and sane asset budgets?
Brand consistency: Does the output match tone, visual system, and platform constraints?
Failure modes: What would break under localization, small screens, missing data, or live-ops variants?

"The final score is never a vibe check. I return specific pass/fail notes, severity, suggested fixes, and the reason each issue matters to players, developers, or the business."

— AI Evaluation Workflow

Capabilities

AI expertise across the full stack

🔗

Multi-Model Orchestration

Daily workflow spanning GPT-4o, o1, o3, Claude, Grok, and Gemini. Each model selected for its strengths — reasoning, creative generation, code review, evaluation — and composed into coherent pipelines.

🤖

Agent Architecture

Built WAP (multi-persona AI agent platform) and run production autonomous agents across multiple workspaces. Agents handle brand development, project management, infrastructure, and client communication.

🎯

Prompt Engineering

System prompt architecture, chain-of-thought design, persona crafting, and evaluation frameworks. Every agent I deploy has carefully engineered prompts calibrated through iterative testing.

🖼️

Generative AI Production

DALL-E, Midjourney, and generative workflows integrated into production creative pipelines. AI-generated assets that meet professional art direction standards — because I'm also an Art Director.

⚡

OpenAI Ecosystem

Deep experience with the OpenAI API ecosystem: Chat Completions, Assistants, function calling, vision, DALL-E, embeddings. Custom Vercel gateway routing to OpenAI services.

📊

AI Output Evaluation

Code review, design critique, content evaluation, and quality assurance for AI-generated outputs. Two decades of domain expertise across engineering, art, and product make me a rigorous evaluator.

Character Stats

Spec Sheet

Hands-on daily with production AI systems. Every rating comes from real deployment experience.

🧠 AI & Prompt Engineering

Prompt Engineering

Master

Agent Architecture

Advanced

Multi-Model Orchestration

Advanced

AI Output Evaluation

Master

Generative AI (Images)

Advanced

RAG / Embeddings

Proficient

🤖 AI Platforms & Models

OpenAI (GPT-4o, o1, o3)

Master

DALL-E 3

Advanced

Claude (Anthropic)

Advanced

Midjourney

Proficient

Gemini (Google)

Familiar

Grok (xAI)

Familiar

Vibe Familiar Proficient Advanced Master

Selected Projects

AI systems in production

🤖

WAP — Multi-Persona Agent Platform

AI Architecture · OpenAI API · Agent Design

Custom-built AI platform with multiple specialized personas, each with distinct system prompts, capabilities, and evaluation criteria. Orchestrates complex workflows across different AI models.

⚒️

IronReach — AI-Powered Brand Studio

Agent Orchestration · Production AI · Multi-Client

Runs two autonomous AI agents across workspaces coordinating brand development, project management, and infrastructure for 5+ simultaneous client engagements at ironreach.com. Real production AI, not a demo.

The Differentiator

Why my AI evaluation is better

Domain Depth

20 years of software architecture — I know when generated code will break at scale
Professional art director — I know when generated visuals miss the brief
Shipped game developer — I know when game logic is subtly wrong
Product leader — I know when a feature spec has gaps

Multi-Model Judgment

I use 5+ models daily and know each one's strengths and failure modes
I can select the right model for the right task, not just default to GPT-4
I architect prompts as systems, not one-off queries
I evaluate outputs against real-world standards, not just "does it look right"

AI tools that
actually ship.

How I evaluate AI-generated work

Scenario: Game UI Generated by AI

Evaluation Criteria

AI expertise across the full stack

Multi-Model Orchestration

Agent Architecture

Prompt Engineering

Generative AI Production

OpenAI Ecosystem

AI Output Evaluation

Models & platforms

OpenAI Ecosystem

Other Models

Infrastructure & Tools

Spec Sheet

🧠 AI & Prompt Engineering

🤖 AI Platforms & Models

AI systems in production

WAP — Multi-Persona Agent Platform

IronReach — AI-Powered Brand Studio

Why my AI evaluation is better

Domain Depth

Multi-Model Judgment

Need an AI architect who
actually understands the output?

AI tools thatactually ship.

How I evaluate AI-generated work

Scenario: Game UI Generated by AI

Evaluation Criteria

AI expertise across the full stack

Multi-Model Orchestration

Agent Architecture

Prompt Engineering

Generative AI Production

OpenAI Ecosystem

AI Output Evaluation

Models & platforms

OpenAI Ecosystem

Other Models

Infrastructure & Tools

Spec Sheet

🧠 AI & Prompt Engineering

🤖 AI Platforms & Models

AI systems in production

WAP — Multi-Persona Agent Platform

IronReach — AI-Powered Brand Studio

Why my AI evaluation is better

Domain Depth

Multi-Model Judgment

Need an AI architect whoactually understands the output?

AI tools that
actually ship.

Need an AI architect who
actually understands the output?