Audit the workflow. Not the model.
Every AI tool ships assuming you already know how to use it. Most practitioners don't, not because they lack the talent, but because no tool ships with the framework. llm-dx is that framework.
Most AI frustration isn't the AI.
Sessions degrade as context fills — often earlier than practitioners expect. Briefings get re-explained five times in a single conversation. Prompts ask for outputs the model has no way to produce because the inputs were never structured. The model gets the blame. The workflow gets a pass.
The practitioners who get the most out of these tools aren't using a better model. They're running a tighter loop.
The AI fluency research space is developing — Anthropic's 2026 Education Report measured fluency behaviors at population scale using the 4D Framework (Dakan, Feller). llm-dx operates at the individual practitioner level: diagnosis, correction, tracking.
Diagnosis is the entry point. Practice is the destination.
The "dx" in llm-dx is clinical shorthand for diagnosis. It's where you start — not where you stop. The score tells you where the workflow breaks down. The corrections give you the next move. The history shows whether you're actually improving or just retaking the test.
Become the practitioner your AI assumes you already are.
No tool ships with the framework you need to use it well.
Claude, Gemini, and the rest all ship assuming you already know how to brief, scope, structure, and discern. Most people don't. There's no shame in that; there's just no curriculum. This is the curriculum.
Read the methodology →
What this framework holds to.
- 01 Diagnose before you upgrade. A new model won't fix a workflow problem. Measure the workflow first.
- 02 Score the practice, not the output. Output quality is a downstream effect. The leverage is upstream: in setup, context, and discipline.
- 03 Improvement compounds where you measure it. Most practitioners track nothing. The ones who improve fastest track the same seven things repeatedly.
- 04 Tokens are a quality signal. Wasted tokens are the receipt for a workflow gap. Efficiency isn't frugality; it's evidence of structure.
- 05 The framework is the product. The model is the substrate. Claude is the substrate today. The dimensions hold for the next model and the one after that.
What you're looking at in the background.
The motion behind every page is a generative system called Emergent Calibration. Particles are born into noise and gradually align into structured flow. They age through four colour states — dark indigo, bright indigo, green, and gold — that map directly onto the four assessment tiers: Foundational, Developing, Proficient, Optimised.
It's the practitioner journey rendered live. Chaos to structure. Undiagnosed to deliberate.
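For the curious, the colour-to-tier mapping described above can be sketched in a few lines. This is an illustration only, not the site's actual animation code: the `Particle` class, the `lifespan` value, and the choice to split a particle's life into four equal quarters are all assumptions made for the example. Only the four colours, the four tier names, and their order come from the description above.

```python
from dataclasses import dataclass
from typing import Tuple

# Colour states paired with assessment tiers, in the order the text lists them.
STATES = [
    ("dark indigo", "Foundational"),
    ("bright indigo", "Developing"),
    ("green", "Proficient"),
    ("gold", "Optimised"),
]


@dataclass
class Particle:
    """Hypothetical particle; field names and defaults are assumptions."""
    age: float = 0.0       # seconds since the particle was born
    lifespan: float = 8.0  # assumed total lifetime in seconds


def colour_state(p: Particle) -> Tuple[str, str]:
    """Return (colour, tier) for the particle's current age.

    Assumes each colour state occupies an equal quarter of the lifespan.
    """
    # Clamp progress to [0, 1] so very old particles stay in the final state.
    progress = min(max(p.age / p.lifespan, 0.0), 1.0)
    index = min(int(progress * len(STATES)), len(STATES) - 1)
    return STATES[index]
```

Under these assumptions, a newborn particle reads as dark indigo (Foundational) and one at or past its full lifespan reads as gold (Optimised).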
Research, analysis, and things worth saying.
The blog is where the thinking happens in public. Practitioner workflow, frontier technology, renewable energy, macroeconomics — topics where being direct and doing the research matters more than having the approved take.
If you find something useful, it will usually lead back to a correction or a question worth running through the assessment.
Read the blog →