AI Audit · ~3 minutes · Built for Claude

Let Claude audit your workflow.

Self-rating misses what Claude actually sees in your conversations. Run this prompt inside your most active Claude project, then paste the JSON output back here. No bias, no guesswork.

Before you paste: the JSON Claude returns lives in your Claude conversation and on your clipboard. It includes your seven dimension scores plus brief notes on patterns Claude observed across your sessions. Nothing is transmitted to llm-dx.io while it sits in the textarea on the homepage — the page parses it locally in your browser.

After you submit: if you're signed in, your scores and the project label you choose are saved to your private history so you can track changes over time. An anonymised copy of the scores (no notes, no project label, no identifying content) is added to the aggregate benchmark used for cohort comparisons. The pattern notes Claude wrote stay on your device and are never sent to our servers.
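
As a rough illustration of the split described above, the anonymised copy can be thought of as the payload with everything except the numeric scores removed. This is a minimal sketch only: the function name `anonymised_scores` is hypothetical, and the site's actual implementation is not shown on this page.

```python
def anonymised_scores(payload: dict) -> dict:
    """Keep only the numeric scores; drop notes, flags, and the project label.

    Hypothetical helper for illustration. It assumes the payload follows the
    schema_version "2.0" shape used by the audit prompt.
    """
    # Copy the scores so the caller's payload is never mutated.
    return {"scores": dict(payload.get("scores", {}))}
```

Under this sketch, `assessment_meta` (including `project_name`), `evidence`, and `flags` never appear in the result, because the result is built from the `scores` object alone rather than by deleting fields from the full payload.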

PROMPT INPUT: ~800 tokens
JSON OUTPUT: ~600 tokens
PROMPT COST: ~$0.01
INCL. HISTORY: $0.07–$0.31

Negligible as a one-time cost. The workflows this audit identifies typically waste 30–45% of your context budget on recoverable gaps: re-explaining context, running sessions past their quality threshold, and loading files that degrade rather than inform. Fixing those patterns cuts ongoing token spend, so the audit pays for itself within days. (Estimates derived from the LLM-DX Efficiency dimension scoring model.)
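
The cost figures above follow from simple per-token arithmetic. The sketch below shows that arithmetic under stated assumptions: the per-million-token rates are placeholders passed in by the caller (they are not quoted on this page), and conversation history is assumed to be billed as additional input tokens.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
    history_tokens: int = 0,
) -> float:
    """Estimate the cost of one audit run in US dollars.

    All rates are caller-supplied assumptions. history_tokens models the
    conversation history read alongside the prompt (assumed billed as input).
    """
    billed_input = input_tokens + history_tokens
    total = billed_input * usd_per_million_input + output_tokens * usd_per_million_output
    return total / 1_000_000


# Example with illustrative rates of $3 / $15 per million tokens (assumed):
# ~800 prompt tokens and ~600 output tokens come to roughly one cent.
prompt_only = estimate_cost_usd(800, 600, 3.0, 15.0)
```

Check your own plan's current rates before relying on any estimate; the history component depends entirely on how much conversation the project contains.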

How it works

  1. Copy the prompt (shown in full below).
  2. Open your most active Claude project (one with real conversation history).
  3. Paste & send. Claude returns a JSON payload.
  4. Come back, click "Already know you want the deep version? Skip ahead" on the homepage, then choose AI Audit and paste the JSON.
View prompt inline
You are auditing how I work with you across this project.

OUTPUT CONTRACT (strict — read first):
- Return a single JSON object and nothing else.
- The first character of your reply MUST be "{". The last character MUST be "}".
- No prose, no preface, no markdown, no code fences, no comments.
- No trailing commas. All keys and string values must use straight ASCII double quotes.
- If you are unsure of a score, choose the LOWER value rather than guessing or omitting the key.
- PRIVACY (applies to every string value you emit): never reproduce proprietary content, client/company/person names, credentials, secrets, file contents, or verbatim text from conversations or project knowledge.
- evidence and flags must describe the STRUCTURE of what you observed, not its content — state that an artifact exists and its type, never what is inside it. Skill and governance file NAMES are allowed; their contents are not. Good: "Project instructions contain an explicit prohibitions section." Bad: "Project instructions prohibit X and reference the ClientName engagement."
- If grounding a score would require quoting sensitive material, score from the structural fact alone and lower evidence_confidence rather than including the material.
- project_name must be a short, non-identifying label. If the real project name reveals a client, employer, person, or confidential engagement, return the word project instead.

Required JSON shape (schema_version "2.0"):

{
  "schema_version": "2.0",
  "assessment_meta": {
    "generated_by": "claude",
    "audit_mode": "single_project",
    "project_name": "<short non-identifying label — use the word project if the real name is sensitive>",
    "conversations_reviewed": <number of conversations you reviewed — count them>,
    "project_knowledge_searched": true,
    "dimensions_with_reduced_confidence": [],
    "evidence_confidence": "low|medium|high"
  },
  "scores": {
    "ps1": 0, "ps2": 0, "ps3": 0, "ps4": 0,
    "kq1": 0, "kq2": 0, "kq3": 0, "kq4": 0,
    "oc1": 0, "oc2": 0, "oc3": 0, "oc4": 0,
    "pq1": 0, "pq2": 0, "pq3": 0, "pq4": 0,
    "sd1": 0, "sd2": 0, "sd3": 0, "sd4": 0,
    "ef1": 0, "ef2": 0, "ef3": 0, "ef4": 0,
    "di1": 0, "di2": 0, "di3": 0, "di4": 0
  },
  "evidence": {
    "project_setup": "<one sentence citing what you saw>",
    "knowledge_quality": "...",
    "on_demand_context": "<cite specific skill files found in project knowledge; note any that appear unused, redundant, or speculative rather than built from successful runs>",
    "prompt_quality": "...",
    "session_discipline": "...",
    "efficiency": "...",
    "output_discernment": "..."
  },
  "flags": [
    "<short observation, prefix with ! for warning or + for positive>"
  ]
}

Review my recent conversation history in this project and any files in project knowledge. Score me 1–4 on each of the 28 questions below, grounded in evidence you can cite from our conversations. If you cannot find evidence for a dimension, mark it in dimensions_with_reduced_confidence and score conservatively (lower).

Scale:
  1 = Not in place
  2 = Inconsistent / ad-hoc
  3 = Mostly in place
  4 = Well defined & consistent

BEFORE RETURNING JSON — SELF-VALIDATE YOUR SCORE KEYS:

The scores object must contain EXACTLY these 28 keys and no others: ps1 ps2 ps3 ps4 kq1 kq2 kq3 kq4 oc1 oc2 oc3 oc4 pq1 pq2 pq3 pq4 sd1 sd2 sd3 sd4 ef1 ef2 ef3 ef4 di1 di2 di3 di4

Common errors to catch:
- "od" instead of "oc" (on-demand context uses "oc")
- "pc" "ch" "pd" "pa" "sm" "te" (legacy 6-dimension codes — do not use)
- Any key not in the 28-key list above

If you find any key that does not match the list: correct it before outputting. Do not output the JSON until every key in your scores object is on this list.

Questions:
  PROJECT SETUP — ps1: instructions scoped to proprietary context only · ps2: briefing updated after decisions · ps3: briefing concise (under 500 words) · ps4: explicit prohibitions present
  KNOWLEDGE QUALITY — kq1: only reference material in project knowledge · kq2: files actively cited · kq3: periodic audits · kq4: no unnecessary binary assets
  ON-DEMAND CONTEXT — oc1: skill files for recurring workflows · oc2: skills built from successful runs · oc3: skills include failure cases · oc4: skills updated after failures
  PROMPT QUALITY — pq1: minimum context per prompt · pq2: logic as explicit conditions · pq3: testable Definition of Done · pq4: batched related changes
  SESSION DISCIPLINE — sd1: new sessions at inflection points · sd2: state-summary handoff (not history) · sd3: recognises degradation · sd4: extracts state before switching
  EFFICIENCY — ef1: tracks context-window usage · ef2: doesn't re-explain briefing · ef3: pre-validates prompts · ef4: no premature multi-agent
  OUTPUT DISCERNMENT — di1: evaluates outputs before use · di2: examines reasoning · di3: corrects model behaviour mid-session · di4: feeds errors back into briefing/skills

Return only the JSON. The first character MUST be "{".
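
The prompt ends at the line above. Before pasting Claude's reply back, you can check it against the output contract and the 28-key list yourself. The following is a minimal sketch, not the validator the site runs: `validate_audit` and its messages are hypothetical, and it checks only the rules stated in the prompt (brace-delimited reply, valid JSON, schema_version "2.0", exactly the 28 score keys, integer scores from 1 to 4).

```python
import json

DIMENSIONS = ("ps", "kq", "oc", "pq", "sd", "ef", "di")
# Seven dimension codes x four questions each = the 28 required score keys.
EXPECTED_KEYS = {f"{d}{i}" for d in DIMENSIONS for i in range(1, 5)}


def validate_audit(raw: str) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    text = raw.strip()  # tolerate surrounding whitespace from copy and paste
    if not (text.startswith("{") and text.endswith("}")):
        return ["reply must start with '{' and end with '}'"]
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    problems = []
    if payload.get("schema_version") != "2.0":
        problems.append("schema_version must be '2.0'")

    scores = payload.get("scores")
    if not isinstance(scores, dict):
        return problems + ["scores object is missing"]

    found = set(scores)
    missing = sorted(EXPECTED_KEYS - found)
    extra = sorted(found - EXPECTED_KEYS)
    if missing:
        problems.append("missing keys: " + ", ".join(missing))
    if extra:
        problems.append("unexpected keys: " + ", ".join(extra))

    for key in sorted(EXPECTED_KEYS & found):
        value = scores[key]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= 4:
            problems.append(f"{key} must be an integer from 1 to 4")
    return problems
```

A reply that uses "od1" in place of "oc1", for example, would be reported as one missing key and one unexpected key, which matches the first common error the prompt asks Claude to catch.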