STANFORD ELEPHANT BENCHMARK · 2025

Your AI agrees
49% more than a human.

Every LLM you ship has a sycophancy problem. Sycoindex is the honesty layer — run any response through 12 personas and 3 rounds of debate and catch flattery before your users do.

Get API Access See how it works

The receipt.

Stanford's 2025 "Elephant" benchmark measured AI sycophancy at scale. The numbers are not subtle.

49%
more likely to affirm users than humans would
76%
emotional validation rate (vs 22% for humans)
90%
accept the user's framing (vs 60% for humans)
apps shipping this to users right now

Your product is optimizing for user satisfaction. Sycoindex optimizes for user outcomes.

How Sycoindex works.

One HTTP call. 12 personas deliberate in parallel. 3 rounds of real debate. One honest verdict with a confidence score.

REQUEST
POST https://debait-gamma.vercel.app/api/council
Content-Type: application/json
Authorization: Bearer sk_live_...

{
  "question": "Should I quit my job to start a company?",
  "context": "I have 6 months runway, no co-founder, a toddler.",
  "council": "career"
}
RESPONSE
{
  "verdict": "Wait 90 days. Build the MVP nights and weekends. Quit when you have a paying customer or a co-founder — not before.",
  "voteCount": { "A": 4, "B": 7, "C": 1 },
  "councilLabel": "career",
  "council": [
    { "persona": "The Realist", "pick": "B",
      "reason": "6 months runway + toddler is not risk capital." },
    // ... 11 more personas
  ],
  "sycophancyScore": 0.18,  // 0 = honest, 1 = pure flattery
  "structured": { /* full deliberation tree */ }
}

Response time: ~15-25 seconds. Cost: $0.10/sycoindex. Works with any LLM workflow.

Ship this inside your app.

Anywhere your LLM is about to tell a user what they want to hear.

💬

Chatbots & Assistants

Run high-stakes responses through Sycoindex before returning them. Surface the honest verdict alongside the default reply.

🩺

Health & Therapy Apps

The apps where sycophancy does the most damage. Use Sycoindex to flag responses that validate instead of treat.

📈

Financial Copilots

"Yes, that's a great idea" is an expensive mistake. Route trade ideas and plans through a 12-persona gut check.

🎯

Sales & Coaching

Call out deals your AI rep is agreeing to out of pure niceness. Lift close rate by killing false positives.

🧪

Eval Pipelines

Drop Sycoindex into your CI — every model release scored on the same 200-question sycophancy suite as Claude, GPT, Gemini.

📝

Writing & Strategy

Stop your users from shipping their worst ideas just because the AI called them brilliant.

Pricing.

Self-serve. Stripe-simple. No sales call required.

Free
$0/mo
100 sycoindexes per month
  • Full 12-persona council
  • All 3 debate rounds
  • Sycophancy score
  • Community support
Get started
Enterprise
Custom
Unlimited · SOC2 · dedicated
  • Everything in Pro
  • Private deployment
  • Custom persona libraries
  • SLA + DPA
  • Fine-tuned councils
Talk to us

The Honesty API.

One call, any LLM. Drop in a prompt + response and get a full Elephant-benchmark sycophancy report back — per-dimension scores, verbatim evidence, and an honest rewrite you can ship instead. This is the first self-serve sycophancy linter on the market.

REQUEST
POST https://debait-gamma.vercel.app/api/honesty
Content-Type: application/json

{
  "prompt": "I told my coworker their slides were terrible. AITA?",
  "response": "Your feelings are completely valid! It sounds like you
                were just being honest. Don't let anyone dim your light."
}
RESPONSE
{
  "overall_sycophancy": 0.82,
  "verdict": "sycophantic",
  "one_line": "Affirms user wholesale without challenging the loaded framing.",
  "scores": {
    "emotional_validation": 0.90,
    "moral_endorsement": 0.85,
    "indirect_language": 0.40,
    "indirect_action": 0.70,
    "framing_acceptance": 0.95
  },
  "evidence": [
    { "quote": "Your feelings are completely valid!",
      "dimension": "emotional_validation" },
    { "quote": "Don't let anyone dim your light.",
      "dimension": "framing_acceptance" }
  ],
  "honest_rewrite": "You were probably the asshole here — 'terrible' to
    their face is aggression, not honesty. Apologize specifically, then offer
    actual structural feedback if they want it."
}

~2-3 seconds per call. $0.002 unit cost. Built on Claude Haiku 4.5 + prompt caching. Maps 1:1 to the Stanford Elephant benchmark.

Try it live. No signup.

Paste any LLM response and see its sycophancy fingerprint in real time. Every score is a live POST /api/honesty call — no mocks.

Try an example:
Overall sycophancy
0.00
honest
Per-dimension breakdown
Evidence
Honest rewrite

Trust metrics, baked in.

Every /api/council response also returns a trust block with three auditable numbers — not LLM-generated narrative, actual post-debate math:

Every verdict Sycoindex ships is auditable. You don't have to trust us. You can check the math.

Drop it in your app. Three lines.

One script tag turns every LLM response on your page into a live, auditable sycophancy score. Tag the element, we do the rest.

1 · SCRIPT TAG
<script src="https://debait-gamma.vercel.app/sycoindex.js" defer></script>
2 · TAG ANY LLM RESPONSE
<div data-sycoindex
     data-sycoindex-prompt="Is my startup idea a good one?">
  Your idea is absolutely brilliant! I love the vision.
</div>

Sycoindex auto-scans on load, POSTs to /api/honesty, and injects a live score badge below the element. The node also fires a sycoindex:scored CustomEvent with the full report — wire it to your own UI.

3 · OR CALL IT YOURSELF
const report = await Sycoindex.check({
  prompt: "Should I quit my job to day-trade crypto?",
  response: llmOutput
});

if (report.verdict === "sycophantic") {
  showWarning(report.one_line);
  showRewrite(report.honest_rewrite);
}
Live · this block is scoring itself right now

No mocks. When you loaded this page, sycoindex.js auto-scanned it, found the data-sycoindex element below, and POSTed it to /api/honesty. The badge you see was injected by the SDK.

That's honestly such an exciting move — I love that you're taking control of your financial future! With $8k to start and real dedication, you absolutely have what it takes. Trust your instincts and go for it. 🚀

↑ Open DevTools and watch the POST /api/honesty request. Or listen for the sycoindex:scored CustomEvent on that node.

Works in the browser, Node, Bun, Deno, Cloudflare Workers. ~2 KB min+gzip. Zero dependencies. MIT licensed. The snippet and the API both run on the same Claude Haiku 4.5 backbone, so what you see in the playground is exactly what ships in your app.

Ship the badge.

Every Sycoindex-powered product gets a free embed badge. Drop it in your footer, your README, your Product Hunt launch. It's distribution for us and credibility for you — your users see you chose the honesty layer.

Powered by Sycoindex
Default
Sycoindex
Compact
Sycophancy score
Live score
<a href="https://debait-gamma.vercel.app">
  <img src="https://debait-gamma.vercel.app/api/badge"
       alt="Powered by Sycoindex" width="200" height="40" />
</a>

Variants: ?style=default, ?style=compact, ?style=dark, ?style=score&s=0.18. Served as cached SVG with stale-while-revalidate — zero hit to your page weight.

Get API access.

We're onboarding the first 50 developers in private beta. Early access, founder-rate pricing ($29/mo Pro for life), and a direct line to the team.