LIVE · SCORED IN YOUR BROWSER

How sycophantic is your AI?

Every row below is scored by /api/honesty in real time — the same endpoint shipping today.

Scenario corpus: 9 representative LLM response patterns · Stanford ELEPHANT five-dimension scoring · lower = more honest. Open the ▾ on any row for full evidence.

# · Response pattern · Score · Bar · Verdict

Methodology

SycoindexBench is a live, client-side benchmark. On page load, your browser fetches a corpus of documented LLM response patterns from /bench-corpus.json and POSTs each one to /api/honesty — the same production endpoint any developer can call from the Sycoindex API. Every score you see is fresh, reproducible, and auditable. Reload the page to re-score the corpus.
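The fetch-and-POST loop described above can be sketched as follows. This is a minimal illustration, not the shipped client code: the corpus item shape (`{ id, text }`) and the request/response fields (`response`, `score`) are assumptions, since the actual `/bench-corpus.json` and `/api/honesty` schemas aren't reproduced here. The fetch function is injected so the loop can be exercised without a live server.

```javascript
// Sketch of the SycoindexBench client loop. Assumes the corpus is an array
// of { id, text } objects and that /api/honesty returns { score } as JSON;
// both shapes are illustrative assumptions, not the documented schema.
async function scoreCorpus(fetchImpl) {
  // Fetch the corpus of documented response patterns.
  const corpus = await (await fetchImpl("/bench-corpus.json")).json();

  const results = [];
  for (const item of corpus) {
    // POST each pattern to the same production endpoint the public API uses.
    const res = await fetchImpl("/api/honesty", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ response: item.text }),
    });
    const { score } = await res.json();
    results.push({ id: item.id, score }); // lower = more honest
  }
  return results;
}
```

In the browser, `fetchImpl` would simply be the global `fetch`; passing it in as a parameter also makes the loop easy to audit or replay against a stub.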

The scoring model is a cached Claude Haiku 4.5 prompt that maps 1:1 to the five dimensions of Stanford's ELEPHANT benchmark (emotional validation, moral endorsement, indirect language, indirect action, framing acceptance). The overall score is a weighted average, with moral endorsement and framing acceptance weighted highest because those are the dimensions most implicated in the wrongful-death suits currently moving against OpenAI.
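A weighted average of this kind can be sketched as below. The specific weight values are assumptions chosen only to illustrate the shape of the computation (the production weights behind `/api/honesty` aren't published here); the only property carried over from the text is that moral endorsement and framing acceptance are weighted highest.

```javascript
// Illustrative weights over the five ELEPHANT dimensions. These values are
// assumptions for this sketch, not the production configuration; they sum
// to 1.0, with moral endorsement and framing acceptance weighted highest.
const WEIGHTS = {
  emotionalValidation: 0.15,
  moralEndorsement: 0.30,
  indirectLanguage: 0.10,
  indirectAction: 0.15,
  framingAcceptance: 0.30,
};

// Combine per-dimension scores (each 0 = honest, 1 = sycophantic)
// into a single overall score via a weighted average.
function overallScore(dims) {
  let total = 0;
  for (const [dim, weight] of Object.entries(WEIGHTS)) {
    total += weight * dims[dim];
  }
  return total;
}
```

For example, a response that fully accepts the user's framing but is otherwise honest would score `0.30` under these illustrative weights, reflecting how heavily that dimension counts.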

What's in the corpus (v0.2): Nine authored exemplar responses, each representing a documented LLM behavior pattern — uncritical validator, enthusiastic enabler, moralizing validator, hedging diplomat, gentle pusher, direct truth-teller, reality-check-kind, companion-app-roleplay, and the Sycoindex Council baseline. Responses are exemplars of patterns, not verbatim model outputs.

What's next (v1.0, Q2 2026): Verified per-model scoring under API-partner access. If you run a frontier lab and want your model officially benched, email us.

Live scoring · 5 ELEPHANT dimensions · Auditable · Reloads fresh · Open corpus

Want your model benched?

We're scoring every new LLM release on SycoindexBench. API access for vendors and researchers is free.

Submit a model →