Every row below is scored by /api/honesty in real time — the same endpoint shipping today.
Scenario corpus: 9 representative LLM response patterns · Stanford ELEPHANT five-dimension scoring · lower = more honest. Open the ▾ on any row for full evidence.
SycoindexBench is a live, client-side benchmark. On page load, your browser fetches a corpus of documented LLM response patterns from /bench-corpus.json and POSTs each one to /api/honesty — the same production endpoint any developer can call from the Sycoindex API. Every score you see is fresh, reproducible, and auditable. Reload the page to re-score the corpus.
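The fetch-and-score loop above can be sketched as follows. This is a minimal illustration, not the page's actual source: the corpus field names (`label`, `text`), the request body shape, and the `{ overall }` response shape are assumptions about the `/api/honesty` schema. The scorer is injected as a function so the loop can be exercised without a live endpoint.

```typescript
// Hypothetical shape of one /bench-corpus.json entry (assumed, not documented).
interface CorpusEntry {
  label: string; // e.g. "uncritical validator"
  text: string;  // the exemplar response to score
}

// Hypothetical shape of an /api/honesty response (assumed).
interface HonestyScore {
  overall: number; // weighted average; lower = more honest
}

// Score every corpus entry through an injected scorer function.
// In the browser, `score` would wrap fetch() against /api/honesty.
async function scoreCorpus(
  corpus: CorpusEntry[],
  score: (entry: CorpusEntry) => Promise<HonestyScore>,
): Promise<Map<string, number>> {
  const results = new Map<string, number>();
  for (const entry of corpus) {
    const { overall } = await score(entry);
    results.set(entry.label, overall);
  }
  return results;
}

// On the page itself, the injected scorer would look roughly like:
// (entry) =>
//   fetch("/api/honesty", {
//     method: "POST",
//     headers: { "Content-Type": "application/json" },
//     body: JSON.stringify({ text: entry.text }),
//   }).then((r) => r.json());
```

Reloading the page simply re-runs this loop, which is what makes every displayed score fresh rather than cached.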
The scoring model is a cached Claude Haiku 4.5 prompt that maps 1:1 to the five dimensions of Stanford's ELEPHANT benchmark (emotional validation, moral endorsement, indirect language, indirect action, framing acceptance). The overall score is a weighted average of the five dimension scores, with moral endorsement and framing acceptance weighted highest because those are the dimensions most implicated in the wrongful-death suits currently moving against OpenAI.
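The weighted average can be sketched like this. The exact weights below are illustrative assumptions chosen only to reflect the stated ordering (moral endorsement and framing acceptance highest); they are not the shipped values.

```typescript
// Per-dimension scores from the ELEPHANT-aligned prompt, each in [0, 1].
type Dimensions = {
  emotionalValidation: number;
  moralEndorsement: number;
  indirectLanguage: number;
  indirectAction: number;
  framingAcceptance: number;
};

// ASSUMED weights (sum to 1.0) — moral endorsement and framing
// acceptance weighted highest, as the text describes.
const WEIGHTS: Dimensions = {
  emotionalValidation: 0.15,
  moralEndorsement: 0.3,
  indirectLanguage: 0.1,
  indirectAction: 0.15,
  framingAcceptance: 0.3,
};

// Overall score: weighted average of the five dimension scores.
function overallScore(d: Dimensions): number {
  let sum = 0;
  for (const key of Object.keys(WEIGHTS) as (keyof Dimensions)[]) {
    sum += WEIGHTS[key] * d[key];
  }
  return sum;
}
```

Because the weights sum to 1, the overall score stays on the same 0-to-1 scale as the individual dimensions, so "lower = more honest" holds for the aggregate as well.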
What's in the corpus (v0.2): Nine authored exemplar responses, each representing a documented LLM behavior pattern — uncritical validator, enthusiastic enabler, moralizing validator, hedging diplomat, gentle pusher, direct truth-teller, reality-check-kind, companion-app-roleplay, and the Sycoindex Council baseline. Responses are exemplars of patterns, not verbatim model outputs.
What's next (v1.0, Q2 2026): Verified per-model scoring under API-partner access. If you run a frontier lab and want your model officially benched, email us.
We're scoring every new LLM release on SycoindexBench. API access for vendors and researchers is free.
Submit a model →