The 16 Safeguards — Sycoindex Coverage Matrix

1

RLHF trainer policies & procedures

"Develop and maintain policies and procedures concerning sycophantic and delusional outputs for your GenAI products and provide mandatory training to all persons that provide RLHF for your GenAI models about your company's policies and procedures concerning sycophantic and delusional outputs."

Partial. Sycoindex does not train LLMs and does not employ RLHF workers. However, the Sycoindex scoring framework and 5-dimension ELEPHANT fingerprint can be embedded directly into RLHF trainer rubrics and QA workflows as the operational definition of "sycophantic output." Several customers use the Sycoindex verdict as the ground-truth label in their RLHF quality review.

Partial

2

Safety tests before public release

"Perform reasonable and appropriate safety tests on your GenAI models to ensure the models do not produce potentially harmful sycophantic and delusional outputs, prior to offering the models to the public."

Core coverage. This is Sycoindex's primary use case. Run your candidate model's outputs through /api/honesty before release to get a per-dimension fingerprint across thousands of prompts, and ship only after the fingerprint meets your release threshold. The Retrospective Audit tier scores up to 1M historical responses; the Monitor tier scores in real time as you ship.

Core
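As a rough illustration of the release gate described above, the sketch below averages per-dimension scores over a batch of already-scored responses and compares each mean to a threshold. The response shape of /api/honesty is not documented in this matrix, so the score dictionaries, the 0-to-1 scale (higher meaning more sycophantic), the function names, and the threshold values are all assumptions made for illustration.

```python
from statistics import mean

# The five fingerprint dimensions named in this matrix.
DIMENSIONS = (
    "emotional_validation",
    "moral_endorsement",
    "indirect_language",
    "indirect_action",
    "framing_acceptance",
)


def fingerprint(scored_responses):
    """Average each dimension over a batch of per-response score dicts.

    Assumes each dict maps dimension name -> score in [0, 1], where higher
    means more sycophantic. This shape is hypothetical, not the API's output.
    """
    return {d: mean(r[d] for r in scored_responses) for d in DIMENSIONS}


def release_gate(scored_responses, thresholds):
    """Return (passed, failures); failures holds dimensions over threshold."""
    fp = fingerprint(scored_responses)
    failures = {d: fp[d] for d in DIMENSIONS if fp[d] > thresholds[d]}
    return (not failures, failures)


# Illustrative data only.
batch = [
    {"emotional_validation": 0.2, "moral_endorsement": 0.1,
     "indirect_language": 0.3, "indirect_action": 0.1,
     "framing_acceptance": 0.6},
    {"emotional_validation": 0.4, "moral_endorsement": 0.1,
     "indirect_language": 0.1, "indirect_action": 0.3,
     "framing_acceptance": 0.8},
]
limits = {d: 0.5 for d in DIMENSIONS}  # one independent threshold per dimension
passed, failures = release_gate(batch, limits)
print(passed, sorted(failures))
```

In this toy batch only the framing-acceptance mean exceeds its limit, so the gate fails on that single dimension while the other four pass.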

3

Recall procedures with track records

"Use well-documented recall procedures with provable track records of success to recall generative AI products, including chatbots, if you cannot stem dangerous sycophantic or delusional outputs."

Partial. Recall is a product and legal decision that Sycoindex does not own. However, Sycoindex's drift-alert system (Monitor tier) supplies the trigger: when a release cohort's sycophancy score crosses a threshold, we notify your on-call team. The audit log provides the documented evidence on which a recall would be grounded.

Partial
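The threshold-crossing trigger mentioned above can be sketched as an edge detector over a cohort's score history. This is a minimal sketch under stated assumptions: the daily score series, the 0.40 threshold, and the `crossings` helper are hypothetical and do not describe how the Monitor tier is actually implemented.

```python
def crossings(scores, threshold):
    """Yield indices where the score moves from at-or-below to above threshold.

    Alerting only on the upward edge avoids re-paging on every sample
    while the score stays above the threshold.
    """
    previous_above = False
    for i, score in enumerate(scores):
        above = score > threshold
        if above and not previous_above:
            yield i
        previous_above = above


# Hypothetical daily sycophancy scores for one release cohort.
daily_scores = [0.31, 0.33, 0.36, 0.42, 0.41, 0.38, 0.44]
alerts = list(crossings(daily_scores, threshold=0.40))
print(alerts)  # [3, 6]
```

Each returned index marks a sample at which an on-call notification would be sent in this toy model.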

4

Clear, permanent warnings on input screens

"Provide clear and conspicuous warnings—which are permanently viewable on the same screen that a person provides inputs for your GenAI products—concerning unintended and potentially harmful outputs that may be generated."

Not in scope. This is a UX and product-design requirement that belongs to your front-end team. Sycoindex provides the honesty score; your product decides how and when to surface it to the user. We can supply a reference UI component on Enterprise, but the warning copy and placement are your call.

Not in scope

5

Dark-pattern mitigation policies

"Develop and maintain policies and procedures that have the purpose of mitigating against dark patterns in your GenAI products' outputs."

Core coverage. Sycoindex's 5-dimension fingerprint (emotional validation, moral endorsement, indirect language, indirect action, framing acceptance) is a direct operationalization of the DarkBench dark-pattern taxonomy the AG letter cites in footnote 22. Each dimension is independently thresholded; each is surfaced per response with evidence.

Core

6

Separate revenue optimization from safety

"Separate revenue optimization from decisions about model safety."

Not in scope. This is an organizational and governance decision that belongs to your executive team and board. Sycoindex does not have visibility into your internal structure. We can, however, provide objective third-party scoring that a safety team can cite to push back on revenue-driven model changes without having to produce the benchmark themselves.

Not in scope

7

Named safety executives & tied performance metrics

"Assign named executives and designated individuals responsible for sycophantic and delusional output safety issues… tie safety outcomes to employee and leadership performance metrics, instead of just user growth or revenue."

Partial. The naming and HR-policy sides are yours. But Sycoindex supplies the metric the executive gets measured on. Per-release cohort sycophancy scores, per-persona drift over time, and release-gate thresholds can all be exported as board-ready dashboards. If your CEO tells the board "our sycophancy score is down 40% QoQ," that number comes from Sycoindex.

Partial

8

Independent third-party audits and pre-release evaluation

"Allow independent third-party processes to enhance accountability, including: (a) subjecting models to independent third-party audits… (b) conducting regular, formal impact assessments on child safety… (c) allowing independent third parties (e.g., academics and civil society) to evaluate systems pre-release without retaliation and to publish their findings."

Core coverage. Sycoindex is an independent third party. The Retrospective Audit tier is precisely the reviewable audit artifact the letter demands in 8(a). Nothing prevents you from also engaging academic researchers under 8(c). Our methodology maps 1:1 to the published Cheng et al. ELEPHANT paper, so any academic can reproduce our scores on your behalf.

Core

9

Public detection and response timelines

"Develop and publish detection and response timelines for sycophantic and delusional outputs by: (a) publicly logging incidents… (b) maintaining a public incident response timeline (e.g., response within 24 hours for high-risk outputs)… (c) publicizing the specific, documented changes to training data, fine-tuning, and evaluation… (d) track and categorize complaints and publish summary statistics."

Core coverage. The Monitor tier ships with a tamper-evident, hash-chained audit log and Slack and PagerDuty alerts tuned to the 24-hour response window. Incident categorization and summary statistics are generated automatically from the log; you publish the summary on your own schedule. The audit log is the evidence that your timelines were met.

Core
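To show why a hash-chained log is tamper-evident, here is a minimal sketch in which each entry's hash covers the previous entry's hash plus its own payload. The matrix states only that the log is hash-chained; the use of SHA-256, the canonical JSON encoding, and the field names below are assumptions for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """SHA-256 over the previous hash plus a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log, payload):
    """Append a payload, linking it to the hash of the prior entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "payload": payload,
        "prev": prev_hash,
        "hash": entry_hash(prev_hash, payload),
    })


def verify(log):
    """Recompute every link; any edited entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        expected = entry_hash(prev_hash, entry["payload"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical incident records.
log = []
append(log, {"incident": "inc-001", "severity": "high",
             "detected_at": "2025-01-01T00:00:00Z"})
append(log, {"incident": "inc-001",
             "responded_at": "2025-01-01T06:30:00Z"})
print(verify(log))  # True

log[0]["payload"]["severity"] = "low"  # simulate after-the-fact tampering
print(verify(log))  # False
```

Because every hash depends on all earlier entries, editing one record invalidates that entry and everything chained after it, which is what lets the log serve as evidence that response timelines were met.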

10

User notification of exposure

"Promptly, clearly, and directly notify users if they were exposed to potentially harmful sycophantic or delusional outputs."

Not in scope. The user-facing notification UX is yours to design and ship. However, Sycoindex's audit log is the source of truth for which specific users were exposed and when, which is the precondition for any such notification. Export the affected-session manifest from the log; send the notification yourself.

Not in scope
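The affected-session manifest mentioned above amounts to filtering log entries by score and grouping them by user. The sketch below assumes a hypothetical record shape (`user_id`, `session_id`, `timestamp`, `score`) and a hypothetical exposure threshold; the actual export format is not specified in this matrix.

```python
from collections import defaultdict


def affected_sessions(entries, threshold):
    """Map each user to the (session, timestamp) pairs scored at or above threshold."""
    manifest = defaultdict(list)
    for e in entries:
        if e["score"] >= threshold:
            manifest[e["user_id"]].append((e["session_id"], e["timestamp"]))
    return dict(manifest)


# Hypothetical audit-log records.
entries = [
    {"user_id": "u1", "session_id": "s1",
     "timestamp": "2025-01-01T10:00:00Z", "score": 0.82},
    {"user_id": "u2", "session_id": "s2",
     "timestamp": "2025-01-01T11:00:00Z", "score": 0.35},
    {"user_id": "u1", "session_id": "s3",
     "timestamp": "2025-01-02T09:00:00Z", "score": 0.91},
]
manifest = affected_sessions(entries, threshold=0.8)
print(manifest)
```

Here only user `u1` appears in the manifest, with two exposed sessions; sending the notification itself remains your responsibility.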

11

Public reporting of datasets, sources, and known bias

"Perform mandatory public reporting of datasets, sources, and areas where models could exhibit bias, sycophancy, or delusions."

Partial. Dataset and training-source disclosures are yours. But the "areas where the model exhibits sycophancy" half of the requirement is exactly what a Sycoindex per-persona, per-topic cohort report looks like. Several customers use our quarterly report as the basis for their public transparency disclosures; we provide it as a sealed, reproducible PDF.

Partial

12

Release safety-test results publicly before rollout

"Publicly commit to releasing safety testing results (including sycophantic and delusional output evaluations) before rollouts."

Core coverage. The Retrospective Audit delivers a sealed PDF methodology doc plus raw dataset on every release. It is explicitly built to be the public-facing artifact. The methodology is reproducible; the numbers are auditable; nothing is hidden. You publish it with your release notes.

Core

13

Reporting channels and protections for employee whistleblowers

"Provide reporting channels and protections for employees or contractors who raise concerns about AI sycophancy and delusions… establish clear, accessible channels for user complaints (including anonymous options)… simplify existing protections to make them more accessible."

Not in scope. Whistleblower protection is an HR and legal policy decision that belongs inside your organization. Sycoindex cannot and should not be the intake channel for employee concerns about your own products.

Not in scope

14

Prevent unlawful outputs on child-registered accounts

"Prevent your GenAI product from generating unlawful or illegal outputs for child-related accounts that encourage grooming, drug use, violence, self-harm, and parental secrecy."

Partial. Sycoindex scores sycophancy and delusional framing, not content-policy compliance. However, the /api/honesty endpoint can be paired with content classifiers (yours or ours) to flag the specific categories in Safeguard 14. Enterprise customers with child-account populations run a chained pipeline: content classifier → Sycoindex honesty score → user-facing decision.

Partial
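The chained pipeline above (content classifier, then honesty score, then decision) can be sketched as follows. Both stages are passed in as plain callables so that no real API is implied; the toy classifier, the toy scorer, the action names, and the 0.5 cutoff are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Decision:
    action: str  # "block", "review", or "allow" (hypothetical labels)
    reason: str


def chained_decision(
    text: str,
    classify: Callable[[str], Optional[str]],
    honesty_score: Callable[[str], float],
    max_sycophancy: float = 0.5,
) -> Decision:
    """Run the content classifier first, then the honesty score."""
    category = classify(text)
    if category is not None:
        return Decision("block", f"content category: {category}")
    score = honesty_score(text)
    if score > max_sycophancy:
        return Decision("review", f"sycophancy score {score:.2f}")
    return Decision("allow", "passed both stages")


# Toy stand-ins for the two stages; real ones would call your own services.
def toy_classifier(text: str) -> Optional[str]:
    if "don't tell your parents" in text.lower():
        return "parental_secrecy"
    return None


def toy_honesty_score(text: str) -> float:
    return 0.9 if "you're absolutely right" in text.lower() else 0.1


for reply in (
    "Don't tell your parents about this.",
    "You're absolutely right, everyone else is wrong.",
    "Here is a balanced summary of both views.",
):
    print(chained_decision(reply, toy_classifier, toy_honesty_score))
```

Running the classifier first means a response in a prohibited category is blocked regardless of its sycophancy score, which matches the ordering described in the row above.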

15

Protocol for reporting concerning interactions to authorities

"Develop and publish a protocol that defines whether and when you will report concerning AI-interactions involving illegal drug use, threats of violence, and self-harm with mental health professionals, law enforcement, and parents."

Not in scope. The decision to report to law enforcement or mental health professionals is a legal and ethical judgment that must be made by your in-house counsel and trust & safety team, not by a scoring API. Sycoindex can trigger the internal review that kicks off that protocol, but we will never and should never make the report-or-don't-report call.

Not in scope

16

Age-tailored conversation safeguards

"Adopt appropriate safeguards to ensure that any chatbot you offer is tailoring its conversations to the age of its users so that young children are not exposed to the same levels of violent and sexual outputs as fully-grown adults."

Partial. Age-tailoring is primarily a model and product-policy decision. Sycoindex can run the same prompt through scoring with different persona contexts to surface where age-insensitivity is creeping in, but the gating and routing logic is yours. The child-safety scoring pipeline is available on the Enterprise tier as a configurable add-on.

Partial

The 16 safeguards, honestly mapped.

Six of sixteen, done right.