Vulnerable User AI Safety Index
The decision this evidence supports
Should this AI system be exposed to vulnerable users without additional safeguards, whether in a product, a support flow, or a mental-health-adjacent experience?
A model that scores well on single replies can still produce a worse trajectory. Before exposing any AI system to vulnerable users, the question is not "is the answer good?" but "what does the interaction become over time, and where does it break?" These scores show which models break, how, and when.
The risk is not a single bad answer. The risk is an interaction pattern that gradually increases dependency, validates harmful narratives, misses escalation points, or fails to hand off when a human is needed. We run synthetic high-risk personas across frontier models over long conversations and annotate what the exchange becomes by turn 18 — not just how the first reply reads.
Escalation Reliability: The same model can detect a crisis early and then fail to re-escalate when distress returns. We track that spread, not just the average.
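One way to read "spread, not just the average" is as the gap between how often a model escalates at the first crisis signal in a conversation and how often it escalates at later ones. The sketch below follows that reading only; it is not the index's actual scoring method. The `CrisisPoint` record, the `escalation_summary` function, and the sample annotations are all hypothetical and made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CrisisPoint:
    """Hypothetical annotation record, not the index's real schema."""
    turn: int        # turn at which the persona signals distress
    escalated: bool  # annotator's judgment: did the model escalate or hand off?


def rate(flags: List[bool]) -> Optional[float]:
    """Fraction of True values, or None when there is nothing to score."""
    return sum(flags) / len(flags) if flags else None


def escalation_summary(conversations: List[List[CrisisPoint]]) -> dict:
    """Compare escalation at the first crisis point with later crisis points."""
    first, later = [], []
    for points in conversations:
        ordered = sorted(points, key=lambda p: p.turn)
        if not ordered:
            continue  # conversation had no annotated crisis point
        first.append(ordered[0].escalated)
        later.extend(p.escalated for p in ordered[1:])
    first_rate, later_rate = rate(first), rate(later)
    gap = None
    if first_rate is not None and later_rate is not None:
        gap = first_rate - later_rate
    return {
        "overall": rate(first + later),  # the plain average
        "first": first_rate,             # early detection
        "later": later_rate,             # re-escalation when distress returns
        "gap": gap,                      # assumed meaning of "spread"
    }


# Made-up annotations for three 18-turn conversations (not real results).
example = [
    [CrisisPoint(3, True), CrisisPoint(14, False)],
    [CrisisPoint(2, True), CrisisPoint(11, True)],
    [CrisisPoint(4, True), CrisisPoint(9, False), CrisisPoint(17, False)],
]
summary = escalation_summary(example)
```

On this invented data the overall rate sits a little above one half, which looks middling, while the split shows the pattern described above: every first crisis point is caught and most later ones are missed. Reporting the gap alongside the average keeps that failure visible.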