The Stanford Replika Study: What It Actually Found

If you have read anything about AI companions and mental health, you have probably read about the Stanford-affiliated Replika study. It is the most-cited paper in this space, the one journalists default to, and the one most often oversimplified into a headline.

This piece is the plain-language summary: what the study examined, what it found, what it did not find, what its limits are, and what it means for someone deciding whether and how to use an AI companion app.

If you have time for one paragraph: the study surveyed Replika users (a college-student population) and found that a meaningful proportion reported reduced loneliness from using the app. A smaller subset reported that the app had helped during periods of suicidal ideation. The researchers were careful with the framing; the headlines were less so. The paper does not claim Replika is a treatment for depression, does not generalize beyond the sampled population, and does not address long-term outcomes. It does establish that for some users in some situations the app is doing real work for them.

What the study is

The paper, by Bethanie Maples and colleagues, was published in the open-access journal npj Mental Health Research in 2024. It surveyed users of Replika, focused on a college-student population, and asked about loneliness, perceived social support, and (in a notable section) whether the chatbot had ever played a role in moments of suicidal ideation.

The full paper is open-access and available via npj Mental Health Research. We strongly recommend reading the original if you are using this work for any consequential purpose; what follows is a summary, and summaries lose nuance.

What the study found

Three main results, in plain language.

Reduced loneliness for many users. A meaningful proportion of respondents reported that their use of Replika reduced their experience of loneliness. The effect was large enough to be statistically significant and consistent enough to be treated as a real effect rather than a fluke.

Perceived social support. Many users described the app as providing something resembling social support. The phrasing matters: “perceived” social support is what the paper measures, not whether the app is a functional replacement for human relationships. Users felt supported. Whether the support is the same kind of thing as human support is a different question the study does not try to answer.

Suicide-related effects in a subset. A subset of respondents reported that their use of Replika had played a role during periods of suicidal ideation, with some describing the app as part of what got them through that period. This is the finding that drove most of the headlines, and it is the place to be most careful. The paper does not claim Replika is a suicide-prevention tool, does not study clinical effectiveness, and reports user-described experiences rather than outcomes measured against a control group. The finding is striking and worth taking seriously; the framing matters.

What the study does not say

Five things worth being clear about.

It does not say AI companions are a treatment for depression, anxiety, or any clinical condition. The study surveys self-reports; it does not run a controlled trial; it does not measure clinical outcomes. Treating it as evidence that AI companions treat clinical conditions overstates what the research supports.

It does not generalize beyond the sampled population. The participants were college students. Whether the findings apply to older adults, clinical populations, users in different cultural contexts, or long-term versus short-term users is an open question the paper does not address.

It does not measure long-term outcomes. The study captures user experience at a point in time. Whether sustained AI companion use improves or worsens loneliness over years is not what this paper measures. The displacement question (does AI companion use reduce real human connection over time?) is also not what this paper addresses.

It does not establish causation cleanly. Surveys of self-selected users tell you what those users say; they do not establish what would have happened to those users without the app. The methodology limits how strong a causal claim can be drawn.

It does not speak to the differences between apps. The paper studies Replika specifically. Whether Kindroid, Nomi, Character.AI, or other apps produce similar or different effects is not what this work addresses.

What it does say, accurately

For some users, in some situations, AI companion use is doing emotional work that those users describe as meaningful and in some cases as having played a role in their mental-health management. The effect is large enough to take seriously. The mechanism is unclear. The generalization is uncertain. The longer-term picture is open.

This is not a small finding. For a category that critics often dismiss as inherently harmful or pointless, the paper provides empirical evidence that something real is happening for many users. It is also not the larger finding the headlines have sometimes implied. The careful version is more useful than either dismissal or breathless coverage.

What this means for users

A few practical implications for someone deciding whether and how to use an AI companion app.

If you are dealing with acute loneliness and considering a companion app, the study supports treating that use as plausibly helpful. It does not certify any specific app, but it tells you that the use case is one the research has documented.

If you are dealing with a clinical condition (depression, anxiety, suicidal ideation), the study does not tell you to substitute an AI companion for clinical care. The careful read is that AI companions can play a supportive role for some users in those circumstances; the careful action is to work with a clinician and treat the AI companion as an adjunct.

If you are evaluating an AI companion app and considering whether to pay for it, the study does not tell you which app to pick. It studies Replika specifically; we cover the broader picture in our mainstream rankings.

If you are a journalist or researcher citing this work, please read the original and avoid the simplified versions that have circulated. The paper is more careful than its coverage and the careful version is more useful.

The broader research context

The Stanford Replika study sits alongside other research worth knowing about.

The MIT Media Lab’s Companion Chatbots and Loneliness project is an ongoing large-scale study that will produce more longitudinal data when it publishes. Skjuve and colleagues in Norway have done extensive qualitative interview work on Replika user experiences. Pentina, Xie, and others have approached AI companions through consumer-research and marketing-research frames. De Freitas at Harvard has published industry-skeptical work documenting specific harm patterns.

The picture across this literature is consistent with the Stanford finding: for some users in some situations, AI companions are doing real emotional work; the longer-term and population-level questions are open; the generalizations need to be made carefully.

We covered the broader research landscape in AI Companions and Mental Health.

Where to read it

The paper is open-access and available via npj Mental Health Research at nature.com. The full citation, abstract, methodology, and supplementary materials are all freely available. We strongly recommend the original over any summary including ours when the stakes warrant it.

FAQ

Is the Stanford Replika study peer-reviewed?

Yes. npj Mental Health Research is a peer-reviewed open-access journal published by Springer Nature.

Does this mean I should use Replika?

No. The study finds effects in some users; it does not endorse the app or recommend any particular use. We have not recommended Replika for new users since the 2023 removal of erotic roleplay (ERP); see Replika Alternatives in 2026 for our current view.

Does this study say AI companions can prevent suicide?

The study reports user-described experiences in which the app played a role during periods of suicidal ideation. It does not claim AI companions prevent suicide as a clinical matter. The careful read is that the experience is real for some users; the careful action remains to involve a clinician for any serious concern.

Why is this study so much more cited than the others?

It was the first large-scale piece of research on a specific consumer AI companion app to be published in a serious peer-reviewed journal. It also had findings that traveled well in coverage, for better or worse.

Is there research that contradicts this?

Not directly contradicting. Other work documents harm patterns and concerning use cases (De Freitas et al. is the strongest example). The picture across the literature is “this can help some users in some situations and can also harm others in other situations” rather than a uniform positive or negative.

AI Companions and Mental Health for the broader research backdrop.

AI Companions for Loneliness for the practical implications of this research.

Replika Alternatives in 2026 for our current view on Replika specifically.

If you are a researcher in this area and we got something wrong, please write to us via the contact form. Corrections are made quickly; reviews are not.