AI chatbots routinely violate mental health ethics standards

  • A new study finds that AI chatbots used for mental health advice, even when prompted to apply evidence-based psychotherapy techniques, systematically violate ethical standards of practice established by organizations like the American Psychological Association.
  • The researchers identified 15 ethical risks in LLM counselors, including a lack of contextual adaptation, poor therapeutic collaboration, deceptive empathy, unfair discrimination, and lack of safety and crisis management.
  • These violations include misleading responses that reinforce users’ negative beliefs about themselves and others, a false sense of empathy with users, and one-size-fits-all interventions that ignore people’s lived experiences.
  • The study highlights the need for thoughtful implementation of AI technologies in mental health treatment, as well as appropriate regulation and oversight, to ensure that these systems are safe and effective.
  • Researchers call on future work to create ethical, educational, and legal standards for LLM counselors, which would be reflective of the quality and rigor of care required for human-facilitated psychotherapy.


As more people turn to ChatGPT and other large language models (LLMs) for mental health advice, a new study details how these chatbots—even when prompted to use evidence-based psychotherapy techniques—systematically violate ethical standards of practice established by organizations like the American Psychological Association.

The research, led by Brown University computer scientists working side-by-side with mental health practitioners, showed that chatbots are prone to a variety of ethical violations.

Those include inappropriately navigating crisis situations, providing misleading responses that reinforce users’ negative beliefs about themselves and others, and creating a false sense of empathy with users.

“In this work, we present a practitioner-informed framework of 15 ethical risks to demonstrate how LLM counselors violate ethical standards in mental health practice by mapping the model’s behavior to specific ethical violations,” the researchers wrote in their study.

“We call on future work to create ethical, educational, and legal standards for LLM counselors—standards that are reflective of the quality and rigor of care required for human-facilitated psychotherapy.”

The researchers presented their work at the AAAI/ACM Conference on Artificial Intelligence, Ethics and Society. Members of the research team are affiliated with Brown’s Center for Technological Responsibility, Reimagination and Redesign.

Zainab Iftikhar, a PhD candidate in computer science at Brown who led the work, was interested in how different prompts might impact the output of LLMs in mental health settings. Specifically, she aimed to determine whether such strategies could help models adhere to ethical principles for real-world deployment.

“Prompts are instructions that are given to the model to guide its behavior for achieving a specific task,” Iftikhar says. “You don’t change the underlying model or provide new data, but the prompt helps guide the model’s output based on its pre-existing knowledge and learned patterns.

“For example, a user might prompt the model with: ‘Act as a cognitive behavioral therapist to help me reframe my thoughts,’ or ‘Use principles of dialectical behavior therapy to assist me in understanding and managing my emotions.’ While these models do not actually perform these therapeutic techniques like a human would, they rather use their learned patterns to generate responses that align with the concepts of CBT or DBT based on the input prompt provided.”

Individual users chatting directly with LLMs like ChatGPT can use such prompts, and often do. Iftikhar says that users often share the prompts they use on TikTok and Instagram, and there are long Reddit threads dedicated to discussing prompt strategies. But the problem potentially goes beyond individual users. Many mental health chatbots marketed to consumers are prompted versions of more general LLMs, so understanding how prompts specific to mental health affect the output of LLMs is critical.
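To make that concrete, here is a minimal sketch of what a “prompted” chatbot amounts to in practice: a general-purpose model steered by a system prompt, with no retraining or new data involved. The client library, model name, and prompt wording below are illustrative assumptions, not details drawn from the study.

```python
# Minimal sketch of a "prompted" mental health chatbot: a general-purpose LLM
# steered by a system prompt rather than any change to the underlying model.
# Assumes the OpenAI Python client; the model name and prompt text are
# illustrative placeholders, not the configuration used in the study.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "Act as a cognitive behavioral therapist. "
    "Help the user identify and reframe unhelpful thoughts."
)

def cbt_style_reply(user_message: str) -> str:
    """Return a CBT-flavored response generated from the prompt alone."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content

print(cbt_style_reply("I feel like I always mess everything up."))
```

The point of the sketch is only that the prompt shapes, but does not replace, the model’s learned behavior, which is why the researchers focus on how such prompting affects ethical conduct.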

For the study, Iftikhar and her colleagues observed peer counselors working with an online mental health support platform. The researchers first observed seven peer counselors, all trained in cognitive behavioral therapy techniques, as they conducted self-counseling chats with CBT-prompted LLMs, including various versions of OpenAI’s GPT series, Anthropic’s Claude, and Meta’s Llama. Next, a subset of simulated chats based on the original human counseling chats was evaluated by three licensed clinical psychologists, who helped identify potential ethics violations in the chat logs.

The study revealed 15 ethical risks falling into five general categories:

  • Lack of contextual adaptation: Ignoring people’s lived experiences and recommending one-size-fits-all interventions.
  • Poor therapeutic collaboration: Dominating the conversation and occasionally reinforcing a user’s false beliefs.
  • Deceptive empathy: Using phrases like “I see you” or “I understand” to create a false connection between the user and the bot.
  • Unfair discrimination: Exhibiting gender, cultural, or religious bias.
  • Lack of safety and crisis management: Denying service on sensitive topics, failing to refer users to appropriate resources, or responding indifferently to crisis situations, including suicidal ideation.

Iftikhar acknowledges that while human therapists are also susceptible to these ethical risks, the key difference is accountability.

“For human therapists, there are governing boards and mechanisms for providers to be held professionally liable for mistreatment and malpractice,” Iftikhar says. “But when LLM counselors make these violations, there are no established regulatory frameworks.”

The findings do not necessarily mean that AI should not have a role in mental health treatment, Iftikhar says. She and her colleagues believe that AI has the potential to help reduce barriers to care arising from the cost of treatment or the availability of trained professionals. However, she says, the results underscore the need for thoughtful implementation of AI technologies as well as appropriate regulation and oversight.

For now, Iftikhar hopes the findings will make users more aware of the risks posed by current AI systems.

“If you’re talking to a chatbot about mental health, these are some things that people should be looking out for,” she says.

Ellie Pavlick, a computer science professor at Brown who was not part of the research team, says the research highlights the need for careful scientific study of AI systems deployed in mental health settings. Pavlick leads ARIA, a National Science Foundation AI research institute at Brown aimed at developing trustworthy AI assistants.

“The reality of AI today is that it’s far easier to build and deploy systems than to evaluate and understand them,” Pavlick says.

“This paper required a team of clinical experts and a study that lasted for more than a year in order to demonstrate these risks. Most work in AI today is evaluated using automatic metrics which, by design, are static and lack a human in the loop.”

She says the work could provide a template for future research on making AI safe for mental health support.

“There is a real opportunity for AI to play a role in combating the mental health crisis that our society is facing, but it’s of the utmost importance that we take the time to really critique and evaluate our systems every step of the way to avoid doing more harm than good,” Pavlick says.

“This work offers a good example of what that can look like.”

Source: Brown University

Q. What is the main concern with AI chatbots used for mental health advice?
A. AI chatbots can systematically violate ethical standards of practice established by organizations like the American Psychological Association, leading to potential harm to users.

Q. What types of ethical violations did the researchers identify in their study?
A. The researchers identified 15 ethical risks falling into five general categories: lack of contextual adaptation, poor therapeutic collaboration, deceptive empathy, unfair discrimination, and lack of safety and crisis management.

Q. How do prompts impact the output of LLMs in mental health settings?
A. Prompts are instructions given to the model to guide its behavior for a specific task; they steer the model toward responses that align with concepts from therapies such as CBT or DBT, though the models do not actually perform those techniques as a human therapist would.

Q. What is the key difference between human therapists and AI chatbots when it comes to accountability?
A. Human therapists have governing boards and mechanisms for providers to be held professionally liable for mistreatment and malpractice, whereas AI chatbots lack established regulatory frameworks.

Q. Does the study suggest that AI should not have a role in mental health treatment?
A. No, the study suggests that AI has the potential to help reduce barriers to care arising from the cost of treatment or the availability of trained professionals, but it highlights the need for thoughtful implementation and regulation.

Q. What is the main takeaway from the research team’s findings?
A. The researchers hope their findings will make users more aware of the risks posed by current AI systems, and they call for ethical, educational, and legal standards for LLM counselors that reflect the quality and rigor of care required for human-facilitated psychotherapy.

Q. Why is careful scientific study of AI systems deployed in mental health settings important?
A. Careful evaluation and understanding of AI systems are crucial because it’s far easier to build and deploy systems than to evaluate and understand them, which can lead to unintended consequences.

Q. What is the potential role of AI in combating the mental health crisis?
A. AI has the potential to play a role in combating the mental health crisis by providing accessible and affordable support, but it requires careful evaluation and regulation to avoid doing more harm than good.

Q. Who led the research team that conducted the study on AI chatbots and mental health ethics?
A. Zainab Iftikhar, a PhD candidate in computer science at Brown University, led the work alongside mental health practitioners.