#42: Impact of Gen-AI tool on therapy outcomes
plus two other interesting studies you should know about

Hi friends,
We’ve had a lot of new people join the community lately. Welcome! It’s great to have you all here.
My North Star is to improve population mental health.
I try to do that by supporting people like you (yes, you!) to run commercially and clinically successful organisations.
This means I spend my time analysing the mental health market, understanding the trends that are shaping it and then sharing all of that juicy info with you.
A lot of the time I focus on the commercial and product side of the mental health world (like last week’s earnings breakdown).
But I also share relevant research with you. It’s hard to stay on top of the latest findings and to understand what’s relevant for your business. So I try and do it for you :)
That’s what this article is all about.
I share some of the most interesting research papers I’ve read lately and discuss the “so-what” for people building in the mental health space.
We discuss:
A new, real-world study on the impact of a Gen-AI tool on therapy outcomes
A great study analysing the mental health treatment gap across 21 countries, showing that only 7% of people get effective care
A study in Nature demonstrating that consumer devices can effectively assess and monitor cognitive health remotely
Let’s get into it.
THR Pro
If you want to join a community of mental health founders, executives and investors, consider becoming a THR Pro member.
As a THR Pro member, you’ll get access to additional insights and data each month, including my most recent deep-dives on what we can learn from the 2024 earnings season, whether AI will lead to the unbundling of therapy and The Eight Major Trends Shaping the Mental Health Tech Industry in 2024.
We also run monthly events to bring the community together and discuss the most interesting topics in this space. It’s fun and I’d love for you to be a part of it.
1. Improving therapy outcomes with Gen-AI
What is it?
This study examined the impact of Limbic Care, an AI-powered digital therapy assistant, on Cognitive Behavioral Therapy (CBT) outcomes in a real-world UK NHS group therapy setting.
Why is this study important?
We know that CBT is an effective treatment for depression and anxiety disorders. Getting people to engage with relevant material and exercises in between their therapy sessions is a key determinant of the success of this treatment. But it’s also a major barrier - a lot of people just don’t do it. This study wanted to see whether a generative AI-enabled therapy support tool would improve engagement, leading to better patient adherence and improved treatment success.
How did it work?
The study analyzed 244 patients in a multi-site, real-world, observational setting. It compared those who chose to use the AI tool with those using standard paper-based workbooks.
The AI-support tool used was Limbic Care’s mobile app, featuring a conversational chatbot to assist with the completion of materials assigned by the therapist.

Screenshots of Limbic Care used in the study
Key Findings?
61.5% of patients chose the AI-enabled tool
Patients using the AI assistant attended on average 2 more therapy sessions than the control, with a 23 percentage point lower drop-out rate and 15 percentage point lower missed session rate.
The intervention group had 21 percentage points higher reliable improvement and 21 percentage points higher reliable recovery rates compared to control
Greater engagement with the AI tool correlated with a greater number of completed therapy sessions and better outcomes, reinforcing that continued support between sessions improves adherence to therapy.
Completing more CBT exercises, relative to psychoeducational materials, predicted better treatment adherence and was associated with an improvement in treatment success.
When users were asked about their experience, they highlighted that the AI tool helped them think through their issues and gain better perspectives as well as in developing coping skills and implementing CBT strategies in their everyday lives.
So What?
People are open to using AI tools: the majority of patients chose the AI-enabled app. User sentiment is not a barrier to adoption.
Personalisation is key. Personalisation drives better outcomes, and gen AI is very well suited to delivering it. If you’re not thinking about personalisation, you need to be.
Therapy 2.0 is here. I believe we are heading to a world where therapists are supported by multiple AI agents. This is clear evidence that this structure can be effective in improving outcomes. I wrote about this in-depth a few weeks ago if you want to read more.
Retention in mobile apps is still a challenge. By week 3, 51% of initial users were still engaged with the app; by week 6, this was 19%. Getting people to stay engaged with an app is still really, really hard!
Evidence is gold. This evidence will be a huge help to Limbic and the broader ecosystem of businesses building AI to support therapy. Those companies who have robust evidence bases will be better able to convince payers and providers to adopt their product.
Costs might go up before they go down! One of the primary reasons this worked is that it increased adherence to therapy and the number of sessions patients attended. That increases costs for payers in the short term. Yes, the improvement in outcomes should lead to longer-term cost savings, but providers of these products will need to manage the expected short-term cost increase with payers. This will be easier under value-based care (VBC) agreements.
Link to the study: https://www.jmir.org/2025/1/e60435. Table 3 has some great qualitative feedback from users that I found very insightful - it’s worth a read.
2. Quantifying the treatment gap
What is it?
This study examined the proportion of mental and substance use disorders receiving guideline-consistent treatment in multiple countries. It aimed to understand the proportion of people receiving effective treatment and what affects that metric.
Why is this study important?
We need to increase the effectiveness of mental health treatments, but if these treatments are not reaching the people who need them, and if they are not completing those treatments, then we will still have a huge gap in outcomes.
This study looked at the proportion of people who actually receive effective treatment for mental disorders. It also analysed the determinants of this measure, giving us some insight into how we might develop solutions.
As the authors noted: “To our knowledge, this is the only systematic cross-national estimate of effective treatment for mental and substance use disorders based on primary, high-quality epidemiological data”.
How did it work?
The researchers conducted epidemiologic surveys with over 56,000 adults across 21 countries between 2001 and 2019. They assessed the prevalence and treatment of 9 DSM-IV anxiety, mood, and substance use disorders.
They also analysed 19 country-level variables, and 3 sets of individual-level variables to understand what was predictive of people getting the right level of treatment.
Key findings?
Only 6.9% received effective treatment. That’s a wild and deeply disturbing number.
The rates of effective treatment differed significantly across disorders, with 12% of those with Generalised Anxiety Disorder receiving effective treatment, but only 1.4% for those with Alcohol Use Disorder.
“Low perceived need” was the most significant barrier to effective treatment.
On an individual level:
Women were 50% more likely to receive effective treatment than men, despite men having more than twice the rates of substance use disorder prevalence and suicide death rate than women.
People with higher levels of education were more likely to receive adequate treatment.
Family income was not associated with increased rates of effective treatment.
On a country level:
Effective treatment was positively correlated with country-level resources and some elements of healthcare spending
Interestingly, country-level stigma and discrimination, in comparison, were not significant correlates of effective treatment
So what?
The main takeaway here is that we have a massive amount of work to do to increase the proportion of people who receive effective treatment for their mental disorders. This study gives a great quantitative analysis of the treatment gap and how it occurs at each stage of a patient’s journey. So what does this study teach us about what we must do to improve?
Focus on increasing perceived need. Across all disorders, only 46% of people with a disorder perceived that they needed treatment. Of those people, an average of 34% had contact with some level of care. These are two huge drop-offs that must be addressed. I did some research to try and understand what influences perceived need but didn’t find much. I’d be very keen to hear how people think about addressing this.
Continue to train primary care professionals in recognising and treating mental disorders. This supports anyone building in the integrative or collaborative care space.
Focus on underserved populations, specifically the gap in care delivered to men and to those with lower levels of education. I’ve been saying it for a while, but I think there’s a significant opportunity for businesses to deliver more population-focused strategies, specifically for men and elderly people. Most of the major online therapy brands are subtly feminine, and their usage data reflects that. This data shows that men need treatment at rates similar to, if not higher than, women’s, but don’t engage with it for various reasons. Businesses should think about the opportunity in addressing this problem.
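To make those drop-offs concrete, here’s a rough funnel sketch using the study’s headline figures. One caveat: the final “effective once in care” conversion is back-calculated from the 6.9% overall rate, not reported directly in the study, and this assumes the stages simply multiply.

```python
# Rough care-funnel sketch from the study's headline figures.
# Assumption: stages multiply independently; the last conversion is
# back-calculated from the 6.9% overall effective-treatment rate.

perceived_need = 0.46      # perceived they needed treatment
contact_given_need = 0.34  # of those, had contact with some level of care
effective_overall = 0.069  # received effective treatment overall

reach_care = perceived_need * contact_given_need
print(f"Reach some level of care: {reach_care:.1%}")   # ~15.6% of those with a disorder

effective_given_contact = effective_overall / reach_care
print(f"Effective once in care: {effective_given_contact:.1%}")  # ~44%
```

In other words, even among the minority who make it into care, under half end up receiving guideline-consistent treatment - the gap isn’t just access, it’s also quality of care once reached.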
Link to the study: https://jamanetwork.com/journals/jamapsychiatry/fullarticle/2829593
3. Remote assessment of cognitive health
What is it?
This study was a large, virtual, observational study of 12,004 US adults that aimed to detect mild cognitive impairment (MCI) based on interactive and passive remote data collected through iPhones and Apple Watches.
Why is this study important?
Early intervention and precision medicine hold a lot of potential in mental and brain health. But to implement these effectively we need accurate, scalable and affordable methods to actually monitor our brain health. This large-scale study, a collaboration between Apple and Biogen, set out to understand whether MCI could be detected remotely.
While MCI is primarily associated with brain health disorders such as Alzheimer’s Disease, it is also relevant to mental health. There is also a lot to be learned from their research methods about ensuring predictive models of health are not biased.
How did it work?
Over eighteen months the study enrolled a demographically diverse population with a focus on middle-to-late adulthood and a spectrum of risk for decline in cognition. This was done using “an adaptive, multichannel recruitment strategy based on the ‘Decentralised clinical trial’ framework”. In short, they used a wide range of channels (like email, referrals and app-store sponsorship) to reach a diverse population.
Then, they collected three main types of data from this population:
Core demographic data (age, sex, education level)
Subjective measures of mild cognitive impairment (CFI and E-Cog) - self-reported questionnaires asking users about their own cognitive concerns
Objective measures of mild cognitive impairment, using baseline CANTAB (Cambridge Neuropsychological Test Automated Battery) outcomes which users completed in the app.
The goal was to see if they could build a model that could predict mild cognitive impairment based on these remotely collected measures. Essentially: could the model pinpoint those in the group that had clinically diagnosed mild cognitive impairment?
Key findings?
The model performed quite well. The subjective and objective interactive cognitive assessments, along with core demographic variables, could predict whether a person has mild cognitive impairment, with an AUROC of 0.85 - i.e., the model had an 85% chance of ranking a randomly chosen positive instance higher than a randomly chosen negative one. That is substantially better than random guessing (which would give an AUROC of 0.5).
The study did a great job of recruiting a diverse and representative population (something a lot of studies fail to do). 31.5% of the total study population were from under-represented races or ethnicities, higher than what is typically seen in clinical research and trials. Two-thirds of participants enrolled via focused email campaigns targeting diverse demographics, as well as word-of-mouth, which was important for recruiting populations under-represented by race/ethnicity.
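If the pairwise-ranking interpretation of AUROC feels abstract, it can be computed directly. This is a generic sketch with made-up scores, not the study’s actual evaluation code:

```python
def auroc(pos_scores, neg_scores):
    """Probability a random positive outranks a random negative (ties count half)."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))

# Toy example: hypothetical model scores for participants with and without MCI.
mci_scores = [0.9, 0.8, 0.4]
control_scores = [0.7, 0.3, 0.2]
print(auroc(mci_scores, control_scores))  # 8 of 9 pairs ranked correctly: ~0.889
```

A perfect model scores 1.0, a coin flip 0.5; the study’s 0.85 sits comfortably in the “clinically useful screening” range.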
So what?
This study shows that remote multimodal and cognitive sampling with digital devices can screen and track cognitive health in a scalable manner. There is significant opportunity to apply this concept to broader mental and brain health, especially given the importance of early detection and prevention.
Removing bias in research is super important for the advancement of precision medicine. We all accept that the ‘big data’ required to train and test models must generalise to the broader population. This is particularly true for illnesses that disproportionately affect populations that are often not the primary participants of key studies. As businesses look to develop more models for prediction and treatment, they can learn a lot from how this study recruited a diverse user base.
Although they collected a huge amount of passive data (like standing data, sleep tracking, keyboard metrics, sound and speech detection, facial metrics and more), they noted that exploring passive data for cognitive impairment detection "remains to be determined and will be the focus of future work". Unfortunately, it looks like the study was ended early due to a change in Biogen’s program priorities, so I’m not sure if we will ever see this “future work” come to fruition.
That’s all for this week. As always, feel free to email me with any questions or comments.
Keep fighting the good fight!
Steve
Founder of The Hemingway Group
P.S. Connect with me on LinkedIn if you haven’t already
P.P.S. If you want to become a THR Pro member, you can learn more here.