Detecting depression using digital traces on social media
AMES, Iowa – Social media is a sounding board of sorts in today’s world. It’s a place where people go to share their thoughts, opinions and feelings – and to test the reactions of others, too. In return, they receive feedback, often in real-time, as well as support and validation.
For some individuals, sharing on-screen may also provide an easier entry point to communication than sharing in person, even if they aren’t fully cognizant of it.
“On social media, individuals often perceive a distinction between their online persona and real-world identity. Some people feel more at ease disclosing feelings like hopelessness or experiences like divorce or job loss, due to the perceived anonymity or distance afforded by social media,” says Zhang, assistant professor of information systems and business analytics at Iowa State University. “Those digital traces are what we want to extract.”
Zhang is part of a research team that has developed a deep learning model to detect mental health symptoms and risk factors on social media. Their published study focuses on depression.
Depression is one of the most common mental illnesses in the United States. According to the National Institute of Mental Health (NIMH), an estimated 21 million American adults – or 8.3% of all adults in the United States – experienced at least one major depressive episode in 2021.
“Depression and other mental health disorders are commonly underdiagnosed and consequently undertreated,” Zhang says. “Stigma may prevent a patient from answering honestly during a health screening in a doctor’s office, so with this research, our goal is to identify and offer another, complementary approach.”
Potential applications
Zhang emphasizes that the “deep knowledge-aware depression detection framework” has potential applications for individuals, as well as public health professionals, policymakers and researchers.
Social media companies could use the model to create an early warning system, suggesting when individuals should seek help and providing them with resources, while public health professionals and policymakers could look at population-level data to determine which locations or demographics need more mental health services.
As for researchers, Zhang says the model provides a unique opportunity to collect population-level data over time.
“For example, we could look at the past 10 years of X (formerly Twitter) data and associate it with different events – wars, pandemics, etc. There’s no way we could get that level of data from surveys,” Zhang says.
Zhang notes that other researchers have developed models to detect depression on social media; however, the “deep knowledge-aware depression detection framework” differs from existing work because it compares medical terminology for depression risks and symptoms with an individual’s social media posts over time.
“Previous studies have looked at posts with positive or negative sentiments, which we don't think is exact for depression detection because someone could complain about a bad movie, bad weather, etc. However, they are not indicators of someone with depression. So, I think that’s a huge difference between our model and other previous studies,” Zhang says.
Zhang and her fellow researchers taught their model how to detect depression symptoms and risks using more than 1.3 million archival Reddit posts and 2,500 WebMD entries. Zhang says the model can also use other social media platforms and datasets, and another study under review indicates a new version of the model can detect additional mental health disorders.
Ethical, privacy concerns
Zhang and her co-authors say using social media to detect symptoms and risk factors of chronic diseases could be a cost-efficient intervention since public posts provide a large, diverse and free dataset. However, they recognize that there are ethical and privacy-related concerns that need to be addressed.
“Addressing the potential for abuse and ensuring the responsible use of social media-based depression detection machine learning models involves a combination of ethical considerations, legal safeguards and technical measures,” Zhang says.
Zhang says social media platforms should prioritize informed consent when collecting data for health-related machine learning models, even if the data is anonymized. This includes communicating the purpose of data collection and how it will be used and obtaining explicit consent from users. She says social media companies should also ensure that data collection, storage and usage practices comply with privacy laws and regulations, including the General Data Protection Regulation and the Health Insurance Portability and Accountability Act.
“Policymakers can establish ethical oversight committees or review boards that include privacy, data ethics and mental health experts to guide the ethical implications of research and the development of machine models,” Zhang says. “Researchers also have a role in sharing the potential benefits and limitations of social media-based depression detection models with stakeholders and the greater public.”
Going forward, Zhang and her team want to expand their model to include other aspects of health, including diabetes, heart disease and asthma. They envision incorporating photos, video and audio from social media to capture more behavioral data. Frequent images of greasy or rich foods could flag risks for cardiovascular disease, for example, while visuals with high levels of air pollution could warn people with asthma.
Zhang says machine learning is not a replacement for traditional health care. Rather, it’s another approach to assist individuals and provide population-level data to help providers and policymakers.
Jiaheng Xie and Xiang Liu from the University of Delaware and Zhu Zhang from the University of Rhode Island contributed to the study.
– 30 –