For my senior thesis at Emory University, I used a combination of computational linguistics, bioinformatics, and machine learning methods to discover and report dangerous prescription drug side-effects from online health care forums.
Adverse drug side-effects cause many deaths and injuries each year and lead to substantial medical costs and waste. The tool was meant to augment the FDA's post-clinical-trial monitoring methods, which miss many side-effects, especially when doctors do not report them. For the thesis project, I developed a model to identify side-effects in health care forum text: forum posts were automatically parsed and likely side-effects were extracted. The model identified a handful of severe side-effects that were not listed in the FDA Adverse Event Reporting System (FAERS) database.
To complete this thesis, I applied NLP software and trained machine learning models on large data sets that I downloaded from health care forums. Beyond acquiring new technical skills, I learned about the research and research-writing process and how to manage large, long-term projects.