This work was mentored by Sasikiran Kandula and Jeffrey Shaman in the Department of Environmental Health Sciences at Columbia University.
My work involved:
- Analyzing large datasets (~68 million) in MySQL and R, and exploring machine learning methods to track real-time influenza incidence by combining diagnostic and virologic data from electronic medical records.
- Comparing and contrasting different machine learning methods (boosting, SVM, random forest, etc.) using cross-validation.
- Creating clear and compelling reports, visualizations, and interactive apps in R Markdown for collaborators.
Additional Resources
Final Report: This is my paper for the Fall 2017 course P9120 Topics in Statistical Learning & Data Mining, taught by Min Qian.
Slides: These are my slides for the Fall 2017 course P9120 Topics in Statistical Learning & Data Mining, taught by Min Qian.