Experience

Current

Medidata : Data Scientist

June 2020 – Present
  • Member of the Platform Data Sciences team working on the Detect product, a series of microservices written in R and run on AWS.
  • Built and maintain a Shiny app to help speed up manual testing of the Detect product.


Past

DataCamp : Data Scientist, Product

March 2019 – December 2019
  • Developed metrics to guide the product team and individual product squads.
  • Built and maintained a dashboard of product metrics, and owned the relevant product sections of other company dashboards.
  • Maintained and developed product-related data views in the company data lake, built with R or Redshift SQL and executed using CircleCI and Airflow.
  • Conducted ad-hoc analyses for squads and product research on user behavior.
  • Designed and analyzed product A/B tests with product managers and company-wide experimentation team.

DataCamp : Content Quality Analyst

May 2018 – March 2019
  • First onboarded member of the Content Quality team.
  • Maintained the production Shiny dashboard, used both internally and externally (instructor-facing), that tracks course quality metrics.
  • Worked with instructors as needed to perform quality-related maintenance on existing courses.

MedStar Health Institutes for Innovation : R Big Data Developer, Contract

April 2017 – July 2018
  • Built and debugged data analysis pipelines (ETL), instantiated them as APIs, tested them with Postman, and maintained them as other use cases became apparent.
  • Created research paper templates that allow large amounts of EMR data to be tested to address medical questions brought in by researchers.
  • Consulted on deep learning projects and data pipelines for other MedStar projects as needed.

HERE Technologies : Data Scientist, Quality Testing and Statistics Team

June 2017 – May 2018
  • Main data scientist representative of the Quality Testing and Statistics team. Served as an internal consultant for other teams in the company, advising on experimental design and training data collection for planned machine learning models, and on best practices for auditing and evaluating model quality.
  • Built R&D prediction models to test the quality of existing production models inside the company and provided insights into how to optimize the real-world testing quality process.
  • Synthesized large datasets requiring many joins, extensively cleaned messy data collected in the field, drafted production-level prediction models, and created quality scores.
  • Wrote the spec for data synthesis and reports on analyses, with the goal of educating non-data-science audiences in the company on how to interpret and use the results of quality tests.

American College of Surgeons : Statistician, Continuous Quality Improvement

June 2016 – May 2017
  • Conducted risk-adjusted hierarchical modeling with coded hospital data to carry out surgical program quality assessment for MBSAQIP (bariatric surgery) and TransQIP (transplant surgery) data.
  • Cleaned datasets for public release for research, including removing all protected health information, checking logic, and flagging errors.
  • Advised on best practices for statistical design and analysis of studies, including playing an integral role in the launch of a nationwide Enhanced Recovery in Bariatric Surgery project, with expertise in survey design, survey scoring, and data collection methods.
  • Consulted on UX design of front-end data collection tools.

Louisiana Tumor Registry : Biostatistician

January 2015 – May 2016
  • Designed and conducted statistical analyses and data modeling for research projects on cancer diagnosis and treatment trends, using National Cancer Database (NCDB) data from the American College of Surgeons (ACS), focused on breast, pancreatic, and colorectal cancers (presently, with more projects to come).
  • Responsible for extensive research and rewrites of the cancer Facts & Figures program booklets for distribution to a more general audience, running cancer statistics as necessary in SEER*Stat with Surveillance, Epidemiology, and End Results (SEER) data.
  • Published 2 papers using univariate and multivariable survival analysis methods plus exploratory data analysis methods.


Education

Louisiana State University Health Sciences Center

Master of Science in Biostatistics

August 2014 – May 2016
Advisor: Dr. Qingzhao Yu
Thesis title: General multiple mediation analysis to examine ethnic differences in anxiety and depression in cancer survivors using the MY-Health survey. A version of it was published in Psychometrika in April 2017.

Eastern Michigan University

Post-Baccalaureate education in Math & Biology

September 2012 – June 2014

University of Michigan

Bachelor of Arts, Screen Arts and Cultures & Women’s Studies

September 2007 – December 2011


Publications