Resume
I’m a Data Analyst III at GitHub in San Francisco, California 🌉
Key Skills
Led data modeling efforts to create more efficient and approachable ETL data tables for stakeholders using Airflow.
Completed PhD degree in quantitative discipline with focus on collecting, analyzing, and interpreting data through machine learning models. Over 12 years programming in Python, R, and SQL.
Analyzed large datasets (linear models, multivariate models, random forest, and time series analysis) and coded full-stack data science experiments in Python (pandas, NumPy, scikit-learn) and R.
Communicated with 6 cross-functional business and government teams and led efforts to define new data collection and analytical processes. Increased data repeatability and metric scalability by 5% each month.
Worked with business and industry leaders to establish business needs. Translated questions valued at US$90 billion into data science experiments. Identified practices that improve yields by 10%.
Wrote 40 technical documents including scientific papers, government white papers, and executive summaries of research from 20 statistically rigorous machine learning experiments.
Professional Experience
Data Analyst III
August 2022 - Present
GitHub
- Created an Architecture Decision Record (ADR) and proposed new data modeling to enable novel insights and help the company drive toward SLOs.
- Achieved stakeholder consensus on the ADR, and am currently building an ETL data pipeline using Airflow to execute on it.
- Co-authored an analytics report on two-factor authentication to create a more seamless sign-up and onboarding experience for our users. Used SQL and Python to query data, create figures, and produce statistics.
Postdoctoral Data Scientist
July 2020 - August 2022
Lawrence Berkeley National Lab
- Led 6 agile and cross-functional teams of senior data scientists, ML engineers, and leadership at the federal government’s National Labs to design 11 novel data collection and analytic pipelines.
- Served as liaison between cross-functional data science teams and a community of 300 senior scientists and engineers across the National Lab network.
- Supervised employees who built and deployed data visualization dashboards for 10 machine learning algorithms.
Adjunct Professor of Statistics
September 2019 - 2020
University of San Francisco
- Taught a technical statistics course to 30 undergraduate students each semester.
- Developed all course material including presentations, coding exercises, exams, and homework assignments.
Postdoctoral Data Scientist
August 2019 - June 2020
University of Maryland and The Organic Center
- Established end-to-end data science experiment impacting the US$90 billion agriculture sector and improved yields by 10%.
- Defined new data collection opportunities to create a novel database, implemented a multivariate machine learning regression model in R, and published a widely cited technical document.
- Spoke to over 220 business leaders in an invited panel discussion about the importance of carbon sequestration research and how our recommendations can impact future agricultural practices.
Data Science Intern
June 2018 - August 2018
DataONE
- Planned and oversaw research team adopting GitHub as primary platform for collaboration on software.
- Wrote 2 technical documents and created 3 data visualizations to share recommendation about data curation and sharing practices.
- Prioritized tasks during fast-paced summer internship and analyzed data sharing practices in top research publications.
PhD Student
September 2014 - June 2019
Rutgers University
- Built maximum entropy ML/AI classification algorithm and leveraged a large publicly available dataset to predict with 95% accuracy a US biosecurity threat using geospatial models.
- Applied machine learning multivariate imputation algorithm to clean and quality-check time series data, making inferences 10% more accurate over time.
Program Manager
2011 - 2014
Providence After School Alliance
- Created data dashboards for executive leadership to monitor program’s KPIs including budget and impact.
- Supervised team of 30 educators through a leadership style that combined setting clear expectations for high performance with patience for more junior employees.
- Trained Providence Public School District educators through 6 sessions to improve their pedagogical skills.
Education
Rutgers University
PhD in Ecology and Evolution
Advisor: Julie Lockwood
2019
Boston University
B.A. in Biology
2008