Data Scientist

Top Skills: Python, SQL, HDFS, AWS, GCP, BigQuery, Machine Learning, Data Wrangling, Data Storytelling, Product Management

Experience

Data Architect @ Apple

May 2024 – Present

  • Collaborate with cross-functional teams, including software engineers, product managers, and data scientists, to define and develop specifications for data analytics pipelines supporting a rapidly expanding portfolio of hundreds of data products, enabling the measurement of user engagement for new features across Apple Music, Apple TV, App Store, Apple Fitness+, and Podcasts apps.
  • Analyze and validate data pipelines for app UI feature use cases, utilizing Splunk and Hive to query and process datasets containing billions of records on HDFS.
  • Streamline the delivery of high-quality data products across five business lines using Quip, Slack, GitHub, and Jira, keeping teams aligned and projects on schedule.

Data Scientist @ Meta

October 2023 – May 2024

  • Developed over 10 interactive dashboards using Unidash (Looker) and Daiquery (SQL) to write, visualize, and analyze quality benchmarking metrics for a state-of-the-art 3D graphics pipeline at the intersection of VR and AI computer vision. Link to video: Codec Avatars
  • Authored, tested, scheduled, and maintained dozens of data pipelines to ingest, transform, and load intermediate tables with billions of rows of data for dashboard analytics using Python, Jupyter Notebooks, Hive, Presto SQL, Spark, and Dataswarm (Airflow).
  • Collaborated with a team of software engineers, AI researchers, and data engineers to architect and maintain analytics pipelines and dashboards informing 50 monthly active users of real-time trends in the codec avatar pipeline.

Data Analytics Instructor @ edX

October 2022 – October 2023

  • Taught a class of 17 students across Excel, Python, SQL, AWS, GCP, BigQuery, Databricks, Spark, Tableau, statistics, and machine learning, consistently achieving a weekly satisfaction rate of 90%.
  • Served as a mentor to four teams, fostering their analytical thinking skills and guiding the technical implementation of their exploratory data analysis reports.
  • Maintained a 1.5 GB Python repository on GitHub and GitLab, saving approximately 5 hours of code maintenance time per week.

Data Scientist @ T. Rowe Price

July 2020 – October 2022

  • Engineered a machine learning pipeline using Python and SQL on Snowflake, enhancing time-series forecast accuracy for a large-scale contact center with 1,000+ employees and nearly 3 million annual calls. The newly implemented model achieved a significant 50% reduction in the mean absolute percentage error (MAPE) of long-term forecasting, translating to an annual saving of over $1.6M for workforce management.
  • Led the development of data analysis reports, Python notebooks, and Tableau and Power BI dashboards connected to Snowflake, providing critical analytical insight to an organization of over 200 managers and executives. This effort formed the backbone of the organization’s data-driven approach, driving 80% of all analytical insight and strategic decision-making.
  • Mentored and managed a new data analyst through 5+ key projects using Microsoft Azure, SQL, Python, and Excel, increasing team efficiency by 20% and improving the analyst’s data analysis proficiency by 30% within six months.

Data Science Instructor @ Thinkful

December 2017 – April 2020

  • Directed numerous recruiting workshops, achieving a 33% sign-up conversion rate, and conducted over 100 data science code reviews, giving detailed feedback to students from prominent companies like HBO and Yelp.
  • Mentored a student on feature engineering, using real estate data to forecast political preferences from physical attributes such as building materials, square footage, and garage count.
  • Assessed Data Science Bootcamp students during mock interviews, probing knowledge in Python, SQL, data analysis, experimental design, supervised and unsupervised machine learning, along with behavioral competency.

Machine Learning Instructor @ Google

April 2019 – August 2019

  • Partnered with an interdisciplinary group of Google Engineers, Managers, and University Professors to develop and deliver rich educational content covering Statistics, Neural Networks, and Computer Vision. This pivotal role included detailed planning and lecture development for Google’s 10-week Applied Machine Learning Intensive Summer 2019 cohort, contributing to a 30% increase in student comprehension and hands-on proficiency.
  • Mentored and conducted code reviews for four teams that successfully deployed deep learning applications. One standout project used PyTorch to train and deploy an image classification model that distinguishes recyclables from non-recyclables, with code review feedback helping improve model precision by 25%.

Education

M.S. Data Science - Southern Methodist University

August 2017

B.S. Engineering - University of Massachusetts, Amherst

May 2009