
Projects

In addition to my professional work as a data scientist, I like to take on side projects when I find the time. Below are links to a few data science projects I have done using publicly available datasets.

tSNE_animation_final.gif

The goal of this work was to build a tool that helps game developers understand the segmentation of the PC gaming market and predict which users are most likely to purchase the games they are currently developing. Used early in development, the tool could help developers tweak the characteristics of a game either to optimize engagement among their target users or to maximize revenue through increased sales.

Click here for a full write-up!

With increasing vehicle congestion on NYC streets, many ride services (Uber, Lyft, etc.) are turning to the "ride-sharing" model as a way to reduce the number of vehicles on the road while still serving a large number of people with minimal inconvenience. However, it remains unclear what proportion of rides on any given day are viable candidates for sharing with other users. In this project, I use publicly available NYC taxi data and a customized graph-theoretical analysis of the NYC street network to answer this question. I found that certain routes are indeed more viable ride-sharing candidates, such as trips to and from JFK or LaGuardia airports, while short trips within neighborhoods typically require more route detours for customers.
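
To make the idea of a "route detour" concrete, here is a minimal sketch of how the extra distance caused by sharing could be measured on a street graph. It is not the analysis from the write-up: the toy network `TOY_STREETS`, the helper names `shortest_distance` and `shared_ride_detour`, and the rule of scoring a pairing by its worst-off rider are all assumptions made purely for illustration.

```python
import heapq

# Hypothetical toy street network: node -> {neighbor: distance}.
TOY_STREETS = {
    "A": {"B": 1},
    "B": {"A": 1, "C": 1, "E": 2},
    "C": {"B": 1, "D": 1},
    "D": {"C": 1},
    "E": {"B": 2},
}


def shortest_distance(graph, source, target):
    """Shortest-path length from source to target using Dijkstra's algorithm."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return float("inf")  # target is unreachable


def shared_ride_detour(graph, trip_a, trip_b):
    """Smallest worst-case detour when rider A's vehicle also picks up rider B.

    Each trip is an (origin, destination) pair. The vehicle starts at A's
    origin, picks up B, and then drops the riders off in whichever order
    gives the smaller maximum detour relative to each rider's solo trip.
    """
    (origin_a, dest_a), (origin_b, dest_b) = trip_a, trip_b

    def d(u, v):
        return shortest_distance(graph, u, v)

    solo_a = d(origin_a, dest_a)
    solo_b = d(origin_b, dest_b)
    pickup = d(origin_a, origin_b)

    # Option 1: drop rider A first, then continue to B's destination.
    leg = d(origin_b, dest_a)
    drop_a_first = max(pickup + leg - solo_a,
                       leg + d(dest_a, dest_b) - solo_b)

    # Option 2: drop rider B first (B rides directly, so B's detour is zero).
    drop_b_first = pickup + solo_b + d(dest_b, dest_a) - solo_a

    return min(drop_a_first, drop_b_first)
```

In this toy network, a trip from `B` to `C` lies entirely along the route from `A` to `D`, so sharing adds no distance for either rider, whereas a trip from `B` to `E` branches off that route and forces a detour on one of the two riders. A threshold on this detour is one possible way to label a pair of trips as shareable or not.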

Click here for a full write-up!

NYC_street_1.png

Many real-world data science problems boil down to grouping a number of noisy observations into clusters based on their similarity.

In this notebook, I walk through the math and code of an approach that I have found to be quite effective for solving this problem.

noisy_dist.png