Calculates the median connection degree over a 60-second sliding window in the large, evolving social graph produced by person-to-person payments on Venmo. C++, Boost, O(log n) algorithms, highly performant, big data on one small machine.
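A minimal sketch of how a rolling median degree with O(log n) updates might look, not the project's actual code. It uses only the standard library rather than Boost, and every name in it (MedianDegree, add_payment, the two-multiset layout) is hypothetical. It assumes timestamps are non-negative seconds, that payments may arrive out of order, and that an edge expires once it is 60 or more seconds older than the newest payment seen.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <unordered_map>
#include <utility>
#include <map>

// Hypothetical illustration: rolling median vertex degree over a time window.
class MedianDegree {
public:
    explicit MedianDegree(long window_seconds = 60) : window_(window_seconds) {}

    // Records a payment between a and b at time ts and returns the current median degree.
    double add_payment(const std::string& a, const std::string& b, long ts) {
        if (a == b) return median();              // assumption: self-payments add no edge
        latest_ = std::max(latest_, ts);
        if (ts > latest_ - window_) {             // ignore payments already outside the window
            Edge key = std::minmax(a, b);         // undirected edge: order the endpoints
            auto it = edge_time_.find(key);
            if (it == edge_time_.end()) {
                edge_time_[key] = ts;
                by_time_.insert({ts, key});
                change_degree(a, +1);
                change_degree(b, +1);
            } else if (ts > it->second) {         // repeat payment: refresh the timestamp only
                by_time_.erase({it->second, key});
                it->second = ts;
                by_time_.insert({ts, key});
            }
        }
        // Evict edges that have fallen out of the window, oldest first.
        while (!by_time_.empty() && by_time_.begin()->first <= latest_ - window_) {
            Edge e = by_time_.begin()->second;
            by_time_.erase(by_time_.begin());
            edge_time_.erase(e);
            change_degree(e.first, -1);
            change_degree(e.second, -1);
        }
        return median();
    }

    double median() const {
        if (low_.empty()) return 0.0;
        if (low_.size() > high_.size()) return *low_.rbegin();
        return (*low_.rbegin() + *high_.begin()) / 2.0;
    }

private:
    using Edge = std::pair<std::string, std::string>;

    // Moves one vertex's degree by delta, keeping the median structure in sync.
    void change_degree(const std::string& node, int delta) {
        int old_deg = degree_.count(node) ? degree_[node] : 0;
        int new_deg = old_deg + delta;
        if (old_deg > 0) erase_value(old_deg);
        if (new_deg > 0) { insert_value(new_deg); degree_[node] = new_deg; }
        else degree_.erase(node);
    }
    void insert_value(int v) {
        if (low_.empty() || v <= *low_.rbegin()) low_.insert(v); else high_.insert(v);
        rebalance();
    }
    void erase_value(int v) {
        if (!low_.empty() && v <= *low_.rbegin()) low_.erase(low_.find(v));
        else high_.erase(high_.find(v));
        rebalance();
    }
    // Invariant: low_ holds the smaller half and has the same size as high_ or one more.
    void rebalance() {
        if (low_.size() > high_.size() + 1) {
            high_.insert(*low_.rbegin());
            low_.erase(std::prev(low_.end()));
        } else if (high_.size() > low_.size()) {
            low_.insert(*high_.begin());
            high_.erase(high_.begin());
        }
    }

    long window_;
    long latest_ = 0;
    std::map<Edge, long> edge_time_;               // edge -> most recent payment time
    std::set<std::pair<long, Edge>> by_time_;      // edges ordered by time, for eviction
    std::unordered_map<std::string, int> degree_;  // vertex -> current degree
    std::multiset<int> low_, high_;                // lower and upper halves of all degrees
};
```

Each payment touches a constant number of ordered-container entries, so the cost per update stays logarithmic in the number of live edges and vertices. Splitting the degrees into two balanced halves means the median is read from the boundary elements without rescanning the graph.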
Machine learning project that classifies a workout movement as correct, or as one of many discrete ways it could deviate, using data from worn sensors (accelerometers, positional). The random forest model is fast and over 99% accurate on both the test and validation sets. Machine learning, random forests, classification, cross-validation. Source.
This report analyzes the effects of Vitamin C on tooth growth in guinea pigs, by dosage and dosing method. Source.
This report explores the Central Limit Theorem (CLT) by repeatedly sampling from a non-normal distribution and analyzing the means of the samples. Source.
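The sampling experiment described above can be sketched in a few lines. This is an illustration only, not the report's code: the exponential distribution, the rate 0.2, the sample size 40, the 1000 repetitions, and the names sample_means and summarize are all assumptions chosen for the example.

```cpp
#include <cmath>
#include <random>
#include <vector>

struct Summary {
    double mean;
    double sd;
};

// Draws `simulations` samples of size `n` from an exponential distribution
// (an assumed stand-in for "a non-normal distribution") and returns each sample's mean.
std::vector<double> sample_means(int simulations, int n, double lambda, unsigned seed) {
    std::mt19937 rng(seed);
    std::exponential_distribution<double> dist(lambda);
    std::vector<double> means;
    means.reserve(simulations);
    for (int s = 0; s < simulations; ++s) {
        double sum = 0.0;
        for (int i = 0; i < n; ++i) sum += dist(rng);
        means.push_back(sum / n);
    }
    return means;
}

// Mean and sample standard deviation (n - 1 denominator) of a set of values.
Summary summarize(const std::vector<double>& xs) {
    double mean = 0.0;
    for (double x : xs) mean += x;
    mean /= xs.size();
    double ss = 0.0;
    for (double x : xs) ss += (x - mean) * (x - mean);
    return {mean, std::sqrt(ss / (xs.size() - 1))};
}
```

With a rate of 0.2 the underlying distribution has mean 5 and standard deviation 5, so the CLT predicts that the means of samples of size 40 cluster around 5 with a spread near 5 / sqrt(40), about 0.79. Calling summarize(sample_means(1000, 40, 0.2, 42)) gives values that can be compared against those two predictions.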
Exploratory analysis of the NOAA Storm Database, to identify the costliest and most harmful extreme weather events of the past 65 years. Source.
An R project that explores Bay Area data in search of where we want to live. Data: crime, home prices, and public transit times. Source.
A proof-of-concept web dashboard for a digital interior lighting control system. Source.
A handful of data visualization demos in D3, from the CSU Long Beach Common Data Set. Source.