BACKGROUND:
- Harvard- and MIT-trained Statistician and Machine Learning (ML) Scientist with 10+ years of experience and particular expertise in Quantitative Finance, Natural Language Processing (NLP), and Causal Inference/Causal AI.
- Currently Principal Applied Data Scientist at The Cambridge Group (TCG), leading end-to-end (inception to production) Deep Learning (DL) and Causal Inference projects for Fortune 100 clients in tech, retail, and e-commerce. Clients include:
  - Levi Strauss & Co
  - Shutterfly LLC
  - Coca-Cola Company
- Teaching at Stanford, including XCS234: Reinforcement Learning, XCS224W: ML over Graphs & Networks, and XCS221: Artificial Intelligence
- Former Chief Data Scientist in the Quantitative Finance and Venture Capital/Private Equity spaces (AIMatters & BuildGroup), leading the design and build of Deep Learning, NLP, and Reinforcement Learning systems for identifying investment opportunities and for an ML-built Exchange-Traded Fund (ETF)
- Former Research Fellow at Harvard University, and former Adjunct Faculty in Statistics at MCPHS University
- Trained in state-of-the-art Causal Inference techniques by pioneering Harvard faculty, with expertise in G-Methods (e.g. G-formula, G-Estimation), Doubly-Robust Estimation, Causal Discovery, Targeted Maximum Likelihood Estimation (TMLE), and Double Machine Learning (DML)
My research interests lie at the intersection of Quantitative Finance, NLP, Experimentation, Causal Inference, and Deep Learning. I was trained in state-of-the-art Causal Inference techniques by pioneering Harvard faculty, and I am a proponent of G-Methods (e.g. G-formula, G-Estimation) and Doubly-Robust Estimation techniques. I have a deep interest in specialized techniques for using non-Donsker-class ML estimators in Causal Inference, including Targeted Maximum Likelihood Estimation (TMLE) and Double Machine Learning (DML). I’m also interested in how these topics connect to more “traditional” areas of Deep Learning, including Natural Language Processing, Reinforcement Learning, and Knowledge Discovery over Probabilistic Graphical Models.
MACHINE LEARNING / DEEP LEARNING EXPERTISE:
- Deep Learning (ConvNet, RNN, LSTM, Transformer, etc)
- “Traditional” Machine Learning (Random Forests, Gradient Boosting, SVMs, Stacked Ensembles, etc)
- Natural Language Processing (NLP)
- Computer Vision
- Reinforcement Learning
- Generative Learning
- Probabilistic Graphical Models
- Graphical/Network Machine Learning
- Recommender Systems
- Interpretable AI
- ML + Causal Inference (TMLE, Double Machine Learning)
STATISTICS EXPERTISE:
- Mathematical Statistics
- Stochastic Processes
- Statistical Learning Theory
- Bayesian Inference (Parametric & Nonparametric)
- Survival Methods
- Advanced Study Designs (Observational, Case-Control, etc)
- Experimentation (A/B testing, Multi-Armed Bandits, Adaptive Trial Design, etc)
- Causal Inference Methods (G-methods, Propensity Score methods, IV estimators, etc)
SOFTWARE/PROGRAMMING STACK:
- Scientific Computing: Python (NumPy, Pandas, Scikit-learn, Matplotlib, etc), R, C++, MATLAB, Stata, SAS
- DL frameworks & Optimizers: PyTorch, TensorFlow, Optuna, Hyperopt
- Distributed Computing: PySpark, MLlib (exposure to native Spark with Scala, and to Hadoop)
- SQL: MySQL, Microsoft SQL Server
- Production: Docker, Flask, Airflow, MLflow, Git, CircleCI
- Cloud: Amazon Web Services (AWS), with exposure to Microsoft Azure and Google Cloud Platform (GCP)