I’ve been involved in various research and open source projects. I’ve made a list of them here.

Big Data / Distributed Systems

Apache Spark

Apache Spark is a next-generation distributed compute engine with a multimodal (data pipeline and SQL) programming model. Spark is particularly well suited to data cleaning, processing, and analytics.

BigPetStore

BigPetStore is an ecosystem of data generators and example applications focused on transaction data for a fictional chain of pet stores. Most of BigPetStore’s projects are hosted in Apache Bigtop, of which I’m a committer.

Bioinformatics

Computational Physics

I spent the first four years of my Ph.D. working on new models and numerical methods for molecular dynamics.