Posts

Testing Feature Significance with the Likelihood Ratio Test

Talk on Productionizing ML Models

Random Forests vs F_{ST} for Insect Population Genetics

Logistic Regression Ensembles vs F_{ST} for Insect Population Genetics

Cramer's V vs F_{ST} for Insect Population Genetics

Classifying Graphs with Shortest Paths

Variable Selection with Logistic Regression Ensembles

Testing CLI Apps with Bats

Running OpenMM in Docker on Debian

Symplectic Integrators Bound Energy Error

Leapfrog is Symplectic for the Harmonic Oscillator

Verifying Global Error of the Leapfrog Integrator

Verifying Local Error of the Leapfrog Integrator

Deriving the Leapfrog Integrator

Harmonic Oscillator

Popularity Drives Ratings in the MovieLens Datasets

Recommendation System Using K-Nearest Neighbors

Recommendation System Using Logistic Regression and the Hashing Trick

Evaluating Feature Hashing on Spam Classification

Evaluating Regularization and Optimization Algorithms for Logistic Regression on Spam Classification

Pitfalls When Working With Imbalanced Data Sets

Optimizing Duplicate Document Detection in Apache Spark

Spark Summit East 2016

Imputing Missing Data and Random Forest Variable Importance Scores

Customer Segmentation Pipeline Prototype

Apache: Big Data Europe 2015

Introducing BigTop Data Generators

Feature Correlation and Feature Importance Bias with Random Forests

Categorical Variable Encoding and Feature Importance Bias with Random Forests

Clatern: An Up-and-Coming ML Library for Clojure

Customer Segmentation with the BigPetStore Multinomial Model

BigPetStore Product Generator

BigPetStore Product Purchasing Multinomial Model

BigPetStore Product Purchasing Markov Model

Gotchas with Scala Mutable Collections and Large Data Sets

Electrostatic Cutoffs in Molecular Dynamics

Zen and the Art of Software Development

The Nosé-Hoover-Langevin Thermostat

The Lennard-Jones Potential Can Cause Simulations to Blow Up

Statically Compiling C/C++ Programs on CentOS 7

Annual Conference of the Great Lakes Section of SIAM

Resolving Atheros Wifi Weirdness on CentOS 7

Provisioning Gluster Servers with Ansible

Configuring a Multi-User Spark Cluster with Mesos and Gluster

Installing OCaml on CentOS

BigTop Bazaar: Simulating a Bazaar

Random Number Generation Horror Stories

Langevin Equation

Open-Source Spark Notebooks

Hello World!