In my work on insect vector population genetics, we often want to identify regions of the chromosomes that are undergoing differentiation. One way to do this is to look for windows that contain more statistically significant SNPs than expected.

To set up the test, we first perform an association test on each individual SNP, using something like the likelihood-ratio test or \(F_{ST}\), to identify SNPs that are strongly correlated with the population structure or phenotype of interest. We then divide the chromosome into non-overlapping windows and count the number of significant SNPs in each window. Lastly, we perform a statistical test on each window, with the null hypothesis that the significant SNPs are uniformly distributed across the windows.
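As a rough illustration of the windowing step, here is a minimal sketch that bins SNPs into non-overlapping windows and counts them. The function name `count_snps_per_window`, the 0-based positions, the window size, and the toy data are all assumptions made for the example; the per-SNP significance flags are taken as given from whatever association test was run beforehand.

```python
import numpy as np

def count_snps_per_window(positions, significant, window_size):
    """Return (total, significant) SNP counts for non-overlapping windows.

    positions   -- 0-based SNP coordinates on one chromosome (hypothetical input)
    significant -- boolean flag per SNP from a prior association test
    window_size -- window length in base pairs
    """
    positions = np.asarray(positions, dtype=np.int64)
    significant = np.asarray(significant, dtype=bool)

    # Integer division maps each SNP to the index of the window containing it.
    window_idx = positions // window_size
    n_windows = int(window_idx.max()) + 1

    # bincount tallies how many SNPs fall in each window, including empty ones.
    total = np.bincount(window_idx, minlength=n_windows)
    sig = np.bincount(window_idx[significant], minlength=n_windows)
    return total, sig

# Toy data: seven SNPs, four of them flagged as significant.
positions = [5, 120, 180, 250, 260, 270, 410]
significant = [False, True, False, True, True, True, False]

total, sig = count_snps_per_window(positions, significant, window_size=100)
print(total.tolist())  # [1, 2, 3, 0, 1]
print(sig.tolist())    # [0, 1, 3, 0, 0]
```

Using fixed-width windows keeps the binning to a single integer division; windows with no SNPs still appear in the output with a count of zero.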

Neafsey et al. performed this analysis using the popular \(\chi^2\) test. I prefer the one-tailed binomial test, however, as it’s more sensitive. Conveniently, the binomial test is available in SciPy.
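To make the per-window test concrete, here is a small sketch using `scipy.stats.binomtest` (available in SciPy 1.7 and later). The parameterization is an assumption for illustration: each significant SNP is treated as a trial that lands in a given window with probability 1 / (number of windows) under the uniform null, and the test is one-tailed because only enrichment is of interest. The function name `window_enrichment_pvalues` and the toy counts are hypothetical.

```python
from scipy.stats import binomtest

def window_enrichment_pvalues(sig_counts):
    """One-tailed binomial p-value for each window.

    sig_counts -- number of significant SNPs observed in each window.
    Assumed null: every significant SNP is equally likely to fall in any window.
    """
    n_total = int(sum(sig_counts))
    if n_total == 0:
        # No significant SNPs anywhere, so no window can be enriched.
        return [1.0] * len(sig_counts)

    p_null = 1.0 / len(sig_counts)
    return [
        # alternative="greater" asks: is this window's count higher than expected?
        binomtest(int(k), n_total, p_null, alternative="greater").pvalue
        for k in sig_counts
    ]

# Toy counts for five windows (illustrative only).
sig_counts = [0, 1, 3, 0, 0]
pvalues = window_enrichment_pvalues(sig_counts)
for window, p in enumerate(pvalues):
    print(window, round(p, 4))
```

Because one test is run per window, the resulting p-values would normally need some multiple-testing correction before calling any window significant.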

My script for performing this analysis is available below: