Selected Publications

(2019). Sequential Dirichlet process mixture of skew t-distributions for model-based clustering of flow cytometry data. Ann. Appl. Stat., 13(1):638-660.

(2019). Gene expression signatures associated with immune and virological responses to therapeutic vaccination with Dendritic Cells in HIV-infected individuals. Front. Immunol., 10:874.

(2018). cytometree: a binary tree algorithm for automatic gating in cytometry analysis. Cytom. A, 93(11):1132-1140.

(2015). Time-Course Gene Set Analysis for Longitudinal Gene Expression Data. PLoS Comput Biol, 11(6):e1004310.

Recent Publications

(2019). Diet‐Related Metabolites Associated with Cognitive Decline Revealed by Untargeted Metabolomics in a Prospective Cohort. Mol. Nutr. Food Res., in press.

(2019). Benefits of dimension reduction in penalized regression methods for high dimensional grouped data: a case study in low sample size. Bioinformatics, btz135, in press.

(2019). Probabilistic record linkage of de-identified research datasets with discrepancies using diagnosis codes. Sci. Data, 6:180298.

(2018). Association between anti-citrullinated fibrinogen antibodies and coronary artery disease in rheumatoid arthritis. Arthritis Care Res, 70(7):1113-1117.


(2018). PheProb: probabilistic phenotyping using diagnosis codes to improve power for genetic association studies. JAMIA, 25(10):1359-1365.


Recent Posts

I recently updated my set-up. Because I use a high-performance cluster from my university (kudos to avakas) to run various simulations and analyses, I have MPI and Rmpi installed on my laptop to test my scripts before submitting them to the big cluster. So I installed openmpi from homebrew very easily:

    brew update
    brew install open-mpi

But then I had extensive trouble installing the Rmpi package…


I just released a new package on CRAN. It’s called NPflow, and it fits Dirichlet process mixtures of multivariate normal, skew-normal or skew t-distributions; you should check it out. I was a little worried because the check from Travis CI was returning a NOTE. And even though NOTEs seem like mild problems, “you should strive to eliminate all NOTEs” before submitting to CRAN! Preparing for an email exchange with a member of the R core team, I wrote the following in the submission comments:


After a bumpy road, along which I kept in mind Jeff Leek’s own worst (recent) experience, we finally got our article on Time-Course Gene Set Analysis for Longitudinal Gene Expression Data published in PLoS Computational Biology, a very nice journal! I am really happy about it, so don’t hesitate to check it out! There is also the TcGSA R package that goes with it.


Useful tips